Python Forum
Unable to download TLS Report attachment - Printable Version


Unable to download TLS Report attachment - blason16 - Feb-21-2024

Hi Team,

I am trying to write a script that downloads the TLS Report (TLS-RPT) messages, extracts the attachments, unzips them, and stores them in a file. I am not a pro, so I have only managed to write a few lines, and the attachments are not being downloaded.
The code is then supposed to search for messages with matching content in the BODY, but those searches throw an exception.
Can someone please help me here?

import imaplib
import email
import json
import os

# IMAP server credentials
IMAP_SERVER = 'imap.xxxx.net'
EMAIL_ADDRESS = '[email protected]'
PASSWORD = 'XXXXXXX'

# Directory to save JSON files
SAVE_DIR = '/tmp/tls'

def connect_to_imap_server():
    # Connect to the IMAP server
    mail = imaplib.IMAP4_SSL(IMAP_SERVER)
    mail.login(EMAIL_ADDRESS, PASSWORD)
    return mail

def download_tls_rpt_reports():
    # Create directory if it doesn't exist
    if not os.path.exists(SAVE_DIR):
        os.makedirs(SAVE_DIR)

    mail = connect_to_imap_server()
    mail.select("inbox")

    # Search for emails containing TLS-RPT reports
    result, data = mail.search(None, 'SUBJECT', 'Report')
    for num in data[0].split():
        result, data = mail.fetch(num, '(RFC822)')
        raw_email = data[0][1]
        msg = email.message_from_bytes(raw_email)

        # Extract JSON attachment (if any)
        for part in msg.walk():
            if part.get_content_type() == "application/json":
                filename = part.get_filename()
                if filename:
                    with open(os.path.join(SAVE_DIR, filename), 'wb') as f:
                        f.write(part.get_payload(decode=True))
                        print(f"Saved TLS-RPT report: {filename}")

    mail.close()
    mail.logout()

if __name__ == "__main__":
    download_tls_rpt_reports()
TIA
Blason R


RE: Unable to download TLS Report attachment - deanhystad - Feb-21-2024

Please post the entire error message and traceback.


RE: Unable to download TLS Report attachment - blason16 - Feb-22-2024

Well, there is no traceback. It just hangs without any output. I even waited for hours, but no files are downloaded to the given directory.

Like this

python3 tlsrpt.py


RE: Unable to download TLS Report attachment - blason16 - Feb-22-2024

Surprisingly, it finished with no errors after about an hour, but still no files were downloaded. Any clue how I can debug this?
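
One way to debug this is to list the content type and filename of every MIME part, because the script only saves parts whose type is exactly application/json. RFC 8460 registers application/tlsrpt+gzip and application/tlsrpt+json for TLS-RPT reports, so a real report may never match that check. The sketch below is only an illustration: the helper name describe_parts and the sample message are assumptions, built locally instead of being fetched over IMAP.

```python
from email.message import EmailMessage


def describe_parts(msg):
    """Return (content_type, filename) for every part of a parsed message."""
    return [(part.get_content_type(), part.get_filename()) for part in msg.walk()]


# Hypothetical stand-in for a fetched report mail, built locally for illustration.
sample = EmailMessage()
sample["Subject"] = "Report Domain: example.com"
sample.set_content("Report attached.")
sample.add_attachment(
    b"placeholder bytes",
    maintype="application",
    subtype="tlsrpt+gzip",
    filename="report.json.gz",
)

for content_type, filename in describe_parts(sample):
    print(content_type, filename)
```

Calling describe_parts(msg) inside the fetch loop of the original script would show which content types the mailbox actually contains.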


RE: Unable to download TLS Report attachment - Pedroski55 - Feb-25-2024

This works for me: it downloads a small JSON file, example3.json, but it should collect any attached file, I think!

Probably better to only get files from people you know, or you might pick up some nasty stuff!

Strangely, when I do this step by step, I have to use:

Quote:typ, messageParts = imapSession.fetch(msgId[0], '(RFC822)') # msgId is a list

But, within myApp() this works fine! Can't figure that out!

Quote:typ, messageParts = imapSession.fetch(msgId, '(RFC822)')
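
A possible explanation for the difference, assuming the step-by-step session assigned the whole result of data[0].split() to msgId: split() returns a list, while the for loop hands over one element at a time. The values below are made up to show the shape of the data that imaplib's search() returns.

```python
# search() returns a list holding one bytes object of space-separated message numbers.
data = [b"1 2 3"]

msg_ids = data[0].split()   # a list of bytes objects
print(msg_ids)

for msg_id in msg_ids:      # inside the loop, each msg_id is a single bytes object
    print(msg_id)
```

So msg_ids[0] is needed outside a loop, while the loop variable can be passed to fetch() directly.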

import email
import imaplib
import os

savepath = '/home/pedro/myPython/email_stuff/'
imap_server = "imap.yourmail.com"
sender_of_interest = '[email protected]'
email_address = '[email protected]'
password = 'topsecret'
label = 'INBOX'

# make a folder to hold the files
# exist_ok=True avoids an error when the folder is already there
os.makedirs(os.path.join(savepath, 'email_attachments'), exist_ok=True)

def myApp():
    try:
        imapSession = imaplib.IMAP4_SSL(imap_server)
        typ, accountDetails = imapSession.login(email_address, password)
        imapSession.select(label)
        typ, data = imapSession.search(None, 'UNSEEN', f'FROM {sender_of_interest}')
        print(typ, data)
        print('Search...')
        for msgId in data[0].split(): # each msgId is one bytes message number; data[0].split() itself is the list
            typ, messageParts = imapSession.fetch(msgId, '(RFC822)')
            emailBody = messageParts[0][1]
            # parse the raw bytes directly; decoding as UTF-8 first can fail on other encodings
            mail = email.message_from_bytes(emailBody)
            print('got the whole email body ...')
            for part in mail.walk():
                if part.get_content_maintype() == 'multipart':
                    print(part.as_string())
                    continue
                if part.get('Content-Disposition') is None:
                    print(f'This part has no Content-Disposition: {part.as_string()}')
                    continue
                fileName = part.get_filename()
                print(f'file name {fileName} being processed ...')
                if bool(fileName):
                    filePath = os.path.join(savepath, 'email_attachments', fileName)
                    # don't overwrite if file exists can change this behaviour if wanted
                    if not os.path.isfile(filePath):
                        print(f'the file is {fileName}')
                        # the with block closes the file even if the write fails
                        with open(filePath, 'wb') as fp:
                            fp.write(part.get_payload(decode=True))
                        print('fp closed ...')
        imapSession.close()
        imapSession.logout()
    except Exception as exc:
        # show the real error instead of hiding it behind a bare except
        print(f'Not able to download all attachments: {exc}')
Hope it works for you!

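This does not explicitly mark the emails as SEEN, although fetching '(RFC822)' normally sets that flag on the server; select the mailbox with readonly=True or fetch '(BODY.PEEK[])' to keep them unread.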

For testing, I kept this mail as unread.
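
A minimal sketch of keeping mails unread while downloading them: under RFC 3501 a server normally sets the \Seen flag when a message body is fetched, unless the BODY.PEEK[] form is used or the mailbox was selected read-only. The function name fetch_unread_safe and the FakeSession class are assumptions for illustration. FakeSession only imitates the return shape of imaplib's fetch() so the call can be shown without a real server.

```python
def fetch_unread_safe(session, msg_id):
    """Fetch a full message with BODY.PEEK[] so the Seen flag is not set."""
    status, parts = session.fetch(msg_id, "(BODY.PEEK[])")
    if status != "OK":
        raise RuntimeError(f"fetch failed for message {msg_id!r}")
    return parts[0][1]  # raw message bytes


class FakeSession:
    """Hypothetical stand-in for imaplib.IMAP4_SSL, used only to show the call."""

    def fetch(self, msg_id, spec):
        self.last_spec = spec
        return "OK", [(b"1 (BODY[] {15}", b"Subject: hi\r\n\r\n"), b")"]


session = FakeSession()
raw = fetch_unread_safe(session, b"1")
print(raw)
```

With a real connection, session would be the object returned by imaplib.IMAP4_SSL(...) after login() and select().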


RE: Unable to download TLS Report attachment - Pedroski55 - Feb-25-2024

I modified your code a little. You seem to have reused variable names, which may have caused confusion.

For testing, I put Report in the Subject field of the email.

This still doesn't explicitly mark mails as SEEN (although fetching '(RFC822)' normally sets that flag on the server), and it overwrites an existing file with the same name without hesitation.

Works for me!

import imaplib
import email
import os
 
# IMAP server credentials
savepath = '/home/pedro/myPython/email_stuff/'
imap_server = "imap.yourmail.com"
sender_of_interest = '[email protected]'
email_address = '[email protected]'
password = 'topsecret'
label = 'INBOX'
subject = 'Report'

# make a folder to hold the files
# exist_ok=True avoids an error when the folder is already there
os.makedirs(os.path.join(savepath, 'email_attachments'), exist_ok=True)
 
def connect_to_imap_server(s, e, p):
    # Connect to the IMAP server
    mail = imaplib.IMAP4_SSL(s)
    mail.login(e, p)
    return mail
 
def download_tls_rpt_reports():
    M = connect_to_imap_server(imap_server, email_address, password )
    M.select(label) 
    # Search for emails containing TLS-RPT reports
    #typ, data = imapSession.search(None, 'UNSEEN', f'FROM {sender_of_interest}')
    result, data = M.search(None, 'UNSEEN', f'SUBJECT {subject}')
    for num in data[0].split():
        status, messageParts = M.fetch(num, '(RFC822)')
        # parse the raw bytes directly; decoding as UTF-8 first can fail on other encodings
        msg = email.message_from_bytes(messageParts[0][1])
        # Extract JSON attachment (if any)
        for part in msg.walk():
            if part.get_content_type() == "application/json":
                filename = part.get_filename()
                if filename:
                    filePath = os.path.join(savepath, 'email_attachments', filename)
                    with open(filePath, 'wb') as f:
                        f.write(part.get_payload(decode=True))
                        print(f"Saved TLS-RPT report: {filePath}") 
    M.close()
    M.logout()
 
if __name__ == "__main__":
    download_tls_rpt_reports()
Output:
download_tls_rpt_reports()
Saved TLS-RPT report: /home/pedro/myPython/email_stuff/email_attachments/example2.json
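
Real TLS-RPT mails may not use application/json at all: RFC 8460 registers application/tlsrpt+gzip and application/tlsrpt+json, and the gzip form has to be decompressed before it can be read as JSON. The following is a sketch under those assumptions, not a drop-in replacement. The names REPORT_TYPES and extract_reports, the sample message, and its JSON content are all made up for illustration.

```python
import gzip
import json
import os
from email.message import EmailMessage

# Media types that may carry a report; the first two are registered by RFC 8460.
REPORT_TYPES = {"application/tlsrpt+gzip", "application/tlsrpt+json", "application/json"}


def extract_reports(msg):
    """Return {safe_filename: parsed_json} for every report-like attachment."""
    reports = {}
    for part in msg.walk():
        ctype = part.get_content_type()
        if ctype not in REPORT_TYPES:
            continue
        payload = part.get_payload(decode=True)
        # Decompress gzip payloads (checked by type and by the gzip magic bytes).
        if ctype.endswith("+gzip") or payload[:2] == b"\x1f\x8b":
            payload = gzip.decompress(payload)
        # basename() drops any directory components a sender put in the name.
        name = os.path.basename(part.get_filename() or "report.json")
        reports[name] = json.loads(payload)
    return reports


# Hypothetical report mail built locally; the JSON body is a placeholder.
sample = EmailMessage()
sample["Subject"] = "Report Domain: example.com"
sample.set_content("Report attached.")
sample.add_attachment(
    gzip.compress(json.dumps({"organization-name": "Example Org"}).encode()),
    maintype="application",
    subtype="tlsrpt+gzip",
    filename="../report.json.gz",
)

print(extract_reports(sample))
```

In the script above, extract_reports(msg) could replace the inner walk() loop, with each returned dictionary written to disk via json.dump().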



RE: Unable to download TLS Report attachment - Pedroski55 - Feb-26-2024

As a bit of clarification, because this stuff puzzled me:

Basically, you want to get email_message from the email module as a message object, for example <email.message.EmailMessage object at 0x7fae30730ca0>

You can do that in one of two ways:

Quote:1. status, messageParts = M.fetch(num, '(RFC822)')
email_message = email.message_from_bytes(messageParts[0][1], policy=default)

gives:

Output:
email_message <email.message.EmailMessage object at 0x7fae30730ca0>

Quote:2. status, messageParts = M.fetch(num, '(RFC822)')
emailBody = messageParts[0][1]
raw_email_string = emailBody.decode('utf-8')
# without a policy argument, this below makes email_message = <email.message.Message object at 0x7fae31758af0>
# you can get an EmailMessage directly from email_message = email.message_from_bytes(messageParts[0][1], policy=default)
email_message = email.message_from_string(raw_email_string)

gives:

Output:
email_message <email.message.Message object at 0x7fae31758af0>

Either way, you get a message object that is walkable. The email.message.EmailMessage object, which you get with policy=default, is the newer API.
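
The difference between the two ways can be checked without a mail server. This small sketch parses the same made-up raw message twice and prints the resulting class names. The raw bytes and the variable names legacy and modern are assumptions for illustration.

```python
import email
from email.policy import default

raw = b"Subject: Report\r\nFrom: sender@example.com\r\n\r\nHere is the report.\r\n"

legacy = email.message_from_bytes(raw)                  # no policy given: compat32
modern = email.message_from_bytes(raw, policy=default)  # modern API

print(type(legacy).__name__)
print(type(modern).__name__)

# Both objects can be walked part by part.
print([p.get_content_type() for p in legacy.walk()])
print([p.get_content_type() for p in modern.walk()])
```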

Now you can get all the info you want!

Quote:email_message['To']
'[email protected]'
email_message['From']
'Pedro Rodriguez <[email protected]>'
email_message['Date']
'Sun, 25 Feb 2024 16:55:55 +0100'

To get the plain body text:

Quote:for part in email_message.walk():
    if part.get_content_type() == "text/plain":
        print(part.get_payload())

Output:
Here is the report you wanted.

To get the HTML text:

Quote:for part in email_message.walk():
    if part.get_content_type() == "text/html":
        print(part.get_payload())


<div dir="ltr">Here is the report you wanted.</div>
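
When the message was parsed with policy=default, the body parts can also be picked with get_body() and decoded with get_content(), which returns text with the transfer encoding already undone. This is a sketch on a locally built message. The message itself is an assumption that mirrors the example text used in this post.

```python
from email.message import EmailMessage

# Hypothetical message with a plain-text body and an HTML alternative.
msg = EmailMessage()
msg["Subject"] = "Report"
msg.set_content("Here is the report you wanted.")
msg.add_alternative('<div dir="ltr">Here is the report you wanted.</div>', subtype="html")

plain = msg.get_body(preferencelist=("plain",))
html = msg.get_body(preferencelist=("html",))

print(plain.get_content())
print(html.get_content())
```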

I hope that helps!

PS:

policy comes from:

from email.policy import default