Python Forum

Parsing an MBOX file
I have a client who wants to parse an mbox (email) file and extract the message portions from it. The example mbox I have contains huge sections of what appears to be encrypted text. The code below seems to extract the text portions correctly, but I'm not sure whether the code is supposed to handle the seemingly encrypted text, or whether the mbox simply contains encrypted portions that can't be read.

Does the code below look correct for reading and extracting the mbox data: to, from, subject, and body?

Thanks very much in advance,

-O

---


import email.errors
import mailbox

print("Reading emails:")

mbox_file = "/Users/oliver/Desktop/mbox"

print("Processing " + mbox_file)
mbox = mailbox.mbox(mbox_file)

for key in mbox.iterkeys():

    try:
        message = mbox[key]
    except email.errors.MessageParseError:
        continue  # The message is malformed. Just leave it.

    # A missing header comes back as None, so convert with str() before concatenating.
    print("From: " + str(message['from']))
    print("To: " + str(message['to']))
    print("Subject: " + str(message['Subject']))
    print("-----------------------------")
    print("Body\n")
    # Note: this prints the whole message (headers plus every part in its
    # raw, possibly base64-encoded form), not only the decoded body text.
    print(message)

    print("********************************************")
Looks good.
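
One thing worth checking: the "encrypted" looking sections may just be base64-encoded parts (attachments, or text sent with a base64 transfer encoding) rather than real encryption, although I can't be sure without seeing the file. Printing the message object dumps those parts in their raw form. Here is a minimal sketch of pulling out only the decoded plain-text body. The helper name extract_text_body and the sample message are made up for illustration, and it assumes you only want the text/plain parts and want to skip attachments.

```python
import email
from email.message import Message


def extract_text_body(message: Message) -> str:
    """Return the decoded text/plain parts of a message, joined by newlines."""
    chunks = []
    for part in message.walk():
        # Multipart containers only hold other parts; they have no text of their own.
        if part.is_multipart():
            continue
        # Skip attachments and anything that is not plain text.
        if part.get_content_disposition() == "attachment":
            continue
        if part.get_content_type() != "text/plain":
            continue
        # decode=True undoes base64 / quoted-printable transfer encoding.
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            chunks.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset label in the message: fall back to UTF-8.
            chunks.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(chunks)


# Hypothetical sample message whose body is base64-encoded text.
raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: demo\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "SGVsbG8gZnJvbSB0aGUgYm9keQ==\n"
)
sample = email.message_from_string(raw)
print(extract_text_body(sample))
```

In the loop from the original post, print(extract_text_body(message)) could stand in for print(message). If the text still looks scrambled after decoding, the messages may really be encrypted (PGP or S/MIME, for example), and this approach won't help with that.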