
Quick python code to parse mbox files, specifically those used by GMail. Extract...

 1 year ago
source link: https://gist.github.com/benwattsjones/060ad83efd2b3afc8b229d41f9b246c4

Quick python code to parse mbox files, specifically those used by GMail. Extracts sender, date, plain text contents etc., ignores base64 attachments.

Suggestion for line 68: print('Parsing email {0} of {1}'.format(idx + 1, num_entries)), so that the displayed count starts at 1 rather than 0 (idx fix).
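
A minimal sketch of the idx fix suggested above, assuming the gist's loop walks the messages with a zero-based idx. The function name progress_lines and the sample list are hypothetical and not taken from the gist.

```python
def progress_lines(entries):
    """Return one progress message per entry, numbered from 1."""
    num_entries = len(entries)
    lines = []
    # enumerate() counts from 0 by default, so add 1 for a human-readable count.
    for idx, _entry in enumerate(entries):
        lines.append('Parsing email {0} of {1}'.format(idx + 1, num_entries))
    return lines


for line in progress_lines(['first', 'second', 'third']):
    print(line)
```

Passing start=1 to enumerate() would be an equivalent way to avoid the idx + 1 arithmetic.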

The comment # ~*~ utf-8 ~*~ is useless: the default source code encoding for Python 3 is UTF-8 anyway, and if you want to communicate the encoding to Emacs and similar editors, the proper format uses dashes, not tildes, with the token coding: before the encoding name, i.e. # -*- coding: utf-8 -*-. See PEP 263.
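
One way to check the point above is to ask Python's own tokenizer which encoding it detects for a source file. This is only a sketch: declared_encoding is a hypothetical helper, and latin-1 is used in place of utf-8 purely so that a recognized declaration is distinguishable from the default.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding Python's tokenizer detects for the given source bytes."""
    readline = io.BytesIO(source_bytes).readline
    encoding, _first_lines = tokenize.detect_encoding(readline)
    return encoding


# PEP 263 style declaration: dashes plus the "coding:" token.
print(declared_encoding(b"# -*- coding: latin-1 -*-\nx = 1\n"))

# Tilde style with no "coding:" token: not a declaration, so the default applies.
print(declared_encoding(b"# ~*~ latin-1 ~*~\nx = 1\n"))
```

The first call should report the declared Latin-1 encoding (under its normalized name), while the second falls back to utf-8 because the tilde comment is treated as an ordinary comment.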

Thanks! This helped me out. In parse_email(), would it make more sense to assign the email parts to instance variables? E.g.
self.email_from = self.email_data['From']
instead of
email_from = self.email_data['From']
Otherwise, how is a user of this class meant to access these?
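
A rough sketch of what this suggestion could look like, assuming self.email_data is an email.message.Message as produced by the standard library. The class name ParsedEmail, its constructor, and the sample message are hypothetical stand-ins, not the gist's actual code.

```python
import email


class ParsedEmail:
    """Hypothetical stand-in for the gist's parser class."""

    def __init__(self, raw_text):
        # Parse the raw RFC 822 text into a Message object.
        self.email_data = email.message_from_string(raw_text)
        self.parse_email()

    def parse_email(self):
        # Assign to self so the values survive after the method returns.
        self.email_from = self.email_data['From']
        self.email_date = self.email_data['Date']
        self.email_subject = self.email_data['Subject']


raw = (
    "From: alice@example.com\n"
    "Date: Mon, 1 Jan 2024 10:00:00 +0000\n"
    "Subject: Hello\n"
    "\n"
    "Body text\n"
)
msg = ParsedEmail(raw)
print(msg.email_from, msg.email_subject)
```

With plain local variables the values would be discarded when parse_email() returns; storing them on self lets a caller read msg.email_from directly.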

@benwattsjones Brilliant, thanks for posting this!

@redcay yes, that or something like it is needed.
