Python Forum
Extract PDF Attachment from Gmail
#1
I would like a Python script to open my Gmail inbox, read each unread email, and save any PDF attachments it finds. The code below was published on a couple of forums and websites. It partly works: it logs in to my account, processes only unread emails, and then marks them as read. However, it never finds the attachment. get_content_type() returns “text/plain” instead of “application/pdf”. If I enable the commented-out “part.get_content_type() == ‘text/plain’” check, the code enters that “if” block as expected, but the “application/pdf” check never matches.

The test PDF is a simple one-line file created with the export feature in Google Drive. I get the same result with PDF invoices sent out by a company. I have not installed PyPDF2 because I’m told the standard Python libraries should be enough. I suspect the problem is related to encryption. Any help is much appreciated. Thanks in advance!

import imaplib
import email
import regex
import re

user = 'user'
password = 'kfkgxpzir'

server = imaplib.IMAP4_SSL('imap.gmail.com')
server.login(user, password)
server.select('inbox')

msg_ids = []
resp, messages = server.search(None, 'UNSEEN')
for message in messages[0].split():
    typ, data = server.fetch(message, '(RFC822)')
    msg = email.message_from_string(str(data[0][1]))
    # looking for 'Content-Type: application/pdf'
    for part in msg.walk():
        # if part.get_content_type() == 'text/plain':
        if part.get_content_type() == 'application/pdf':
            print("Found pdf")
            payload = part.get_payload(decode=True)

            filename = part.get_filename()
            print(filename)

            # Save the file.
            if payload and filename:
                with open(filename, 'wb') as f:
                    f.write(payload)
Larz60+ wrote Sep-10-2023, 11:22 PM:
Please post all code, output and errors (in their entirety) between their respective tags. Refer to the BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.
I added the tags for you this time. Please use BBCode tags on future posts.
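One likely cause, offered as a guess rather than a confirmed diagnosis: on Python 3, server.fetch() returns the message as bytes, and str() on a bytes object produces its "b'...'" repr rather than the decoded text. email.message_from_string() then sees one long line with no MIME structure, so every message looks like a single text/plain part. Encryption would not be involved in that case. The sketch below parses the bytes with email.message_from_bytes() instead. The helper names extract_pdf_attachments and build_sample_message are made up for illustration, and the sample message stands in for data[0][1] so the example runs without a Gmail login.

```python
import email
import os
from email.message import EmailMessage


def extract_pdf_attachments(raw_bytes):
    """Return (filename, payload) pairs for the PDF parts of a raw message."""
    # Parse the bytes directly; do not wrap them in str() first.
    msg = email.message_from_bytes(raw_bytes)
    found = []
    for part in msg.walk():
        if part.get_content_type() != 'application/pdf':
            continue
        payload = part.get_payload(decode=True)
        # basename() drops any directory components from a sender-supplied name.
        filename = os.path.basename(part.get_filename() or 'attachment.pdf')
        if payload:
            found.append((filename, payload))
    return found


def build_sample_message():
    """Hypothetical stand-in for data[0][1] as returned by server.fetch()."""
    msg = EmailMessage()
    msg['Subject'] = 'Invoice'
    msg['From'] = 'sender@example.com'
    msg['To'] = 'user@example.com'
    msg.set_content('See attached.')
    msg.add_attachment(b'%PDF-1.4 sample', maintype='application',
                       subtype='pdf', filename='invoice.pdf')
    return msg.as_bytes()


raw = build_sample_message()

# Parsing the bytes finds the PDF part.
attachments = extract_pdf_attachments(raw)

# Parsing str(raw), as the posted code does, loses the MIME structure.
broken = email.message_from_string(str(raw))
broken_types = [p.get_content_type() for p in broken.walk()]

print(attachments)
print(broken_types)
```

In the posted loop, the equivalent change would be to call email.message_from_bytes(data[0][1]) and leave the rest as it is. Saving the file is then the same open(filename, 'wb') step as before.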