Jan-27-2019, 12:41 PM
(Jan-27-2019, 10:21 AM)negru555 Wrote: How can I get just the content in <div dir="auto"></div>? Basically the message only. Also the subject if possible, but that's not necessary.
Write your own parser, for example BeautifulSoup for the HTML and regex for the rest.
from bs4 import BeautifulSoup
import re

email = '''\
MIME-Version: 1.0
Date: Sun, 27 Jan 2019 12:18:51 +0200
Message-ID: <CAK_qYVGw+UC7Ar9FhU+YRKQC8hXqs=tJzXW6_CvXsaJjzOJwSA@mail.gmail.com>
Subject: Muie
From: Wind Bullet <[email protected]>
To: [email protected]
Content-Type: multipart/alternative; boundary="0000000000000765de05806de3c2"

--0000000000000765de05806de3c2
Content-Type: text/plain; charset="UTF-8"

Muiescp

--0000000000000765de05806de3c2
Content-Type: text/html; charset="UTF-8"

<div dir="auto">Muiescp</div>

--0000000000000765de05806de3c2--
Muiescp'''

soup = BeautifulSoup(email, 'lxml')
subject = re.search(r'Subject: (.*)', email).group(1)
email_from = re.search(r'From: (.*)', email).group(1)
message = soup.find('div')
print(subject)
print(email_from)
print(message.text)
Output:
Muie
Wind Bullet <[email protected]>
Muiescp
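If you would rather not hand-roll the header regexes, the standard library's email package can probably do the same job. Below is a minimal sketch of that alternative, not the approach used above. The address sender@example.com and the short boundary XYZ are placeholders made up for the example, and it reads the text/plain part rather than parsing the <div>.

```python
from email import message_from_string
from email.policy import default

# Shortened sample message; the address and the boundary "XYZ" are
# placeholders invented for this sketch, not values from the real mail.
raw_email = (
    'MIME-Version: 1.0\n'
    'Subject: Muie\n'
    'From: Wind Bullet <sender@example.com>\n'
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    '\n'
    '--XYZ\n'
    'Content-Type: text/plain; charset="UTF-8"\n'
    '\n'
    'Muiescp\n'
    '--XYZ\n'
    'Content-Type: text/html; charset="UTF-8"\n'
    '\n'
    '<div dir="auto">Muiescp</div>\n'
    '--XYZ--\n'
)

# policy=default makes the parser return an EmailMessage,
# which provides get_body() and get_content().
msg = message_from_string(raw_email, policy=default)

subject = msg['Subject']
sender = msg['From']

# Prefer the plain-text alternative; fall back to HTML if there is none.
body_part = msg.get_body(preferencelist=('plain', 'html'))
message = body_part.get_content().strip()

print(subject)
print(sender)
print(message)
```

If only the HTML part is available, get_body() returns that part instead, and you can still pass its content to BeautifulSoup and call find('div') as in the code above.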