Python Forum
Challenge with my string
#1
I am working on a script that checks a mailbox for an emailed attachment and opens it. So far I can read the attachment out of the email as a text string, and it looks like this:

Content-Type: application/octet-stream;
	name="stockreport.txt"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename="stockreport.txt"
=EF=BB=BF73558
Lufthansa Technik
LTCS - H29&H30 - OSP Fiber
Location	Part#	Description	Curr Qty=09
M3B6	RIC-F-SA12-01	Fiber Bulkhead-SM/MM-6 Duplex-12 Adapters-ST-Black	25
M3B8	RIC-F-LCU24-01C	Fiber Bulkhead-SM-12 Duplex-24 =
Adapters-LC-Black/Blue	1
It can contain any number of items, each following the same pattern: Location, Part#, Description, and Curr Qty. My goal for the example above would be to end up with a list like this:

['73558', 'Lufthansa Technik', 'LTCS - H29&H30 - OSP Fiber', 'Location', 'Part#','Description', 'Curr Qty', 'M3B6', 'RIC-F-SA12-01', 'Fiber Bulkhead-SM/MM-6 Duplex-12 Adapters-ST-Black', '25', 'M3B8', 'RIC-F-LCU24-01C', 'Fiber Bulkhead-SM-12 Duplex-24 = Adapters-LC-Black/Blue', '1']
I'm thinking I can do my next part of processing easily once I have the list like this. The plan being to feed the data into a template for printing.

I have tried all sorts of splits, but none of them is quite right and I feel stuck.
Reply
#2
M3B6    RIC-F-SA12-01   Fiber Bulkhead-SM/MM-6 Duplex-12 Adapters-ST-Black  25
M3B8    RIC-F-LCU24-01C Fiber Bulkhead-SM-12 Duplex-24 =
Adapters-LC-Black/Blue  1
Is there an error in the second line? Should the = be followed by Adapters-LC-Black/Blue 1 on the same line?
The columns seem to be aligned. If that is true for every file you will parse, and the number of character "slots" allocated to each column is constant, you could get the data you want by slicing and stripping the strings.
Edit:
Ah, that = (the "potential error") might be there because the line would otherwise be too long, in which case the columns aren't aligned anymore.
Reply
#3
OK, I had to do some digging, and a few things get added when I "grab" the text from the email.

I assume this is a line-length limit.

On line 9, the "=09" at the end of Qty is extra.

On line 11, the "= " you pointed out is extra.

I'm going to try and pull some other emails to see what kind of other odd things happen.
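
The extra characters described above match what the Content-Transfer-Encoding: quoted-printable header in the attachment implies: in that encoding "=09" is an escaped tab, a "=" at the very end of a line is a soft line break that joins the line with the next one, and "=EF=BB=BF" is the UTF-8 byte order mark. A minimal sketch of undoing this with the standard library `quopri` module follows. It assumes the attachment body is available as bytes; the `raw` sample is a shortened copy of the report typed in by hand, not real mailbox output.

```python
import quopri

# Shortened, hand-typed copy of the quoted-printable body (an assumption
# about what the raw bytes look like before any decoding).
raw = (
    b"=EF=BB=BF73558\n"
    b"Location\tPart#\tDescription\tCurr Qty=09\n"
    b"M3B8\tRIC-F-LCU24-01C\tFiber Bulkhead-SM-12 Duplex-24 =\n"
    b"Adapters-LC-Black/Blue\t1\n"
)

# decodestring turns =XX escapes back into bytes and removes soft line
# breaks (a trailing "=" followed by a newline).
decoded = quopri.decodestring(raw)

print(decoded)
```

After decoding, the wrapped description is a single line again and the "=09" is an ordinary trailing tab. The result is still bytes, so the byte order mark at the start is still present at this stage.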
Reply
#4
It took me a while to figure out, but I had to call .decode() to get "clean" output.

I still have one challenge: the very first item carries the encoding marker from the document, so it is always prefixed by '\ufeff'.
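
One possible way to deal with the '\ufeff' prefix is to decode with the "utf-8-sig" codec, which drops a leading byte order mark if one is present. The sketch below assumes the transfer encoding has already been undone and the payload is UTF-8 bytes with tab-separated columns; `report_to_list` is a hypothetical helper name, and the sample payload is typed in from the report in the first post.

```python
def report_to_list(payload: bytes) -> list:
    """Flatten a tab-separated report into one list of non-empty fields."""
    # "utf-8-sig" behaves like "utf-8" but strips a leading BOM ('\ufeff').
    text = payload.decode("utf-8-sig")
    items = []
    for line in text.splitlines():
        for field in line.split("\t"):
            field = field.strip()
            if field:  # skip the empty field left by a trailing tab
                items.append(field)
    return items


# Hand-typed sample (an assumption about the decoded bytes).
sample = (
    b"\xef\xbb\xbf73558\n"
    b"Lufthansa Technik\n"
    b"Location\tPart#\tDescription\tCurr Qty\t\n"
    b"M3B6\tRIC-F-SA12-01\tFiber Bulkhead-SM/MM-6 Duplex-12 Adapters-ST-Black\t25\n"
)

result = report_to_list(sample)
print(result)
```

If the text has already been decoded to a `str`, `text.lstrip('\ufeff')` would remove the marker as well. When the message is parsed with the standard `email` package, `get_payload(decode=True)` on the attachment part returns the bytes with the transfer encoding already removed, which could serve as the input here.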
Reply

