Challenge with my string

SpencerH · Oct-06-2018, 11:11 PM

I am working on a script that checks a mailbox for an emailed attachment and opens it. So far I can pull the attachment out of the email as a text string, and it looks like this:

Content-Type: application/octet-stream;
	name="stockreport.txt"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename="stockreport.txt"
=EF=BB=BF73558
Lufthansa Technik
LTCS - H29&H30 - OSP Fiber
Location	Part#	Description	Curr Qty=09
M3B6	RIC-F-SA12-01	Fiber Bulkhead-SM/MM-6 Duplex-12 Adapters-ST-Black	25
M3B8	RIC-F-LCU24-01C	Fiber Bulkhead-SM-12 Duplex-24 =
Adapters-LC-Black/Blue	1

It can have any arbitrary number of items that will follow the same pattern: Location, Part#, Description, and Curr Qty. My goal from the example would be to end up with a list like this:

['73558', 'Lufthansa Technik', 'LTCS - H29&H30 - OSP Fiber', 'Location', 'Part#','Description', 'Curr Qty', 'M3B6', 'RIC-F-SA12-01', 'Fiber Bulkhead-SM/MM-6 Duplex-12 Adapters-ST-Black', '25', 'M3B8', 'RIC-F-LCU24-01C', 'Fiber Bulkhead-SM-12 Duplex-24 = Adapters-LC-Black/Blue', '1']

I'm thinking I can do my next part of processing easily once I have the list like this. The plan being to feed the data into a template for printing.

I have tried all sorts of splits but I feel stuck because none of them are quite right.

j.crater · (last modified Oct-06-2018, 11:23 PM)

M3B6    RIC-F-SA12-01   Fiber Bulkhead-SM/MM-6 Duplex-12 Adapters-ST-Black  25
M3B8    RIC-F-LCU24-01C Fiber Bulkhead-SM-12 Duplex-24 =
Adapters-LC-Black/Blue  1

Is there an error in the second line? Should the = be followed by Adapters-LC-Black/Blue 1?
The columns seem to be aligned. If this is true for every file you will parse, and the number of character "slots" allocated for a single column is constant, you could get the data you want by slicing and stripping the strings.
Edit:
Ah, that = (the "potential error") might be there because otherwise the line would be too long, and the columns aren't aligned anymore in that case.
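For reference, the slicing-and-stripping idea could look roughly like the sketch below. The column boundaries in COLUMNS and the helper name slice_fixed_width are made up for illustration; real offsets would have to be measured from the actual files, and the approach only works if the widths really are constant.

```python
# Hypothetical (start, end) character offsets for each column.
# These numbers are assumptions, not taken from the real report.
COLUMNS = [(0, 8), (8, 24), (24, 76), (76, None)]


def slice_fixed_width(line, columns=COLUMNS):
    """Cut one fixed-width line into fields and strip the padding."""
    return [line[start:end].strip() for start, end in columns]


# Build a sample line padded to the assumed widths.
line = (
    "M3B6".ljust(8)
    + "RIC-F-SA12-01".ljust(16)
    + "Fiber Bulkhead-SM/MM-6 Duplex-12 Adapters-ST-Black".ljust(52)
    + "25"
)

fields = slice_fixed_width(line)
print(fields)
```

If the columns turn out to be separated by tabs rather than padded with spaces, splitting on the tab character is simpler and does not depend on any widths.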

SpencerH · Oct-06-2018, 11:42 PM

Ok I had to do some digging and there are a few things that get added in when I "grab" things from the email.

I assume this is a line length thing.

On line 9 the "=09" at the end of Qty is extra.

On line 11 the "= " as you pointed out is extra.

I'm going to try and pull some other emails to see what kind of other odd things happen.
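Both extras match the quoted-printable encoding named in the Content-Transfer-Encoding header: =09 is an encoded tab character, and a = at the very end of a line is a "soft" line break that the encoder adds to keep lines short. A minimal sketch of undoing that with the standard library's quopri module, assuming the attachment body has already been separated from its headers (raw_body is a shortened, hypothetical sample):

```python
import quopri

# Hypothetical sample: the attachment body as raw bytes, headers removed.
raw_body = (
    b"=EF=BB=BF73558\n"
    b"Location\tPart#\tDescription\tCurr Qty=09\n"
    b"M3B8\tRIC-F-LCU24-01C\tFiber Bulkhead-SM-12 Duplex-24 =\n"
    b"Adapters-LC-Black/Blue\t1\n"
)

# Undo the quoted-printable escapes, then turn the bytes into text.
decoded_bytes = quopri.decodestring(raw_body)
text = decoded_bytes.decode("utf-8")

print(repr(text))
```

After decoding, the =09 becomes a real tab and the line that was split at the trailing = is joined back into one line. The =EF=BB=BF at the start turns into a single '\ufeff' character.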

SpencerH · Oct-12-2018, 11:58 AM

It took me a while to figure out but I had to do a .decode() to get "clean" output.
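The post does not say which .decode() call this was. One common way to get the same result, shown here only as a sketch, is to let the standard email package undo the transfer encoding with get_payload(decode=True) and then decode the resulting bytes as UTF-8. The message below is a shortened, hypothetical stand-in for the real email, and the multipart wrapper and boundary are assumptions.

```python
import email

# Hypothetical, shortened message; the real one would come from the mailbox.
raw_message = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="sep"\n'
    "\n"
    "--sep\n"
    "Content-Type: application/octet-stream;\n"
    '\tname="stockreport.txt"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "Content-Disposition: attachment;\n"
    '\tfilename="stockreport.txt"\n'
    "\n"
    "=EF=BB=BF73558\n"
    "Lufthansa Technik\n"
    "Location\tPart#\tDescription\tCurr Qty=09\n"
    "--sep--\n"
)

msg = email.message_from_string(raw_message)

attachments = {}
for part in msg.walk():
    filename = part.get_filename()
    if filename:
        # decode=True reverses the Content-Transfer-Encoding and returns bytes.
        payload = part.get_payload(decode=True)
        attachments[filename] = payload.decode("utf-8")

print(repr(attachments["stockreport.txt"]))
```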

I still have one problem: the very first item carries the encoding marker from the document, so it is always prefixed by '\ufeff'.
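That '\ufeff' is the UTF-8 byte order mark (the =EF=BB=BF at the start of the attachment). A sketch of one way to drop it, assuming the data is still bytes at that point: decode with the "utf-8-sig" codec instead of "utf-8". The sample bytes and the helper name to_flat_list are made up for illustration; the helper then builds the flat list by splitting on lines and tabs.

```python
# Hypothetical sample: attachment bytes after the quoted-printable step.
raw = (
    b"\xef\xbb\xbf73558\n"
    b"Lufthansa Technik\n"
    b"LTCS - H29&H30 - OSP Fiber\n"
    b"Location\tPart#\tDescription\tCurr Qty\t\n"
    b"M3B6\tRIC-F-SA12-01\tFiber Bulkhead-SM/MM-6 Duplex-12 Adapters-ST-Black\t25\n"
    b"M3B8\tRIC-F-LCU24-01C\tFiber Bulkhead-SM-12 Duplex-24 Adapters-LC-Black/Blue\t1\n"
)

# "utf-8-sig" decodes UTF-8 and silently removes a leading byte order mark.
text = raw.decode("utf-8-sig")


def to_flat_list(text):
    """Split text into lines, then tab-separated fields, skipping empties."""
    items = []
    for line in text.splitlines():
        for field in line.split("\t"):
            field = field.strip()
            if field:
                items.append(field)
    return items


items = to_flat_list(text)
print(items)
```

If the text has already been decoded to a str, text.lstrip("\ufeff") removes the marker as well. Note that skipping empty fields also drops any cell that is genuinely blank, so a report with missing values would need the rows kept separate instead of flattened.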
