Python Forum

Full Version: Detect end of line in text file including line breaks
Hi folks,

I'm new to Python and am trying several small projects to get into the language.
I have a text file that was exported from a WhatsApp group chat, and I'm trying to convert it line by line to CSV format:

23.11.19, 21:35 - Person A: ndsnldkl odj aso saod saodjd ad?
23.11.19, 21:43 - Person B: nsidd dsidaojd eduq dsajojdipajd adapsd??
23.11.19, 21:44 - Person C: ahush asaosi0sj a0s9uaS SJs !!

The desired output is in CSV format:
23.11.19;21:35;Person A;ndsnldkl odj aso saod saodjd ad?

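That's my (newbie) code:
liste = []
for x in f:                      # f is the opened export file; x is one physical line
    date = x[:8]                 # "dd.mm.yy"
    time = x[10:15]              # "hh:mm"
    pos1 = x.find(":",18,-1)     # colon that ends the sender name
    name = x[18:pos1]
    message = x[pos1+2:-1]       # text after ": ", without the trailing newline
    liste.append(date)
    liste.append(time)
    liste.append(name)
    liste.append(message)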
It works great until I get a message like this one, which contains a line break:
29.11.19, 15:54 - Person D: ndadad saojd sapods dsap.
ksjd sad aslajd a


The line "message = x[pos1+2:-1]" does not pick up the text after the line break.
I wrote nearly the same code in PHP in the past, and there the full message, including the line breaks, was recognized.

Any idea how I can fix that?

Greetings,
Daniel
The problem is the "for x in f" loop, which reads the file line by line, not message by message.
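A minimal sketch of that behaviour, using io.StringIO as a stand-in for the opened export file (the variable names here are illustrative, not from the original post). Iterating over a file object yields one physical line per step, so the second half of the multi-line message arrives as a separate item:

```python
import io

# Stand-in for the exported chat file, using the sample text from the question.
sample = (
    "29.11.19, 15:54 - Person D: ndadad saojd sapods dsap.\n"
    "ksjd sad aslajd a\n"
)

# Iterating a file-like object splits on line breaks, not on messages.
lines = [x for x in io.StringIO(sample)]

print(len(lines))      # 2
print(repr(lines[1]))  # the continuation line, delivered on its own
```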
(Dec-18-2019, 09:27 AM) Gribouillis wrote: The problem is the "for x in f" loop, which reads the file line by line, not message by message.

Hi, thanks. Is there a better way to read the file?
You need to check whether the line starts with the pattern "dd.mm.yy, hh:mm - ".
If it does, it's a new message. If not, it's just a new line in the previous message (i.e. append it to the previous one).
I don't know what the exported file looks like when a message is quoted, or what happens if the text after a line break itself starts with the above pattern.
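The check described above could be sketched roughly like this. It assumes every message header has exactly the shape "dd.mm.yy, hh:mm - Name: text" shown in the question; the names HEADER and parse_chat are made up for this example. Any line that does not match the header pattern is treated as a continuation of the previous message:

```python
import csv
import io
import re

# Assumed header shape: "dd.mm.yy, hh:mm - Name: text"
HEADER = re.compile(r"^(\d{2}\.\d{2}\.\d{2}), (\d{2}:\d{2}) - ([^:]+): (.*)$")

def parse_chat(lines):
    """Group physical lines into [date, time, name, message] records."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        match = HEADER.match(line)
        if match:
            # A new message starts here.
            records.append(list(match.groups()))
        elif records:
            # No header: append the line to the previous message.
            records[-1][3] += "\n" + line
    return records

# io.StringIO stands in for the opened export file.
chat_file = io.StringIO(
    "23.11.19, 21:35 - Person A: ndsnldkl odj aso saod saodjd ad?\n"
    "29.11.19, 15:54 - Person D: ndadad saojd sapods dsap.\n"
    "ksjd sad aslajd a\n"
)
rows = parse_chat(chat_file)

# The csv module quotes fields that contain line breaks.
out = io.StringIO()
csv.writer(out, delimiter=";").writerows(rows)
print(out.getvalue())
```

Writing through csv.writer rather than joining with ";" by hand means a message that itself contains a semicolon or a line break is quoted instead of breaking the row.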
A regex is a better fit here, since the pattern for the message text can include the line breaks.
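One way the regex idea might look, as a sketch only: read the whole file into a single string and let the message group run until the next header or the end of the text. The header shape is again assumed from the samples in the question, and MESSAGE and parse_text are hypothetical names:

```python
import re

# Header as in the question's samples, then a lazy message group that stops
# right before the next header line or at the end of the text.
MESSAGE = re.compile(
    r"^(\d{2}\.\d{2}\.\d{2}), (\d{2}:\d{2}) - ([^:\n]+): "
    r"(.*?)(?=\n\d{2}\.\d{2}\.\d{2}, \d{2}:\d{2} - |\Z)",
    re.MULTILINE | re.DOTALL,  # ^ matches at line starts, . matches newlines
)

def parse_text(text):
    """Return [date, time, name, message] for every message in text."""
    return [
        [date, time, name, message.rstrip("\n")]
        for date, time, name, message in MESSAGE.findall(text)
    ]

text = (
    "23.11.19, 21:35 - Person A: ndsnldkl odj aso saod saodjd ad?\n"
    "29.11.19, 15:54 - Person D: ndadad saojd sapods dsap.\n"
    "ksjd sad aslajd a\n"
)
messages = parse_text(text)
print(messages[1][3])
```

In practice text would come from f.read() on the export file. The caveat from the earlier reply still applies: a continuation line that happens to start with the header pattern would be taken for a new message.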