Python Forum

Collect lines in a file
Hi all,
If a line doesn't start with a timestamp, I need to join it to the previous line, using a list.
This is the content of my file:
9/16/16, 01:38 - User1: Hi , How you doing
9/16/16, 01:39 - User2: hi,
I'm good
How about you ?
Thanks for asking man
<media>
9/16/16, 02:02 - User3: Howdy folks!
9/16/16, 02:29 - User2: Awesome
9/16/16, 02:29 - User2: we are all good,
Thanks for asking
awesome
7/11/20, 13:00 - Me: <Video> watch this


Here is the code that I wrote:

intermediate = []
finalData = []
file = open('sample.txt', 'r', encoding="utf8")
for lines in file:
    if startsWithDate(lines):
        intermediate.clear()
        intermediate.append(lines)
    else:
        intermediate.append(lines)
    intr = ' '.join(intermediate)
    finalData.append(intr)
print(finalData)
The output obtained is:
Output:
['9/16/16, 01:38 - User1: Hi , How you doing\n', '9/16/16, 01:39 - User2: hi,\n', "9/16/16, 01:39 - User2: hi,\n I'm good\n", "9/16/16, 01:39 - User2: hi,\n I'm good\n How about you ?\n", "9/16/16, 01:39 - User2: hi,\n I'm good\n How about you ?\n Thanks for asking man\n", "9/16/16, 01:39 - User2: hi,\n I'm good\n How about you ?\n Thanks for asking man\n <media>\n", '9/16/16, 02:02 - User3: Howdy folks!\n', '9/16/16, 02:29 - User2: Awesome \n', '9/16/16, 02:29 - User2: we are all good,\n', '9/16/16, 02:29 - User2: we are all good,\n Thanks for asking\n', '9/16/16, 02:29 - User2: we are all good,\n Thanks for asking\n awesome\n', '7/11/20, 13:00 - Me: <Video> watch this']
I don't need the repeated intermediate entries (the partially built messages) that appear in the output above.
The expected output is:
Output:
['9/16/16, 01:38 - User1: Hi , How you doing\n', "9/16/16, 01:39 - User2: hi,\n I'm good\n How about you ?\n Thanks for asking man\n <media>\n", '9/16/16, 02:02 - User3: Howdy folks!\n', '9/16/16, 02:29 - User2: Awesome \n', '9/16/16, 02:29 - User2: we are all good,\n Thanks for asking\n awesome\n', '7/11/20, 13:00 - Me: <Video> watch this']
Please help me out
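The post does not show startsWithDate, so here is one possible sketch of it. It assumes every timestamp looks like the ones in the sample file (M/D/YY, HH:MM, then " - "); the regular expression and the TIMESTAMP name are assumptions for illustration, not the poster's actual code.

```python
import re

# Assumed format: "M/D/YY, HH:MM - " at the very start of the line,
# e.g. "9/16/16, 01:38 - User1: ...". Adjust the pattern for other exports.
TIMESTAMP = re.compile(r"\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2} - ")


def startsWithDate(line):
    """Return True if the line begins with a timestamp in the assumed format."""
    return TIMESTAMP.match(line) is not None


print(startsWithDate("9/16/16, 01:39 - User2: hi,\n"))  # True
print(startsWithDate("I'm good\n"))                      # False
```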
Collect all lines of a message in intermediate, and append to finalData only when a new timestamp arrives (and there is already data in intermediate):
intermediate = []
finalData = []
file = open('sample.txt', 'r', encoding="utf8")
for lines in file:
    if startsWithDate(lines) and intermediate:
        finalData.append(' '.join(intermediate))
        intermediate.clear()
    intermediate.append(lines)
Hello,
The problem is the placement of line 11 (the append to finalData). That line runs on every iteration, regardless of whether the line started with a date or not.
Apart from that, I would recommend not using "file" as a variable name. It is not a reserved keyword, but it was the name of a built-in type in Python 2, so reusing it can be confusing.
Also, when working with files, it is recommended to use "with" (a context manager), which closes the file for you. You can read more here:
https://docs.python.org/3/tutorial/input...ting-files
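As a rough illustration of the "with" suggestion, the sketch below reads a file inside a context manager. The temporary sample file and the names path, out and chat_file are made up for the example so that it runs on its own.

```python
import os
import tempfile

# Write a tiny sample file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", encoding="utf8") as out:
    out.write("9/16/16, 01:39 - User2: hi,\nI'm good\n")

# chat_file is an arbitrary name. The file is closed when the block exits,
# even if an exception is raised inside it.
with open(path, "r", encoding="utf8") as chat_file:
    lines = [line for line in chat_file]

print(lines)             # the two lines written above
print(chat_file.closed)  # True
```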
Thanks, @mlieqo! That really helps, but I was not able to capture the last item using your method.
Thanks, @j.crater! I hadn't noticed that.
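On the missing last item: the loop only appends a message when the next timestamp arrives, so the final message is still sitting in the list when the file ends. A minimal sketch of one way to handle it is to flush whatever is left after the loop. The function name group_messages, its is_start parameter, and the digit-based stand-in predicate are assumptions for illustration; in the real script you would pass startsWithDate instead.

```python
def group_messages(lines, is_start):
    """Join continuation lines onto the message that precedes them."""
    messages = []
    current = []
    for line in lines:
        # A new message begins: store the one collected so far.
        if is_start(line) and current:
            messages.append(' '.join(current))
            current = []
        current.append(line)
    # Flush the last message, which no later timestamp will trigger.
    if current:
        messages.append(' '.join(current))
    return messages


sample = [
    "9/16/16, 01:39 - User2: hi,\n",
    "I'm good\n",
    "7/11/20, 13:00 - Me: <Video> watch this",
]
# Stand-in predicate for the demo only: "starts with a digit".
print(group_messages(sample, lambda line: line[:1].isdigit()))
```

Because the function accepts any iterable of lines, it can be given the open file object directly.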