Python Forum
More elegant way to remove time from text lines.
#1
I have 10 text files and 10 mp3s

The text files have some lines with no time headers, but most lines look like this:

[11:29.53]Good morning, everyone.

I want all the time cues, like [11:29.53] gone, so I just have the text but no time cues.

I did it like this below, but I think it can be done more elegantly.

Any tips please?

#! /usr/bin/python3
# tidy up text copied from Topway English CD
import os

path = '/home/pedro/Documents/topway/'
files = os.listdir(path)
for f in files:
    print('Files are:', f)
    
file = input('What file are we looking for? Copy and paste 1 file here ... ')

textLoad = open(path + file)
textLoadData = textLoad.readlines()
textLoad.close()

newData = []

for line in textLoadData:
    if line[0] == '[':
        aLineCut = line[10:]
        newData.append(aLineCut)

preparedText = ''.join(newData)
newFile = open(path + file + '_timeless', 'w')
newFile.write(preparedText)
newFile.close()

print('ALL DONE! File saved as ' + path + file + '_timeless')
#2
Some ways.
>>> s = '[11:29.53]Good morning, everyone'
>>> s.partition(']')[2]
'Good morning, everyone'
>>> import re
>>> 
>>> re.sub(r'\[.*]', '', s)
'Good morning, everyone'
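One caveat with the pattern above: `.*` is greedy and the pattern is not anchored, so a line with more brackets later in the text loses more than the time cue. Here is a minimal sketch of a stricter pattern, assuming every cue looks like [mm:ss.xx] and sits at the very start of the line. The names CUE and strip_cue are made up for illustration.

```python
import re

# Assumption: a cue is digits, colon, digits, dot, digits, wrapped in
# square brackets, and it appears only at the start of a line.
CUE = re.compile(r'^\[\d+:\d+\.\d+\]')

def strip_cue(line):
    """Remove one leading time cue; leave everything else untouched."""
    return CUE.sub('', line, count=1)

print(strip_cue('[11:29.53]Good morning, everyone.'))
print(strip_cue('[11:29.53]See [page 3] for details.'))
print(strip_cue('A line with no cue.'))
```

Lines without a cue pass through unchanged, and brackets inside the sentence are left alone.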

On naming: under PEP 8, a variable like textLoadData would be written text_load_data. A song at the right moment: The PEP 8 Song 🎵

f-string
print('ALL DONE! File saved as ' + path + file + '_timeless')
print(f'ALL DONE! File saved as {path}{file}_timeless')
#3
Thanks, that's much better!

I never heard of .partition, but I suppose .split(']') would do the same job. Didn't think of that!
#4
Why use Python at all? Command line tools like sed are made for this sort of thing. Regular expressions (aka "regex") as shown above are also worth knowing something about. An example with sed:

Output:
$ cat test
[11:29.53]Good morning, everyone.
$ sed -i 's/\[.*\]//' test
$ cat test
Good morning, everyone.
To explain a bit:

sed allows you to edit lines of text with various commands. The command being used here is s, for "substitute". For each line in the file test, we substitute the thing between the first pair of / (that is, the string that matches the regular expression \[.*\]; square brackets are meaningful in regex, hence the need to escape them) with the thing between the second pair (i.e. the empty string). The -i option acts on the file in place (without it, the results are just printed to standard output, so you could redirect them to another file if you wanted).
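For anyone who finds the sed one-liner cryptic, here is a rough Python equivalent of what the s command does to each line. It is only a sketch: sed_s is a made-up name, and it mirrors the substitution alone, not the -i in-place rewrite.

```python
import re

def sed_s(lines, pattern=r'\[.*\]', replacement=''):
    """Apply one substitution per line, like sed's s/pattern/replacement/."""
    # count=1 mirrors sed without the g flag: only the first match is replaced.
    return [re.sub(pattern, replacement, line, count=1) for line in lines]

lines = ['[11:29.53]Good morning, everyone.\n', 'No cue on this line.\n']
for line in sed_s(lines):
    print(line, end='')
```

Like sed without -i, this only produces the edited lines. Writing them back to a file is a separate step.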

The Grymoire has a sed tutorial here.
#5
(Apr-25-2021, 12:03 AM)Pedroski55 Wrote: I never heard of .partition, but I suppose .split(']') would do the same job. Didn't think of that!
Yes, I used partition just to show it, since it is more rarely used.
>>> s = '[11:29.53]Good morning, everyone'
>>> s.split(']')[-1]
'Good morning, everyone'
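The two are not quite interchangeable, though. On a line with no ']' at all, partition(']')[2] returns an empty string, so lines without a time cue would vanish. And split(']')[-1] keeps only what follows the last ']', so brackets inside the sentence cut the text short. A small sketch comparing them, with a third variant that splits only once and only when the line starts with '[' (all three function names are made up for this example):

```python
def cut_partition(line):
    # Text after the first ']'; empty string if there is no ']'.
    return line.partition(']')[2]

def cut_split_last(line):
    # Text after the last ']'; whole line if there is no ']'.
    return line.split(']')[-1]

def cut_split_once(line):
    # Split at the first ']' only, and only for lines that start with '['.
    return line.split(']', 1)[-1] if line.startswith('[') else line

samples = [
    '[11:29.53]Good morning, everyone',
    '[11:29.53]See [page 3] now',
    'No cue here',
]
for s in samples:
    print(repr(cut_partition(s)), repr(cut_split_last(s)), repr(cut_split_once(s)))
```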
So a more elegant solution could look like this.
Note that neither readlines() nor close() is used.
Iterating over the file object reads it one line at a time, and with open() closes the files automatically.
So only one line at a time is read into memory, not the whole file.
with open("in.txt") as f, open('out.txt', 'w') as f_out:
    for line in f:
        line = line.split(']')[-1]
        f_out.write(line)

The same with regex:
import re

with open("in.txt") as f, open('out.txt', 'w') as f_out:
    for line in f:
        line = re.sub(r'\[.*]', '', line)
        f_out.write(line)
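Since the original question mentions 10 text files, the same loop can be wrapped so it runs over a whole folder. This is only a sketch under some assumptions that are not in the thread: the files end in .txt, they are UTF-8, and each cue looks like [mm:ss.xx] at the start of a line. The names CUE, strip_file and strip_folder are invented for the example.

```python
import re
from pathlib import Path

# Assumed cue shape: [mm:ss.xx] at the start of the line.
CUE = re.compile(r'^\[\d+:\d+\.\d+\]')

def strip_file(src, dst):
    """Copy src to dst line by line, dropping a leading time cue from each line."""
    with open(src, encoding='utf-8') as f, open(dst, 'w', encoding='utf-8') as f_out:
        for line in f:
            f_out.write(CUE.sub('', line, count=1))

def strip_folder(folder, pattern='*.txt', suffix='_timeless'):
    """Process every matching file in folder and return the paths written."""
    written = []
    # sorted() lists the files up front, so new output files are not picked up.
    for src in sorted(Path(folder).glob(pattern)):
        if src.stem.endswith(suffix):
            continue  # skip output left over from an earlier run
        dst = src.with_name(src.stem + suffix + src.suffix)
        strip_file(src, dst)
        written.append(dst)
    return written
```

Calling strip_folder('/home/pedro/Documents/topway/') would write a *_timeless.txt copy next to each input file. Unlike the script in the first post, lines without a time cue are kept rather than dropped.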
#6
Thanks again!

@Da Bishop: I have seen sed in action, but it seems so cryptic, only robots can understand it! I am not R2D2!

But thanks for the link to sed, I will see if I can use it sometime, somewhere. I already have trouble with re!

@snippsat: Thank you, that looks like I think it should look like, but I can't write!!

Very grateful to you both!!
#7
There is a Python-friendly version of sed: sd. With it, this can be written as:

> sd -p '\[.*\]' '' test
The -p flag means preview, i.e. it does not change the file but shows you the result. If the result is as expected, omit the flag to make the actual change.