Python Forum
More elegant way to remove time from text lines.
#1
I have 10 text files and 10 mp3s

The text files have some lines with no time headers, but most lines look like this:

[11:29.53]Good morning, everyone.

I want all the time cues, like [11:29.53] gone, so I just have the text but no time cues.

I did it like this below, but I think it can be done more elegantly.

Any tips please?

#! /usr/bin/python3
# tidy up text copied from Topway English CD
import os

path = '/home/pedro/Documents/topway/'
files = os.listdir(path)
for f in files:
    print('Files are:', f)
    
file = input('What file are we looking for? Copy and paste 1 file here ... ')

textLoad = open(path + file)
textLoadData = textLoad.readlines()
textLoad.close()

newData = []

for line in textLoadData:
    if line[0] == '[':
        aLineCut = line[10:]
        newData.append(aLineCut)

preparedText = ''.join(newData)
newFile = open(path + file + '_timeless', 'w')
newFile.write(preparedText)
newFile.close()

print('ALL DONE! File saved as ' + path + file + '_timeless')
#2
Some ways.
>>> s = '[11:29.53]Good morning, everyone'
>>> s.partition(']')[2]
'Good morning, everyone'
>>> import re
>>> 
>>> re.sub(r'\[.*]', '', s)
'Good morning, everyone'
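One thing to watch with the partition one-liner: on a line that has no time cue (the first post says some lines have none), partition(']')[2] returns an empty string, so the text of that line would be lost. A minimal sketch of a guarded version follows; strip_cue is a made-up helper name, not something from the thread, and it assumes a cue always starts at the first character of the line.

```python
def strip_cue(line):
    """Return line without a leading [..] cue; lines without a cue pass through."""
    if line.startswith('['):
        _head, sep, rest = line.partition(']')
        if sep:  # a closing bracket was actually found
            return rest
    return line


samples = [
    '[11:29.53]Good morning, everyone.',
    'A line with no time cue.',
]
for s in samples:
    # left: the bare one-liner, right: the guarded helper
    print(repr(s.partition(']')[2]), '->', repr(strip_cue(s)))
```

The second sample prints '' on the left and the untouched line on the right, which is the difference the guard is there for.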

textLoadData: see PEP 8 on naming. A song at the right moment: The PEP 8 Song 🎵

With an f-string, the first line below can be written as the second:
print('ALL DONE! File saved as ' + path + file + '_timeless')
print(f'ALL DONE! File saved as {path}{file}_timeless')
#3
Thanks, that's much better!

I never heard of .partition, but I suppose .split(']') would do the same job. Didn't think of that!
#4
Why use Python at all? Command line tools like sed are made for this sort of thing. Regular expressions (aka "regex") as shown above are also worth knowing something about. An example with sed:

Output:
$ cat test
[11:29.53]Good morning, everyone.
$ sed -i 's/\[.*\]//' test
$ cat test
Good morning, everyone.
To explain a bit:

sed allows you to edit lines of text with various commands. The command being used here is s, for "substitute". For each line in the file test, we substitute the thing between the first pair of / (that is, the string that matches the regular expression \[.*\]; square brackets are meaningful in regex, hence the need to escape them) with the thing between the second pair (i.e. the empty string). The -i option acts on the file in place. Without it, the results are just printed to standard output, so you could redirect them to another file if you wanted.

The Grymoire has a sed tutorial here.
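For anyone who wants to compare, the same substitution can be mimicked in Python. This is only a sketch: sed_substitute is a hypothetical helper name, and count=1 is there to mirror the fact that sed's s command without the g flag replaces only the first match on each line. It also shows that .* is greedy, so the pattern runs to the last ] on the line; a negated character class stops at the first one.

```python
import re


def sed_substitute(pattern, replacement, line):
    # count=1: replace only the first match, like sed's s/// without the g flag
    return re.sub(pattern, replacement, line, count=1)


# Same pattern as in the sed example.
print(sed_substitute(r'\[.*\]', '', '[11:29.53]Good morning, everyone.'))

# Greedy .* swallows everything up to the LAST closing bracket on the line.
print(repr(sed_substitute(r'\[.*\]', '', '[00:05.10]See item [2] below.')))

# [^\]]* cannot cross a closing bracket, so only the first [...] is removed.
print(sed_substitute(r'\[[^\]]*\]', '', '[00:05.10]See item [2] below.'))
```

If the text never contains square brackets of its own, the simpler pattern is fine as it is.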
#5
(Apr-25-2021, 12:03 AM)Pedroski55 Wrote: I never heard of .partition, but I suppose .split(']') would do the same job. Didn't think of that!
Yes, I just wanted to show partition, which is more rarely used.
>>> s = '[11:29.53]Good morning, everyone'
>>> s.split(']')[-1]
'Good morning, everyone'
So a more elegant solution could look like this.
Note that there is no readlines() or close().
We iterate over the file object, and with open() closes it automatically.
This way only one line at a time is read into memory, not the whole file.
with open("in.txt") as f, open('out.txt', 'w') as f_out:
    for line in f:
        line = line.split(']')[-1]
        f_out.write(line)

Or with a regex:
import re

with open("in.txt") as f, open('out.txt', 'w') as f_out:
    for line in f:
        line = re.sub(r'\[.*]', '', line)
        f_out.write(line)
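Two edge cases in the versions above may matter for real files: split(']')[-1] keeps only what follows the last ], and the greedy \[.*] removes everything from the first [ to the last ], so any square brackets in the text itself would eat part of the line. A sketch that anchors the pattern to the start of the line avoids both and leaves lines without a cue alone. It assumes every cue has the shape [11:29.53]; the names TIME_CUE and strip_time_cues and the utf-8 encoding are assumptions for illustration, not something from the posts above.

```python
import re
import tempfile
from pathlib import Path

# Assumption: every cue looks like [11:29.53] and sits at the very start of the line.
TIME_CUE = re.compile(r'^\[\d+:\d+\.\d+\]')


def strip_time_cues(src, dst):
    """Copy src to dst line by line, removing a leading time cue where present."""
    with open(src, encoding='utf-8') as f, open(dst, 'w', encoding='utf-8') as f_out:
        for line in f:
            f_out.write(TIME_CUE.sub('', line))


# Small demo in a temporary directory, so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / 'in.txt'
    dst = Path(tmp) / 'out.txt'
    src.write_text(
        '[11:29.53]Good morning, everyone.\n'
        'A line with no time cue.\n'
        '[11:31.02]See item [2] below.\n',
        encoding='utf-8',
    )
    strip_time_cues(src, dst)
    print(dst.read_text(encoding='utf-8'))
```

Like the loops above, it still reads one line at a time, so file size is not a concern.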
#6
Thanks again!

@Da Bishop: I have seen sed in action, but it seems so cryptic that only robots can understand it! I am not R2D2!

But thanks for the link to sed, I will see if I can use it sometime, somewhere. I already have trouble with re!

@snippsat: Thank you, that looks the way I think it should look, but I can't write it like that myself!!

Very grateful to you both!!
#7
There is a Python-friendly version of sed: sd. With it, this can be written as:

> sd -p '\[.*\]' '' test
The -p flag means preview, i.e. it will not change the file, but you can see the result. If the result is as expected, the flag can be omitted and the actual change made.