Removing timestamps from transcriptions

jehoshua · Dec-09-2018, 09:27 AM

(Dec-05-2018, 12:32 PM)metulburr Wrote: You could do it with regex and then just replace all double spaces with a single space. Or you can make a function to find all colons, and then remove 3 characters before and 2 characters afterwards.

There could be double spaces or colons in the transcript that are not part of a timestamp though, so that might be a bit risky?
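One way around that risk, as a rough sketch: match the whole timestamp together with the single space that follows it, so other colons and double spaces are never touched. The name strip_timestamps and the sample text are made up for illustration, and the pattern assumes every timestamp is two digits, a colon, two digits, then one space.

```python
import re

# Assumed shape: two digits, a colon, two digits, then exactly one space.
TIMESTAMP = re.compile(r'\b\d{2}:\d{2} ')

def strip_timestamps(text):
    """Remove mm:ss timestamps along with the single space after each one."""
    return TIMESTAMP.sub('', text)

sample = "00:05 Note: this is fine.  01:10 Two spaces stay."
print(strip_timestamps(sample))
```

The colon in "Note:" and the double space after "fine." are left alone because neither is part of a digit-colon-digit match.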

(Dec-05-2018, 02:49 PM)DeaD_EyE Wrote: Problem: 99:99 is also a valid match.

A better pattern:
timestamp = r'[012][0123456789]:[012345][0123456789] '
timestamp = r'[0-2]\d:[0-5]\d ' # short form
But this also allows values in timestamps like 25:00, which is an invalid time.

You can check each timestamp and remove it only if it is valid.
The question is: do you need that?

I tried both of your solutions and they both worked. I had a very quick check through the timestamps with some searching, and it seems 78:01 is the highest value. There are lots of values where the seconds value is '00'. The format is not hh:mm:ss but mm:ss, so a value like 25:00 is okay.

I'm not sure whether there are values like 25:60, but I would need to check, as you stated.
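A quick way to do that check might look like the sketch below. It scans the text for anything shaped like mm:ss and reports the highest minutes value plus any match whose seconds are above 59. The function name survey_timestamps and the sample string are hypothetical, and it assumes two-digit minutes and seconds.

```python
import re

# Assumed shape: two-digit minutes, a colon, two-digit seconds.
MMSS = re.compile(r'\b(\d{2}):(\d{2})\b')

def survey_timestamps(text):
    """Return (highest minutes value, list of matches with seconds > 59)."""
    max_minutes = -1  # stays -1 if no timestamp is found
    suspicious = []
    for match in MMSS.finditer(text):
        minutes, seconds = int(match.group(1)), int(match.group(2))
        max_minutes = max(max_minutes, minutes)
        if seconds > 59:
            suspicious.append(match.group(0))
    return max_minutes, suspicious

print(survey_timestamps("00:00 hi 25:60 there 78:01 end"))
```

If the second item comes back empty for the real transcript, there is nothing like 25:60 to worry about.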

(Dec-05-2018, 03:24 PM)buran Wrote: I guess timestamps are min:sec from start, not time like hh:mm, but it's up to OP to confirm that. More broadly, it raises the question of what the possible values are, e.g. is it possible to have mmm:ss from start.

Yes, the format is min:sec from start, and the highest value is 78:01, so only 2 digits for the minutes. I guess this is a case of modifying the code to suit the data.
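As a sketch of what suiting the code to the data could look like: the remover below takes the number of minute digits as a parameter (2 for this transcript, 3 if a longer recording ever needed mmm:ss) and only deletes a match when the seconds are 59 or less. The name make_timestamp_remover and its minute_digits parameter are invented for this example, not taken from anyone's posted solution.

```python
import re

def make_timestamp_remover(minute_digits=2):
    """Build a function that strips min:sec timestamps from text.

    minute_digits is the largest number of minute digits to accept
    (assumed to be 2 here, since the highest value seen was 78:01).
    """
    pattern = re.compile(r'\b(\d{2,%d}):(\d{2}) ' % minute_digits)

    def _replace(match):
        seconds = int(match.group(2))
        # Leave out-of-range values in place; they are probably not timestamps.
        return '' if seconds <= 59 else match.group(0)

    def remove(text):
        return pattern.sub(_replace, text)

    return remove

remove = make_timestamp_remover()
print(remove("78:01 last words 25:60 kept"))
```

Calling make_timestamp_remover(3) would also accept three-digit minutes such as 105:30, which the default two-digit version leaves untouched.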

Thanks for those replies. :)
