Python Forum
Help with a regex? (solved) - Printable Version


Help with a regex? (solved) - wrybread - May-01-2023

I'm trying to isolate the number in a filename that can look like these:

DJI_0991.MP4
DJI_0992 - testing.MP4

The pattern is that it always starts with DJI_ and ends with either a period or a space.

Never mind, solved! I don't know why this was giving me so much trouble. On the off chance anyone else comes this way:

import re

filename = "DJI_0991.MP4"

# Raw string so that \W reaches the regex engine unchanged
file_number = re.findall(r'DJI_(.+?)\W', filename)

print(file_number)
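
For anyone checking this against both filename shapes from the question, here is a minimal sketch. The `extract_number` helper name is made up for illustration and is not from the original post; it returns None when nothing matches instead of an empty list.

```python
import re

def extract_number(filename):
    """Return the text between 'DJI_' and the first non-word character, or None."""
    # Hypothetical helper: same pattern as above, but with re.search
    match = re.search(r"DJI_(.+?)\W", filename)
    return match.group(1) if match else None

for name in ["DJI_0991.MP4", "DJI_0992 - testing.MP4"]:
    print(name, "->", extract_number(name))
```

Note that `\W` needs some non-word character after the number, so a bare "DJI_0991" with no extension would not match. An underscore counts as a word character, so it would not end the match either.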



RE: Help with a regex? (solved) - deanhystad - May-01-2023

It probably gave you trouble because of the non-greedy matching. Another possibility is that you used a non-raw string, so a backslash sequence was read as a control character instead of reaching the regex engine as a backslash.
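
A rough illustration of both points, as a sketch rather than a diagnosis of what actually went wrong. The variable names `lazy` and `greedy` are only for this example.

```python
import re

filename = "DJI_0992 - testing.MP4"

# Non-greedy (.+?) stops at the first non-word character.
lazy = re.findall(r"DJI_(.+?)\W", filename)
# Greedy (.+) runs to the end and backtracks to the last non-word character.
greedy = re.findall(r"DJI_(.+)\W", filename)
print(lazy)    # ['0992']
print(greedy)  # ['0992 - testing']

# Without the r prefix, "\b" is one backspace character, not a word boundary.
print(len("\b"), len(r"\b"))           # 1 2
print(re.findall("\bDJI", filename))   # []
print(re.findall(r"\bDJI", filename))  # ['DJI']
```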

Why not grab all the digits instead of looking for the character after the digits?
import re

pattern = re.compile(r'DJI_(\d+)')

matches = re.findall(pattern, 'DJI_0991.MP4 DJI_0992 - testing.MP4 JI_0993 - testing.MP4 DJI 0994.MP4')

print(matches)
Output:
['0991', '0992']



RE: Help with a regex? (solved) - wrybread - May-01-2023

Because sometimes there are other digits in the filenames too, like "DJI_0090 - part 1.MP4"


RE: Help with a regex? (solved) - deanhystad - May-01-2023

I don't understand. In the example below should it print 0090 or not?
import re

pattern = re.compile(r'DJI_(\d+)')

matches = re.findall(pattern, 'DJI_0991.MP4 DJI_0992 - testing.MP4 DJI_0090 - part 1.MP4')

print(matches)
Output:
['0991', '0992', '0090']
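
To spell out why the later digits should not interfere: `\d+` only consumes the run of digits directly after `DJI_` and stops at the first non-digit, so the "1" in "part 1" is never part of the capture. A small per-file sketch, assuming one filename per string; the `names` list and the "notes.txt" entry are made up for illustration.

```python
import re

pattern = re.compile(r"DJI_(\d+)")

# Hypothetical sample filenames; "notes.txt" shows the no-match case.
names = ["DJI_0991.MP4", "DJI_0992 - testing.MP4", "DJI_0090 - part 1.MP4", "notes.txt"]

for name in names:
    match = pattern.search(name)
    # search returns None when the name has no DJI_<digits> part
    number = match.group(1) if match else None
    print(name, "->", number)
```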