Python Forum
regex to extract only yy or yyyy



regex to extract only yy or yyyy - metalray - Jul-11-2018

Dear Python/Regex Experts,

I have two regex patterns that I use in Python that need a little improvement.

1. #m/d/yy month in Digits e.g. 1/2/98
pattern1 = r'(\d{1}/\d{1}/\d{2})'

I need an extra condition: no further digits may follow the final yy digits.
If any do, the text is either covered by a different pattern or is not actually a date.

2. #yyyy e.g. 1984
pattern2 = '(\d{4})'

For the second pattern, I need to make sure that the year stands alone and has no more digits before or after.

I would really appreciate any help.


RE: regex to extract only yy or yyyy - snippsat - Jul-11-2018

You should post what you have tried (as working code), along with a test example showing the input and the wanted output.
The text you work with can be a mess, or it can be more structured.
(Jul-11-2018, 11:29 AM)metalray Wrote: For the second pattern, I need to make sure that the year stands alone and has no more digits before or after.
>>> import re
>>> 
>>> s = '1980 100 18000 2000 112 2018'
>>> re.findall(r'(?<!\d)\d{4}(?!\d)', s)
['1980', '2000', '2018']

# only 2 digits
>>> s = '19 100 18000 20 1234 1 55'
>>> re.findall(r'(?<!\d)\d{2}(?!\d)', s)
['19', '20', '55']
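
The same lookaround idea can probably be carried over to the first pattern (m/d/yy) from the question. The following is only a sketch: the name `M_D_YY` and the sample string are made up for illustration, and it assumes single-digit month and day, as in the original pattern.

```python
import re

# Hypothetical adaptation of the lookarounds above to the m/d/yy pattern.
# (?<!\d) rejects a match that has a digit right before it,
# (?!\d) rejects a match that has a digit right after the yy part.
M_D_YY = re.compile(r'(?<!\d)\d/\d/\d{2}(?!\d)')

# Made-up sample text: two real m/d/yy dates, one yyyy date, one mm/d/yy date.
sample = '1/2/98 3/4/1998 5/6/07 11/2/98'

matches = M_D_YY.findall(sample)
print(matches)  # ['1/2/98', '5/6/07']
```

Here `3/4/1998` is skipped because more digits follow the first two year digits, and `11/2/98` is skipped because a digit precedes the single-digit month. Whether that second case should count as a date depends on the other patterns in use.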



RE: regex to extract only yy or yyyy - volcano63 - Jul-11-2018

{1} is an absolutely redundant modifier - \d and \d{1} are equivalent, so why would you want to add extra symbols? It is just wasteful.

{,1} is another issue - but again, ? does the same, only in one symbol instead of 4.

For 2 digits, e.g. \d\d is shorter than \d{2}, so I will still go for the former (but that is a matter of taste).

RE is complex enough without adding redundancies.
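
A quick way to check these equivalences is to compare the results of the longer and shorter spellings. This is just a sketch with made-up sample strings and variable names:

```python
import re

text = 'a1 b22 c333'          # made-up sample input
words = 'color colour'        # made-up sample input

# \d{1} matches exactly what \d matches.
same_single = re.findall(r'\d{1}', text) == re.findall(r'\d', text)

# \d{2} matches exactly what \d\d matches.
same_double = re.findall(r'\d{2}', text) == re.findall(r'\d\d', text)

# {,1} means "zero or one", which is what ? means.
same_optional = re.findall(r'colou{,1}r', words) == re.findall(r'colou?r', words)

print(same_single, same_double, same_optional)  # True True True
```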