Python Forum
regex to extract only yy or yyyy
#1
Dear Python/Regex Experts,

I have two regex patterns that I use in Python that need a little improvement.

1. #m/d/yy month in Digits e.g. 1/2/98
pattern1 = r'(\d{1}/\d{1}/\d{2})'

I need an extra condition: no further digits may follow the final yy digits.
If more digits do follow, the text is either covered by a different pattern or is not actually a date.

2. #yyyy e.g. 1984
pattern2 = r'(\d{4})'

For the second pattern, I need to make sure that the year stands alone and has no more digits before or after.

I would really appreciate any help.
#2
You should post what you have tried (as working code), along with a test example showing the input and the wanted output.
The text you work with could be a mess, or it could be more structured.
(Jul-11-2018, 11:29 AM)metalray Wrote: For the second pattern, I need to make sure that the year stands alone and has no more digits before or after.
>>> import re
>>> 
>>> s = '1980 100 18000 2000 112 2018'
>>> re.findall(r'(?<!\d)\d{4}(?!\d)', s)
['1980', '2000', '2018']

# only two-digit numbers
>>> s = '19 100 18000 20 1234 1 55'
>>> re.findall(r'(?<!\d)\d{2}(?!\d)', s)
['19', '20', '55']
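
The same lookarounds can be applied to the m/d/yy pattern from the first post. This is only a sketch: the sample string and the helper name find_short_dates are made up for illustration, and it assumes that a digit directly before the month should also disqualify a match (the original post only asks about digits after the year).

```python
import re

# m/d/yy: one-digit month, one-digit day, two-digit year.
# (?<!\d) rejects a digit right before the month,
# (?!\d) rejects a digit right after the year.
PATTERN1 = re.compile(r'(?<!\d)\d/\d/\d{2}(?!\d)')


def find_short_dates(text):
    """Return every m/d/yy substring that is not part of a longer digit run."""
    return PATTERN1.findall(text)


# Assumed sample text, not from the original post.
sample = 'born 1/2/98, not 1/2/1998 or 11/2/98, but 3/4/05 counts'
print(find_short_dates(sample))  # ['1/2/98', '3/4/05']
```

Here 1/2/1998 is skipped because a digit follows the first two year digits, and 11/2/98 is skipped because of the lookbehind. Drop the lookbehind if two-digit months should be picked up from their second digit.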
#3
{1} is an absolutely redundant modifier - \d and \d{1} are equivalent, so why would you want to add extra symbols? It is just wasteful.

{,1} is another issue - but again, ? does the same, only in one symbol instead of 4.

For two digits, \d\d is shorter than \d{2}, so I would still go for the former (but that is a matter of taste).

RE is complex enough without adding redundancies.
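
The equivalences above are easy to check in a shell. A minimal sketch, where the sample string and the helper name same_matches are assumptions made for illustration:

```python
import re


def same_matches(pattern_a, pattern_b, text):
    """True if both patterns find exactly the same substrings in text."""
    return re.findall(pattern_a, text) == re.findall(pattern_b, text)


# Assumed sample text, not from the original post.
sample = '1/2/98 and 12/3/04'

# \d{1} versus \d, and \d{2} versus \d\d
print(same_matches(r'\d{1}/\d{1}/\d{2}', r'\d/\d/\d\d', sample))  # True

# {,1} versus ?
print(same_matches(r'\d{,1}/\d', r'\d?/\d', sample))  # True
```

Agreement on one sample string does not prove equivalence in general, but it is a quick sanity check before swapping one spelling for the other.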
Test everything in a Python shell (iPython, Azure Notebook, etc.)
  • Someone gave you advice you liked? Test it - maybe the advice was actually bad.
  • Someone gave you advice you think is bad? Test it before arguing - maybe it was good.
  • You posted a claim that something you did not test works? Be prepared to eat your hat.