Python Forum

Full Version: where is a pattern?
i have a string pattern (such as '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].[0-9][0-9][0-9]') that may be in another string. i'm still baffled by the re module (i guess i just can't get into a perl frame of mind). i want to know the start and end position where the matched substring is. but the .start and .end methods in a match object want an argument that makes no sense to me. who knows how to use this?

even better would be a function that can find a date and time substring in any date and time format (even one that is ambiguous between day and month) and return three parts: the text before the date and time, the date and time substring itself, and the text after it.
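One way such a function might look, as a rough sketch only: it tries a short list of regular expressions in order and splits the string around the first one that matches. The names `DATE_TIME_PATTERNS` and `split_on_date_time` and the three formats listed are assumptions made for illustration, not an existing library API, and the list is nowhere near "any date and time format".

```python
import re

# Hypothetical list of formats to try, most specific first; extend as needed.
DATE_TIME_PATTERNS = [
    r"\d{4}-\d{2}-\d{2}[ T.]\d{2}:?\d{2}:?\d{2}",  # date plus 6-digit time
    r"\d{4}-\d{2}-\d{2}",                          # date only, year first
    r"\d{2}/\d{2}/\d{4}",                          # day/month order is ambiguous
]

def split_on_date_time(text, patterns=DATE_TIME_PATTERNS):
    """Return (before, matched, after) for the first pattern in the list
    that matches anywhere in text, or None if no pattern matches."""
    for pattern in patterns:
        m = re.search(pattern, text)
        if m:
            # m.start() and m.end() need no argument for the whole match.
            return text[:m.start()], m.group(), text[m.end():]
    return None

print(split_on_date_time("report 2019-06-12.010203 final.txt"))
```

Note that "first" here means first in list order, not leftmost in the string, so the order of the patterns matters.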
So you need a better strptime function (string parse time function) ?

https://docs.python.org/3.9/library/date...e-behavior
strptime() is a poor implementation, even in C. it can't deal well with a chance format and it can't find the date and time in a string (inside text in that string).
What does your pattern mean ?
The first part looks like a date, but the second ?

[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].[0-9][0-9][0-9]
2019-06-12.123
Skaperen Wrote:strptime() is a poor implementation, even in C. it can't deal well with a chance format and it can't find the date and time in a string (inside text in that string).
You may want to try the dateutil module's time parser.
(Jun-07-2019, 06:55 AM)heiner55 Wrote: What does your pattern mean ?
The first part looks like a date, but the second ?

[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].[0-9][0-9][0-9]
2019-06-12.123

it's part of a time. it would be 6 digits but the existence of 3 is all that is needed to be sure it's not one of the other patterns.
Ok, I understood.
That pattern is way longer than it needs to be. Have you looked into basic regex before?
Quick test.
>>> import re
>>> 
>>> text = "foo 2019-06-12.123 bar"
>>> r = re.search(r"\d{4}-\d{2}-\d{2}\.\d{3}", text).group()
>>> r
'2019-06-12.123'
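For the positions asked about in the first post, the match object can be kept instead of calling `.group()` right away. A small sketch: the argument to `.start()`, `.end()` and `.span()` is an optional group number, and leaving it out gives the position of the whole match. The parentheses in the pattern are added here only to have groups to show.

```python
import re

text = "foo 2019-06-12.123 bar"
m = re.search(r"(\d{4})-(\d{2})-(\d{2})\.(\d{3})", text)

if m is not None:                    # re.search returns None when nothing matches
    print(m.start(), m.end())        # whole match: 4 18
    print(m.span())                  # same thing as a tuple: (4, 18)
    print(text[m.start():m.end()])   # 2019-06-12.123
    print(m.span(1))                 # group 1, the year: (4, 8)
```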
If it's a common, valid date format, you can parse it with dateutil, as mentioned by @Gribouillis.
I like pendulum best; I find it the most correct date tool made for Python in recent years.
>>> import pendulum
>>> 
>>> d = '2019-06-12'
>>> pendulum.parse(d)
DateTime(2019, 6, 12, 0, 0, 0, tzinfo=Timezone('UTC'))
Auto-parsing fails on 2019-06-12.123, but you can write your own format with from_format().
>>> import pendulum
>>> 
>>> dt = pendulum.from_format('2019-06-12.123', 'YYYY-DD-MM.hms')
>>> dt
DateTime(2019, 12, 6, 1, 2, 3, tzinfo=Timezone('UTC'))
As you see, pendulum does this way better than strptime().
i had match bugs with that pattern and rewrote the code this afternoon using a custom pattern format just for this case. if the pattern had a '0' it tested the character with .isdecimal(), else it compared the character to the pattern character. i had about 3100 files with a name that had the date+time on the end, followed by some other stuff in many cases, with the original name at the front. i wanted to transpose each file's original name and date+time to be date+time then original name. some had various time formats, just to complicate things more. some had original names with numbers in them, even dates. but it's done now.
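The custom pattern format described above could be sketched roughly like this. It is a guess at the idea, not the code that was actually used: the function name `find_pattern` and the `from_end` option (added because the date+time sat near the end of names that could contain other dates) are assumptions.

```python
def find_pattern(text, pattern, from_end=False):
    """Return (start, end) of a place where pattern fits in text, or None.

    A '0' in pattern stands for any decimal digit; every other character
    has to match literally. With from_end=True the rightmost fit is returned.
    """
    n = len(pattern)
    starts = range(len(text) - n + 1)   # empty if text is shorter than pattern
    if from_end:
        starts = reversed(starts)
    for start in starts:
        window = text[start:start + n]
        if all(c.isdecimal() if p == "0" else c == p
               for c, p in zip(window, pattern)):
            return start, start + n
    return None

name = "notes-2018-01-01-2019-06-12.123456.txt"
print(find_pattern(name, "0000-00-00.000", from_end=True))   # (17, 31)
```

With the start and end in hand, the name can be cut into the part before, the date+time, and the part after, and reassembled in whatever order the rename needs.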