Python Forum
date validation
#1
Hi,

I started learning Python not long ago. I have a program that searches for dates in a string (this part works OK) and then checks that each date is correct (e.g. February cannot have 30 days). I know that there is the datetime module, but I wanted to do the check on my own. Please note that February has 29 days in leap years. A leap year is every year evenly divisible by 4, except for years evenly divisible by 100, unless the year is also evenly divisible by 400.

import re
 
dates = []
eachDate = []
text = '30/02/2000, 29/02/2000, 30/02/2100, 29/02/2100, 31/02/2004, 30/02/2004, 29/02/2004'
 
 
dateRegex = re.compile(r'''(
(0\d|1\d|2\d|30|31)   #day
(/)
(0\d|10|11|12)   #month
(/)
(1\d\d\d|2\d\d\d)
)''', re.VERBOSE) 
 
for groups in dateRegex.findall(text):
    eachDate = []
    eachDate.append(groups[1])
    eachDate.append(groups[3])
    eachDate.append(groups[5])
    dates.append(eachDate)
print(dates)
 
for item in dates:
    print(item)
    if item[1] in ('04', '06', '09', '11'):
        if item[0] == '31':
            dates.remove(item)
    elif item[1] == '02':
        if item[0] == '30':
            dates.remove(item)
        elif item[0] == '31':
            dates.remove(item)
        elif item[0] == '29':
            if int(item[2]) % 400 == 0:
                continue
            elif int(item[2]) % 100 == 0:
                dates.remove(item)
            elif int(item[2]) % 4 == 0:
                continue
            else:
                dates.remove(item)
 
print(dates)
results:
[['30', '02', '2000'], ['29', '02', '2000'], ['30', '02', '2100'], ['29', '02', '2100'], ['31', '02', '2004'], ['30', '02', '2004'], ['29', '02', '2004']]
['30', '02', '2000']
['30', '02', '2100']
['31', '02', '2004']
['29', '02', '2004']
[['29', '02', '2000'], ['29', '02', '2100'], ['30', '02', '2004'], ['29', '02', '2004']]

The list dates is correct (7 elements), but the for loop only visits four of them (I added an extra print because I got confused). The final result is a list with 4 dates, but it contains mistakes, e.g. ['30', '02', '2004']. What did I do wrong?
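
The output above is consistent with calling remove() on a list while a for loop is still iterating over it: each removal shifts the remaining items one slot to the left, so the item that slides into the freed slot is never visited. A minimal sketch of that effect, and of one common way around it (building a new list rather than changing the one being looped over). The names days, is_bad, visited and kept are made up for illustration:

```python
# Illustration only: 'days', 'is_bad', 'visited' and 'kept' are hypothetical names.
def is_bad(day):
    """Treat day 30 and 31 as invalid (think: February)."""
    return day in ('30', '31')


# Removing while iterating: the item after each removed one is skipped.
days = ['30', '31', '29']
visited = []
for day in days:
    visited.append(day)
    if is_bad(day):
        days.remove(day)

print(visited)  # '31' is never visited
print(days)     # so '31' is still in the list

# Building a new list leaves the list being iterated untouched.
original = ['30', '31', '29']
kept = [day for day in original if not is_bad(day)]
print(kept)
```

Iterating over a copy (for item in dates[:]) would be another option; the list comprehension is used here only because it avoids mutation altogether.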
#2
To validate the days in a month:
Note: I wrote this without testing, so check for typos.
  • split the date on '/'
  • create a list with the number of days in each month
    daymon = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
  • check whether the year is a leap year
        def is_leap(year):
            # returns 1 for a leap year, otherwise 0
            year = int(year)  # the split parts are strings
            retval = 0
            if ((not year % 400) or ((not year % 4) and (year % 100))):
                retval = 1
            return retval
        
  • look up the days in the month, adding one day for February in a leap year
        n = int(month) - 1
        days_in_month = daymon[n]
        if n == 1:
            days_in_month = daymon[n] + is_leap(year)
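
The steps above could be put together roughly as follows. This is a sketch without datetime, assuming the dates are written as DD/MM/YYYY and separated by ', ' as in the original post; days_in_month and is_valid_date are helper names invented here, and the result is built as a new list instead of removing items from the list being iterated.

```python
# Sketch only: days_in_month and is_valid_date are hypothetical helper names.
DAYS_PER_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def is_leap(year):
    """Return True if year (an int) is a leap year."""
    return year % 400 == 0 or (year % 4 == 0 and year % 100 != 0)


def days_in_month(month, year):
    """Number of days in month (1-12) of the given year."""
    if month == 2 and is_leap(year):
        return 29
    return DAYS_PER_MONTH[month - 1]


def is_valid_date(date_string):
    """Check a 'DD/MM/YYYY' string without using datetime."""
    parts = date_string.split('/')
    if len(parts) != 3 or not all(part.isdecimal() for part in parts):
        return False
    day, month, year = (int(part) for part in parts)
    if not 1 <= month <= 12:
        return False
    return 1 <= day <= days_in_month(month, year)


text = '30/02/2000, 29/02/2000, 30/02/2100, 29/02/2100, 31/02/2004, 30/02/2004, 29/02/2004'
candidates = text.split(', ')

# Keep only the valid dates; the candidates list itself is never modified.
valid = [date for date in candidates if is_valid_date(date)]
print(valid)
```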
       
#3
This is a good use for datetime.datetime.strptime. Everything you are trying to solve manually is already implemented in the datetime module.

But solving it manually is a good way to learn more about our crazy calendar.
A better regex for your task:

import re


day_month_year = re.compile(r'^(?P<day>\d{2})/(?P<month>\d{2})/(?P<year>\d{4})$')
year_month_day = re.compile(r'^(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})$')
The objects day_month_year and year_month_day are compiled regex objects.
The match objects they return have methods like group(), groups() and groupdict().
You can give the groups names, and groupdict() then returns the captured values by key (as a dict).

Using the compiled regex:

match = day_month_year.search('01/01/2038')
# match is either a re.Match or None, so you have to check it
# if nothing was found, search() returns None.

if match:
    print('Found a match')
    print(match.groupdict())
else:
    print('Nothing found')
The values are still str; they must be converted to int for calculation.
With exception handling you can deal with the case where date_string is not valid.
import re


def get_date(date_string, regex):
    # return (year, month, day) as ints, or raise ValueError if nothing matches
    match = regex.search(date_string)
    if match:
        return int(match['year']), int(match['month']), int(match['day'])
    raise ValueError(f'date_string {date_string} is invalid')


day_month_year_regex = re.compile(r'^(?P<day>\d{2})/(?P<month>\d{2})/(?P<year>\d{4})$')
year_month_day_regex = re.compile(r'^(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})$')


try:
    # unpack in the same order get_date returns the values
    year, month, day = get_date('2038-10-01', year_month_day_regex)
except ValueError as error:
    print(error)
else:
    print(year, month, day)

try:
    year, month, day = get_date('01/10/2038', day_month_year_regex)
except ValueError as error:
    print(error)
else:
    print(year, month, day)
BTW: To test regex, you could visit https://regex101.com/

PS: The regex I used is very strict. White space before or after the date results in an exception.
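
For comparison, a minimal sketch of the datetime.datetime.strptime approach mentioned at the top of this post. strptime raises ValueError for a date that does not exist, so the check reduces to a try/except. The helper name is_valid_with_strptime and the default format string are assumptions made for this example:

```python
from datetime import datetime


# Sketch only: is_valid_with_strptime is a hypothetical helper name.
def is_valid_with_strptime(date_string, fmt='%d/%m/%Y'):
    """Return True if date_string parses as a real calendar date."""
    try:
        datetime.strptime(date_string, fmt)
    except ValueError:
        # raised for a wrong format and for impossible dates such as 30/02
        return False
    return True


for candidate in ('29/02/2000', '29/02/2100', '30/02/2004', '29/02/2004'):
    print(candidate, is_valid_with_strptime(candidate))
```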
#4
Quote: I know that there is the datetime module, but I wanted to do the check on my own.
OP specifically states they don't want to use datetime.