Python Forum
Given a date I need to separate data into weeks
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Given a date I need to separate data into weeks
#11
This is the code that cleans up the data.
It captures the column header 'created' a long with the date values
The column header 'created' prevents me from converting the dates into datetime objects because it is not in the correct format.
I'm trying to separate it out with a simple if else statement but it does not seem to recognize it and it always goes into the else block.
I have tried passing the value with and without striping the it, values.strip()
I've added '/n' to the compare values=='created/n'
I've added spaces, ' created ' to the compare statement.


#############################this function is used to clean specific fields - not the overall document######################
def cleanTheData(ds):
 
    # removes excess characters from sys_created_on values
    cleanedData = list() # clears list
    for values in ds['sys_created_on']:
        values.strip()
        print('Before the if else block', values)
        if values=='created':
            cleanedData.append(re.sub(',+|\\r|\\n|', ' ', values))
            print('in the if statement', values)
        else:
            print('in the else statement', values)
            #cleanedData.append(datetime.strptime(values, '%d/%m/%y %H:%M'))
    ds['sys_created_on'] = cleanedData
 
    return ds
##############################End cleanTheData################################################################
Output:
Before the if else block created in the else statement created Before the if else block 4/4/2017 12:57 in the else statement 4/4/2017 12:57 Before the if else block 10/20/2016 11:28 in the else statement 10/20/2016 11:28 Before the if else block 3/29/2017 11:36 in the else statement 3/29/2017 11:36 Before the if else block 1/14/2017 8:41 in the else statement 1/14/2017 8:41
Reply
#12
Here's another way to clean your data, and separate into list of lists:
(I haven't applied week separation)
You can see how to access the data in the print routine:
(I put the data you showed in post 3 above into a sample test file named
SampleData (which I keep in a sub directory named data) so change the open to your file.
import csv


class SeparateByDate:
    def __init__(self):
        self.raw_data = None
        self.maxw = 0
        self.all_data = []

    def read_data(self):
        with open('data/SampleData.csv') as csvfile:
            self.raw_data = csv.reader(csvfile, delimiter='|')
            for row in self.raw_data:
                self.all_data.append(list(row))

    def get_max_width(self):
        self.maxw = 0
        longest_word = None
        for row in self.all_data:
            longest_word = max(row, key=len)
        self.maxw = len(longest_word)
        print('self.maxw: {}, longest_word: {}'.format(self.maxw, longest_word))


    def print_by_date(self):
        self.get_max_width()
        for n, row in enumerate(self.all_data):
            for item in row:
                print('{0:{1}} '.format(item, self.maxw), end='')
            print('')

def main():
    sd = SeparateByDate()
    sd.read_data()
    sd.print_by_date()

if __name__ == '__main__':
    main()
the output:
self.maxw: 23, longest_word: Most of these are blank
Output:
number                  assigned_to             business_duration       description             priority                sys_created_by          opened_at               u_pending_closed_on     resolved_by             sys_created_on           INC3270235              Most of these are blank 0                       Error messages go here  3                       700000120               5/6/17 3:20             5/6/17 15:20             INC3271723              Most of these are blank 0                       Error messages go here  3                       700000120               5/7/17 9:43             5/7/17 9:43              INC3279067              Most of these are blank 0                       Error messages go here  3                       700000120               5/8/17 9:03             5/8/17 9:03              INC3319147              Most of these are blank 0                       Error messages go here  3                       700000120               5/11/17 1:24            5/11/17 1:24             INC3331818              Most of these are blank 0                       Error messages go here  3                       700000120               5/11/17 3:21            5/11/17 15:21            INC3331966              Most of these are blank 0                       Error messages go here  3                       700000120               5/11/17 3:31            5/11/17 15:31            INC3335862              Most of these are blank 0                       Error messages go here  3                       700000120               5/12/17 6:17            5/12/17 6:17             INC3344128              Most of these are blank 0                       Error messages go here  3                       700000120               5/12/17 3:11            5/12/17 15:11            INC3368733              Most of these are blank 0                       Error messages go here  3                       700000120               5/16/17 1:24            5/16/17 1:24            
Reply
#13
Thank You larz60 I will try your solution
In the mean time can someone explain to me why this might be happening?
First why does
values.strip()
not take off the /n but
cleanedData.append(datetime.strptime(re.sub(',+|\\r|\\n|','', values), '%d/%m/%Y %H:%M'))
does?

Second why is the date 4/4/2017 12:57 in the proper format for the line
cleanedData.append(datetime.strptime(re.sub(',+|\\r|\\n|','', values), '%d/%m/%Y %H:%M'))
but '10/20/2016 11:28' is not?

As you can see from the output I have been able to separate the header 'created' from the dates that follows by adding the \n to the word in the if compare statement. When I tried to strip off the \n and do the compare it did not work, why?

What is different about the dates 4/4/2017 12:57 and 10/20/2016 11:28 that allows the format.strptime to process the first one but complain that the second one is not in the corrrect format?

Output:
Before the if else block created in the if statement created Before the if else block 4/4/2017 12:57 in the else statement 4/4/2017 12:57 Added to dictionary key [' c r e a t e d ', datetime.datetime(2017, 4, 4, 12, 57)] Before the if else block 10/20/2016 11:28 in the else statement 10/20/2016 11:28 Traceback (most recent call last): File "date_filter_error_report.py", line 211, in <module> parsedData = cleanTheData(parsedData) File "date_filter_error_report.py", line 113, in cleanTheData cleanedData.append(datetime.strptime(re.sub(',+|\\r|\\n|','', values), '%d/%m/%Y %H:%M')) File "C:\Python36\lib\_strptime.py", line 565, in _strptime_datetime tt, fraction = _strptime(data_string, format) File "C:\Python36\lib\_strptime.py", line 362, in _strptime (data_string, format)) ValueError: time data '10/20/2016 11:28' does not match format '%d/%m/%Y %H:%M'
def cleanTheData(ds):

    # removes excess characters from sys_created_on values
    cleanedData = list() # clears list
    for values in ds['sys_created_on']:
        print('Before the if else block', values)
        if values=='created\n':
            cleanedData.append(re.sub(',+|\\r|\\n|', ' ', values))
            print('in the if statement', values)
        else:
            print('in the else statement', values)
            cleanedData.append(datetime.strptime(re.sub(',+|\\r|\\n|','', values), '%d/%m/%Y %H:%M'))
            ds['sys_created_on'] = cleanedData
            print('Added to dictionary key', ds['sys_created_on'])

    return ds
Reply
#14
where did the data come from originally?
If a web site, perhaps a scraper would be a better fit.
I saw your pm. Thought the post 3 data was verbatim.
Reply
#15
your not saving values.strip()
you need to do this if your doing it that way...
values = values.strip()
>>> s = 'test\n'
>>> s
'test\n'
>>> s.strip()
'test'
>>> s
'test\n'
>>> s = s.strip()
>>> s
'test'
But i meant to do
cleanedData.append(datetime.strptime(values.strip(), '%d/%m/%Y %H:%M'))
Recommended Tutorials:
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Compare current date on calendar with date format file name Fioravanti 1 123 Mar-26-2024, 08:23 AM
Last Post: Pedroski55
  Python date format changes to date & time 1418 4 518 Jan-20-2024, 04:45 AM
Last Post: 1418
  importing functions from a separate python file in a separate directory Scordomaniac 3 1,330 May-17-2022, 07:49 AM
Last Post: Pedroski55
  Date format and past date check function Turtle 5 4,069 Oct-22-2021, 09:45 PM
Last Post: deanhystad
  How to add previous date infront of every unique customer id's invoice date ur_enegmatic 1 2,191 Feb-06-2021, 10:48 PM
Last Post: eddywinch82
  How to add date and years(integer) to get a date NG0824 4 2,805 Sep-03-2020, 02:25 PM
Last Post: NG0824
  Pulling & Reading Date from UDF that Compare it to Live Data firebird 4 2,706 Jul-20-2019, 09:30 AM
Last Post: snippsat
  Substracting today's date from a date in column of dates to get an integer value firebird 1 2,101 Jul-04-2019, 06:54 PM
Last Post: Axel_Erfurt
  How to change existing date to current date in a filename? shankar455 1 2,273 Apr-17-2019, 01:53 PM
Last Post: snippsat
  Date format conversion "/Date(158889600000)/" lbitten 2 2,790 Nov-29-2018, 02:14 PM
Last Post: Larz60+

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020