Sorting csv or txt file on date and time - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: General Coding Help (https://python-forum.io/forum-8.html) +--- Thread: Sorting csv or txt file on date and time (/thread-17068.html) |
Sorting csv or txt file on date and time - AshBax - Mar-27-2019 Hi, I'm trying to sort a large txt file on 2 columns: date and then time: I've enclosed a small sample of this: 1,2,3,4,5,6,7,8,9,10 20190218,17:12:57,1903,1,0,,,,,20190217 20190218,18:8:34,1903,1,0,,,,,20190217 20190218,0:0:37,1903,1,0,,,,,20190218 20190218,11:46:48,1903,1,0,,,,,20190218 The list above is in the correct order I am looking for... ie sorted first by column 10 (date) and then by column 2 (time) the actual data files are approx 20MB i am struggling with the sorted()method particularly with converting the time from a string to a time. any help would be great. thanks RE: Sorting csv or txt file on date and time - Larz60+ - Mar-27-2019 This is tested with attached csv file I purposely messed up order of input file to test. place file in same directory as script import csv import operator import os def sort_csv(filename, key1, key2): with open(filename) as fp: crdr = csv.reader(fp) # set key1 and key2 to column numbers of keys filedata = sorted(crdr, key=lambda row: (row[key1], row[key2])) return filedata if __name__ == '__main__': # Assure in right directory os.chdir(os.path.abspath(os.path.dirname(__file__))) filename = 'st01_al_cou.csv' # these are column numbers zero based first_key = 2 second_key = 3 mylist = sort_csv(filename, first_key, second_key) for line in mylist: print(line) RE: Sorting csv or txt file on date and time - AshBax - Mar-27-2019 Wow - thank you so much that works perfectly RE: Sorting csv or txt file on date and time - perfringo - Mar-27-2019 Alternative approach, not so good as Larz60+ but still might be usable (more akin to OP-s request: 'method particularly with converting the time from a string to a time'). Takes advantage of built-in datetime module, for demonstration purposes I use list: from datetime import datetime lst = [ '20190218,18:8:34,1903,1,0,,,,,20190217', '20190218,0:0:37,1903,1,0,,,,,20190218', '20190218,11:46:48,1903,1,0,,,,,20190218', '20190218,17:12:57,1903,1,0,,,,,20190217' ] def by_datetime(row): get_datetime = ','.join(row.split(',')[:2]) return datetime.strptime(get_datetime, "%Y%m%d,%H:%M:%S") sorted(lst, key=by_datetime) ['20190218,0:0:37,1903,1,0,,,,,20190218', '20190218,11:46:48,1903,1,0,,,,,20190218', '20190218,17:12:57,1903,1,0,,,,,20190217', '20190218,18:8:34,1903,1,0,,,,,20190217'] RE: Sorting csv or txt file on date and time - AshBax - Mar-27-2019 thanks again - will give that a try. Always nice to have alternatives |