Python Forum
Sorting csv or txt file on date and time
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Sorting csv or txt file on date and time
#1
Hi,

I'm trying to sort a large txt file on 2 columns: date and then time: I've enclosed a small sample of this:

1,2,3,4,5,6,7,8,9,10

20190218,17:12:57,1903,1,0,,,,,20190217
20190218,18:8:34,1903,1,0,,,,,20190217
20190218,0:0:37,1903,1,0,,,,,20190218
20190218,11:46:48,1903,1,0,,,,,20190218

The list above is in the correct order I am looking for... ie sorted first by column 10 (date) and then by column 2 (time)

the actual data files are approx 20MB

i am struggling with the
sorted()
method particularly with converting the time from a string to a time.


any help would be great.
thanks
Reply
#2
This is tested with attached csv file I purposely messed up order of input file to test.
place file in same directory as script
import csv
import operator
import os

def sort_csv(filename, key1, key2):
    with open(filename) as fp:
        crdr = csv.reader(fp)
        # set key1 and key2 to column numbers of keys
        filedata = sorted(crdr, key=lambda row: (row[key1], row[key2]))
    return filedata

if __name__ == '__main__':
    # Assure in right directory
    os.chdir(os.path.abspath(os.path.dirname(__file__)))
    filename = 'st01_al_cou.csv'
    # these are column numbers zero based
    first_key = 2
    second_key = 3
    mylist = sort_csv(filename, first_key, second_key)
    for line in mylist:
        print(line)

Attached Files

.csv   st01_al_cou.csv (Size: 1.88 KB / Downloads: 361)
Reply
#3
Wow - thank you so much that works perfectly
Reply
#4
Alternative approach, not so good as Larz60+ but still might be usable (more akin to OP-s request: 'method particularly with converting the time from a string to a time'). Takes advantage of built-in datetime module, for demonstration purposes I use list:

from datetime import datetime

lst = [
       '20190218,18:8:34,1903,1,0,,,,,20190217',
       '20190218,0:0:37,1903,1,0,,,,,20190218',
       '20190218,11:46:48,1903,1,0,,,,,20190218',
       '20190218,17:12:57,1903,1,0,,,,,20190217'
      ]

def by_datetime(row):
    get_datetime = ','.join(row.split(',')[:2])
    return datetime.strptime(get_datetime, "%Y%m%d,%H:%M:%S")

sorted(lst, key=by_datetime)

['20190218,0:0:37,1903,1,0,,,,,20190218',
 '20190218,11:46:48,1903,1,0,,,,,20190218',
 '20190218,17:12:57,1903,1,0,,,,,20190217',
 '20190218,18:8:34,1903,1,0,,,,,20190217']
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy

Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
Reply
#5
thanks again - will give that a try. Always nice to have alternatives
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Compare current date on calendar with date format file name Fioravanti 1 103 Mar-26-2024, 08:23 AM
Last Post: Pedroski55
  Date Time Series Help...Please spra8560 2 312 Feb-01-2024, 01:38 PM
Last Post: spra8560
  Python date format changes to date & time 1418 4 512 Jan-20-2024, 04:45 AM
Last Post: 1418
  Downloading time zone aware files, getting wrong files(by date))s tester_V 9 956 Jul-23-2023, 08:32 AM
Last Post: deanhystad
  Formatting a date time string read from a csv file DosAtPython 5 1,160 Jun-19-2023, 02:12 PM
Last Post: DosAtPython
  Wait til a date and time KatManDEW 2 1,387 Mar-11-2022, 08:05 PM
Last Post: KatManDEW
  Date format and past date check function Turtle 5 4,064 Oct-22-2021, 09:45 PM
Last Post: deanhystad
  File sorting by user-chosen category Bachelar 0 1,513 Aug-28-2021, 08:14 AM
Last Post: Bachelar
  Check last time file was accessed Pavel_47 4 2,759 Jun-01-2021, 05:47 PM
Last Post: Yoriz
  find the header location in a .bin file without reading the whole file at a time SANJIB 0 2,186 Mar-05-2021, 04:08 PM
Last Post: SANJIB

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020