Python Forum
python sort date
#1
I am new to Python and need to sort the contents of a text file by timestamp in reverse order. Below is the content of the text file (in.txt):

2020/10/31:09:05:01 734691 445750 384860 557946
2020/10/31:15:05:01 734691 366500 315620 554140
2020/10/31:21:05:01 705959 177500 153041 513408

I wrote the code below, but it fails with the error shown after it.

from datetime import datetime

with open('in.txt') as f:
     sorted_lines = sorted([l.rstrip() for l in f.readlines()],
                          key=lambda line: datetime.strptime(line.split(" ")[0], "%Y/%m/%d:%H:%M:%S"))
                            "%Y/%m/%d:%H:%M:%S"),reverse=True)
     for line in sorted_lines:
        print(line)
Error
key=lambda line: datetime.strptime(line.split(" ")[0], "%Y/%m/%d:%H:%M:%S"),reverse=True)
File "/usr/lib64/python2.7/_strptime.py", line 325, in _strptime
(data_string, format))
ValueError: time data '' does not match format '%Y/%m/%d:%H:%M:%S'

I can't determine why the error is happening. Could you please help?
#2
From the error message, it looks like you have empty line(s), probably at the end of the file.
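A minimal sketch of one way to act on that, assuming blank or whitespace-only lines are the only problem in the file. The function name sort_lines_newest_first and the sample list are made up for illustration; the sample is the data from the first post with a trailing blank line added.

```python
from datetime import datetime

DATE_FMT = "%Y/%m/%d:%H:%M:%S"


def sort_lines_newest_first(lines):
    """Return the non-blank lines sorted by leading timestamp, newest first."""
    # Drop lines that are empty or whitespace-only before parsing,
    # so strptime never receives an empty string.
    cleaned = [line.rstrip() for line in lines if line.strip()]
    return sorted(
        cleaned,
        key=lambda line: datetime.strptime(line.split(" ")[0], DATE_FMT),
        reverse=True,
    )


# Data from the first post, plus a trailing blank line (an assumption
# about what the real file looks like).
sample = [
    "2020/10/31:09:05:01 734691 445750 384860 557946\n",
    "2020/10/31:15:05:01 734691 366500 315620 554140\n",
    "2020/10/31:21:05:01 705959 177500 153041 513408\n",
    "\n",
]

for line in sort_lines_newest_first(sample):
    print(line)
```

With a real file you would pass the open file object instead of the list, since iterating over it yields the same kind of lines.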
#3
from datetime import datetime


def parse_datetime(line):
    date_str, _ = line.split(maxsplit=1)
    date_fmt = "%Y/%m/%d:%H:%M:%S"
    return datetime.strptime(date_str, date_fmt)


with open("in.txt") as fd:
    for line in sorted(fd, key=parse_datetime, reverse=True):
        # line is still a str
        print(line.strip())
        # print(line, end="")
This will still fail with corrupt data.
You can catch these errors and return datetime.min, which is the earliest possible date represented by datetime.


def parse_datetime(line):
    date_fmt = "%Y/%m/%d:%H:%M:%S"
    try:
        date_str, _ = line.split(maxsplit=1)
        return datetime.strptime(date_str, date_fmt)
    except ValueError:
        # Invalid format (this also covers blank lines, where the
        # unpacking above fails). Fall back to the minimum value,
        # datetime.min == datetime.datetime(1, 1, 1, 0, 0).
        # Sort keys can't mix types, so every key must be a datetime.
        return datetime.min
  • fd (the file object) is an iterator. Iterating over fd yields the file one line at a time.
  • sorted(fd) without a key would sort all lines of the file in lexicographical order.
  • The key function passed to sorted returns a datetime object, so the lines are ordered by timestamp instead.
  • Sorting requires comparison. Python is strongly typed, so you can't, for example, compare an int with a str. You can work around this with a key function that always returns the same type.

    If you want to put the data into a data structure (e.g. a dict or namedtuple), I would do the parsing first, put the parsed data into a list, and sort the list once everything has been read.
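The "parse first, sort afterwards" idea could look roughly like this. It is only a sketch: Record, parse_record and the field name values are hypothetical, and it assumes the four columns after the timestamp are integers, which the thread does not actually say.

```python
from collections import namedtuple
from datetime import datetime

DATE_FMT = "%Y/%m/%d:%H:%M:%S"

# Hypothetical record type; the meaning of the numeric columns is unknown.
Record = namedtuple("Record", ["timestamp", "values"])


def parse_record(line):
    """Parse one line into a Record, or return None if it is malformed."""
    parts = line.split()
    if not parts:
        return None  # blank line
    try:
        timestamp = datetime.strptime(parts[0], DATE_FMT)
        values = tuple(int(p) for p in parts[1:])
    except ValueError:
        return None  # bad timestamp or non-integer column
    return Record(timestamp, values)


lines = [
    "2020/10/31:09:05:01 734691 445750 384860 557946\n",
    "2020/10/31:15:05:01 734691 366500 315620 554140\n",
    "not-a-timestamp 1 2 3 4\n",
    "2020/10/31:21:05:01 705959 177500 153041 513408\n",
]

# Step 1: parse everything and keep only the valid records.
records = [r for r in (parse_record(line) for line in lines) if r is not None]
# Step 2: sort the finished list, newest first.
records.sort(key=lambda r: r.timestamp, reverse=True)

for record in records:
    print(record.timestamp, record.values)
```

Because every key is a datetime by the time the list is sorted, the comparison problem described in the bullets never comes up.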
#4
I get a syntax error working with your code. Is there a paste error in your post? Why does the date-time pattern appear twice?
#5
Since the dates are nearly in ISO form (YYYY/MM/DD etc.), you can simply sort the lines lexicographically.

sorted_lines = sorted(f, reverse=True)
#6
(Nov-04-2020, 05:23 PM)chrischarley Wrote: Since the dates are nearly in ISO form (YYYY/MM/DD etc.), you can simply sort the lines lexicographically.

sorted_lines = sorted(f, reverse=True)
But do we know for sure it is MM and DD?
#7
(Nov-04-2020, 05:30 PM)deanhystad Wrote:
(Nov-04-2020, 05:23 PM)chrischarley Wrote: Since the dates are nearly in ISO form (YYYY/MM/DD etc.), you can simply sort the lines lexicographically.

sorted_lines = sorted(f, reverse=True)
But do we know for sure it is MM and DD?

That is true, I didn't consider that possibility. But his hours, minutes and seconds were zero-padded when less than 10.
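To make the padding point concrete, here is a small sketch that compares plain string sorting with datetime sorting. It assumes the open question is whether month, day and hour are always zero-padded; the unpadded list is invented for the comparison and the helper name is hypothetical.

```python
from datetime import datetime

DATE_FMT = "%Y/%m/%d:%H:%M:%S"


def lexicographic_matches_chronological(stamps):
    """Return True if plain string sorting gives the same order as datetime sorting."""
    by_text = sorted(stamps)
    by_time = sorted(stamps, key=lambda s: datetime.strptime(s, DATE_FMT))
    return by_text == by_time


# Zero-padded, in the style of the original file.
padded = ["2020/10/31:09:05:01", "2020/10/31:21:05:01", "2020/02/01:15:05:01"]
# Hypothetical unpadded variants of the same three timestamps.
unpadded = ["2020/10/31:9:05:01", "2020/10/31:21:05:01", "2020/2/1:15:05:01"]

print(lexicographic_matches_chronological(padded))    # True
print(lexicographic_matches_chronological(unpadded))  # False
```

So the lexicographic shortcut holds only as long as every field is padded to a fixed width and the fields run from most to least significant.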
#8
Hi

I got the error below using Python 2.7.5. Could that be due to the version?
date_str, _ = line.split(maxsplit=1)
TypeError: split() takes no keyword arguments

So I changed it as below, with no arguments for split, but that didn't help either.
date_str, _ = line.split()
ValueError: too many values to unpack
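For what it's worth, the second error comes from the unpacking: split() with no arguments returns all five fields of a line, and they cannot be unpacked into two names. A sketch of a variant that should run on both Python 2.7 and Python 3 passes maxsplit positionally (str.split accepts sep and maxsplit as positional arguments in both) and indexes the result instead of unpacking it:

```python
from datetime import datetime

DATE_FMT = "%Y/%m/%d:%H:%M:%S"


def parse_datetime(line):
    # Positional arguments: sep=None (split on any whitespace), maxsplit=1.
    # Taking element [0] instead of unpacking means the number of
    # remaining fields no longer matters.
    # Note: a blank line still raises IndexError here, and a bad
    # timestamp raises ValueError; neither is handled in this sketch.
    date_str = line.split(None, 1)[0]
    return datetime.strptime(date_str, DATE_FMT)


print(parse_datetime("2020/10/31:09:05:01 734691 445750 384860 557946\n"))
```

This function can be dropped in as the key for sorted in place of the keyword-argument version.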
#9
(Nov-04-2020, 05:23 PM)deanhystad Wrote: I get a syntax error working with your code. Is there a paste error in your post? Why does the date-time pattern appear twice?


No, I tested it again. It works with Python 3.9 and should work with older versions.
Maybe the comments are confusing the REPL if you use copy and paste.

parse_datetime was defined twice to show:
  1. How to split tasks into smaller, easier tasks, which is better for testing. For example, you can use this function to test each line; no file is needed for testing at all.
  2. How to catch exceptions and handle them by returning a default value for sorting.

To understand the sorting part, try this:

sorted([1, 2, 3, 4, "a"])
Error:
TypeError                                 Traceback (most recent call last)
<ipython-input-4-f654efc1df6a> in <module>
----> 1 sorted([1, 2, 3, 4, "a"])

TypeError: '<' not supported between instances of 'str' and 'int'
Same with datetime objects.
This works because datetime is comparable with datetime:
sorted([datetime.min, datetime.max, datetime(2020,1,1), datetime(1990,1,1)])
But this won't work:
sorted([datetime(2020,1,1), 1])
Error:
TypeError: '<' not supported between instances of 'int' and 'datetime.datetime'
This is why a key-function for sorting should always return the same type.
You could remove lines with the wrong format before you sort.
Or you may have a situation where you still want to keep the badly formatted lines and sort them as well.
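A rough sketch covering both options at once: the well-formed lines are sorted by timestamp and the rest are collected separately, so the caller can decide whether to drop them or append them. The names try_parse and sort_and_partition are hypothetical, as is the sample data with its two bad lines.

```python
from datetime import datetime

DATE_FMT = "%Y/%m/%d:%H:%M:%S"


def try_parse(line):
    """Return the leading timestamp as a datetime, or None if it cannot be parsed."""
    parts = line.split()
    if not parts:
        return None
    try:
        return datetime.strptime(parts[0], DATE_FMT)
    except ValueError:
        return None


def sort_and_partition(lines):
    """Return (valid lines sorted newest first, invalid lines in file order)."""
    valid, invalid = [], []
    for line in lines:
        stamp = try_parse(line)
        text = line.rstrip("\n")
        if stamp is None:
            invalid.append(text)
        else:
            valid.append((stamp, text))
    # Every key is a datetime, so the comparison is always well defined.
    valid.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in valid], invalid


sample = [
    "2020/10/31:09:05:01 734691 445750 384860 557946\n",
    "garbage line\n",
    "2020/10/31:21:05:01 705959 177500 153041 513408\n",
    "\n",
]

good, bad = sort_and_partition(sample)
for line in good + bad:  # keep the bad lines at the end, or print only `good`
    print(line)
```

Unlike the datetime.min approach, this never invents a timestamp for a bad line, at the cost of a second list.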

There is no one universal solution for all.
Keep learning the basics before you switch to pandas.
#10
(Nov-04-2020, 05:52 PM)beginner2020 Wrote: I got the error below using Python 2.7.5. Could that be due to the version?
date_str, _ = line.split(maxsplit=1)
TypeError: split() takes no keyword arguments

I saw it too late.

You should avoid using Python 2.7. It has reached end of life.
Python 3.6 is the oldest version that is still supported.
Python 3.5 has also reached end of life.

Passing maxsplit as a keyword argument has been possible since Python 3.3.

Start your REPL or script with python3.
Plain python without the 3 will start the Python 2.7 interpreter on most distributions.
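If you are not sure which interpreter a script is actually running under, a quick check like the following may help. It is just a sketch: split_accepts_keyword is a made-up helper that tests the feature directly instead of relying on a version number.

```python
import sys


def split_accepts_keyword():
    """Return True if str.split accepts maxsplit as a keyword argument here."""
    try:
        "a b c".split(maxsplit=1)
    except TypeError:
        # This is the error reported above for Python 2.7.
        return False
    return True


# Works on both Python 2 and Python 3.
print("Python %d.%d.%d" % sys.version_info[:3])
print("maxsplit keyword supported: %s" % split_accepts_keyword())
```

Running it once with python and once with python3 shows which command maps to which interpreter on your system.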