Python Forum

Full Version: Filter dataframe by datetime.date column
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
I usually filter for data like this: df[df.somecolumn > 10]

However, I haven't been able to do this for a column I have in type "datetime.date"

If I do: df[df["date"] == "2020-11-30"]) I get nothing returned and no error message.

I have been googling for hours trying to fix this, so any input is welcome.

EDIT - I found a solution

I can do it like this: df[df["date"] == datetime.date(2020,11,30)])

I'm not sure why it can be done without this tweak for datetime64, but that format seems to be easier to use in this regard
You most make sure that date column is convert to pandas date format datetime64[ns].
Example
import pandas as pd

data = {
    "Salary": [1000, 2222, 3321, 4414, 5151],
    "Name": ["Pete", "Steve", "Brian", "Ryan", "Jim"],
    "Share": [29.88, 19.05, 8.17, 7.3, 6.15],
    "Date": [
        "11/24/2020",
        "12/21/2019",
        "10/14/2018",
        "12/13/2017",
        "01/08/2017",
    ],
}

df = pd.DataFrame(data) 
See that Date is object.
>>> df
   Salary   Name  Share        Date
0    1000   Pete  29.88  11/24/2020
1    2222  Steve  19.05  12/21/2019
2    3321  Brian   8.17  10/14/2018
3    4414   Ryan   7.30  12/13/2017
4    5151    Jim   6.15  01/08/2017

>>> df.dtypes
Salary      int64
Name       object
Share     float64
Date       object
dtype: objec

>>> df[df["Date"] == "2018-10-14"]
Empty DataFrame
Columns: [Salary, Name, Share, Date]
Index: []
Fix.
import pandas as pd

data = {
    "Salary": [1000, 2222, 3321, 4414, 5151],
    "Name": ["Pete", "Steve", "Brian", "Ryan", "Jim"],
    "Share": [29.88, 19.05, 8.17, 7.3, 6.15],
    "Date": [
        "11/24/2020",
        "12/21/2019",
        "10/14/2018",
        "12/13/2017",
        "01/08/2017",
    ],
}

df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
print(df[df["Date"] == "2018-10-14"])
Output:
Salary Name Share Date 2 3321 Brian 8.17 2018-10-14
Thanks for the input!

I have a further problem

I have a separate dataframe where the "date" column is in the format: <class 'pandas._libs.tslibs.timestamps.Timestamp'>

I want to filter the dataframe and return all rows where "date" is either a particular date or is in some interval.

If I do something like this I get the error: The truth value of a Series is ambiguous.

 (df["2020-01-01 00:00:00" < df["date"] < "2020-02-02 00:00:00"]) 
I have tried for hours but every google result has some variation I can't fix this by.