Mar-29-2024, 10:32 AM
(This post was last modified: Mar-29-2024, 10:32 AM by deanhystad.)
This gives you the first and last time for each day. Does it solve your problem?
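import pandas as pd
from datetime import datetime, timedelta
from time import time

# Build a frame with one timestamp per second (about 10.8 days of data).
now = datetime.now()
df = pd.DataFrame({"time": [now + timedelta(seconds=x) for x in range(930000)]})

start = time()
df["day"] = df.time.dt.day
# Keep rows whose previous and next rows fall on different days:
# these are the first and last rows of each day.
df2 = df[df.day.shift(1) != df.day.shift(-1)]
print(time() - start)
print(df2)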
Output:
0.0659632682800293
                             time  day
0      2024-03-28 16:26:10.637194   28
27229  2024-03-28 23:59:59.637194   28
27230  2024-03-29 00:00:00.637194   29
113629 2024-03-29 23:59:59.637194   29
113630 2024-03-30 00:00:00.637194   30
200029 2024-03-30 23:59:59.637194   30
200030 2024-03-31 00:00:00.637194   31
286429 2024-03-31 23:59:59.637194   31
286430 2024-04-01 00:00:00.637194    1
372829 2024-04-01 23:59:59.637194    1
372830 2024-04-02 00:00:00.637194    2
459229 2024-04-02 23:59:59.637194    2
459230 2024-04-03 00:00:00.637194    3
545629 2024-04-03 23:59:59.637194    3
Another approach is to extract the day as above, then group the dataframe by day. You could then compute the high, low, mean, open, and close for each day.
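A rough sketch of the groupby idea, in case it helps. Your frame presumably has some measurement column; I don't know its name, so the "value" column and the hourly sample data below are made up for illustration. I also group on the calendar date instead of the day of the month, which is my own substitution so that the same day number in two different months is not lumped together.

```python
import pandas as pd
from datetime import datetime, timedelta

# Hypothetical sample data: one reading per hour for three full days.
# The "value" column is an assumption; the frame above only has "time".
start = datetime(2024, 3, 28)
df = pd.DataFrame(
    {
        "time": [start + timedelta(hours=x) for x in range(72)],
        "value": range(72),
    }
)

# Group by calendar date, then compute one summary row per day.
# "first" and "last" rely on the rows already being sorted by time.
daily = df.groupby(df["time"].dt.date)["value"].agg(
    open="first", high="max", low="min", mean="mean", close="last"
)
print(daily)
```

If your data is not already in time order, sort it by "time" before grouping, otherwise open and close will not be the first and last readings of the day.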