Posts: 48
Threads: 17
Joined: Oct 2019
Nov-15-2019, 07:13 AM
(This post was last modified: Nov-15-2019, 07:14 AM by karlito.)
Hi,
my last question didn't get an answer, so I tried another approach. I wanted to know if it's possible to iterate over a column set as index (a pandas DatetimeIndex), with values like these:
2019-05-02 00:03:00
2019-05-02 00:08:00
2019-05-02 00:13:00
2019-05-02 00:18:00
2019-05-02 00:23:00
2019-05-02 00:28:00
2019-05-02 00:33:00
...
so that during the iteration I can specify that the range from 00:03:00 to 23:59:00 is one day (do something with it), and so on for the following days. I have trouble dealing with date objects in pandas.
Thanks for your help.
Posts: 1,950
Threads: 8
Joined: Jun 2018
Do you want to add a new column based on time (with values like 'day', 'night' or whatever)? Or do you want to iterate and, if 'day time' is found, do something with the row?
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy
Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
Posts: 48
Threads: 17
Joined: Oct 2019
(Nov-15-2019, 09:23 AM)perfringo Wrote: Do you want to add a new column based on time (with values like 'day', 'night' or whatever)? Or do you want to iterate and if 'day time' is found do something with row?
Thanks for replying. No, I don't want to add a new column based on time. What I'm trying to do is iterate over the index column of the df and do some operations on another column of that DataFrame (my DataFrame has 3 columns, and I want to manipulate the df['x'] values corresponding to the day range found, and so on for the other days that are found). Hope it makes sense.
Posts: 1,950
Threads: 8
Joined: Jun 2018
I don't know whether it addresses your problem:
import pandas as pd

data = ['2019-05-02 00:03:00', '2019-05-02 00:08:00', '2019-05-02 00:13:00', '2019-05-02 00:18:00',
        '2019-05-02 00:23:00', '2019-05-02 00:28:00', '2019-05-02 00:33:00']
df = pd.DataFrame({'num': range(len(data))}, index=pd.to_datetime(data))

It will create the following DataFrame with a DatetimeIndex:
Output:
                     num
2019-05-02 00:03:00    0
2019-05-02 00:08:00    1
2019-05-02 00:13:00    2
2019-05-02 00:18:00    3
2019-05-02 00:23:00    4
2019-05-02 00:28:00    5
2019-05-02 00:33:00    6
Now we can set num value based on time:
df.iloc[df.index.indexer_between_time('00:13:00', '23:59:00')] = 20

Which will give:
Output:
                     num
2019-05-02 00:03:00    0
2019-05-02 00:08:00    1
2019-05-02 00:13:00   20
2019-05-02 00:18:00   20
2019-05-02 00:23:00   20
2019-05-02 00:28:00   20
2019-05-02 00:33:00   20
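Since the original question mentions a DataFrame with three columns where only df['x'] should change, here is a minimal sketch of the same time-window selection restricted to one column. The column names 'x', 'y' and 'z' and the sample timestamps are assumptions made up for illustration; they are not from the poster's data.

```python
import pandas as pd

data = ['2019-05-02 00:03:00', '2019-05-02 00:08:00', '2019-05-02 00:13:00',
        '2019-05-02 00:18:00', '2019-05-03 00:23:00']
# Hypothetical three-column frame; 'x', 'y' and 'z' are illustrative names.
df = pd.DataFrame({'x': range(5), 'y': range(5), 'z': range(5)},
                  index=pd.to_datetime(data))

# Integer positions of the rows whose time of day falls inside the window.
rows = df.index.indexer_between_time('00:13:00', '23:59:00')

# Change only column 'x' in those rows; 'y' and 'z' stay untouched.
df.iloc[rows, df.columns.get_loc('x')] = 20

print(df)
```

Note that indexer_between_time looks only at the time of day, so the row dated 2019-05-03 is selected as well. If each calendar day has to be handled separately, the window alone is not enough and the rows also need to be split by date.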
Posts: 2,125
Threads: 11
Joined: May 2017
Nov-15-2019, 07:17 PM
(This post was last modified: Nov-15-2019, 07:24 PM by DeaD_EyE.)
Previously I wrote a different answer and saw too late that the question was already answered.
If you need something that generates datetime intervals, e.g. for business days, you should read the documentation about dateutil.rrule.
Here is an example of how to generate intervals with the dateutil package.
Maybe it's useful together with pandas, but it also works without pandas.
import datetime as dt

# python3 -m pip install python-dateutil --user
# or install it in a venv
# this package is often already installed, because other packages depend on it
from dateutil.rrule import rrule, HOURLY


def hourly(start, interval, count):
    """
    Finite generator for hourly intervals
    count defines how many elements
    """
    for dt_interval in rrule(freq=HOURLY, interval=interval, dtstart=start, count=count):
        yield dt_interval
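The post stops at the generator definition, so here is a short usage sketch of how such a generator could be called. The start date and the 6-hour interval are assumptions picked for illustration, and the function is repeated so the snippet runs on its own.

```python
import datetime as dt

from dateutil.rrule import rrule, HOURLY


def hourly(start, interval, count):
    """Yield `count` datetimes spaced `interval` hours apart, beginning at `start`."""
    yield from rrule(freq=HOURLY, interval=interval, dtstart=start, count=count)


# Hypothetical start time: midnight on the first day of the sample data.
start = dt.datetime(2019, 5, 2, 0, 0)

# Four timestamps, six hours apart.
stamps = list(hourly(start, interval=6, count=4))
for stamp in stamps:
    print(stamp)
```

The resulting datetime objects can be passed to pandas as well, for example as boundaries when slicing a DataFrame that has a DatetimeIndex.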
Posts: 48
Threads: 17
Joined: Oct 2019
(Nov-15-2019, 11:48 AM)perfringo Wrote: I don't know whether it addresses your problem:
Thanks for your time, but it's not really what I want to do.
Posts: 1,950
Threads: 8
Joined: Jun 2018
My understanding has let me down. However, how should I interpret the problem:
Quote: I wanted to know if it's possible to iterate over a column set as index(DateTime /.../ so that during the iteration I can specify that the range from 00:03:00 to 23:59:00 is a day (do something) and so on
Quote: I don't want to add a new column on time but what I'm trying to do, is to iterate on the index column of the df and do some operations in another column of that Dataframe (my Dataframe has 3 columns and I want to manipulate the df['x'] corresponding to that day range found and so on for others days that would be found!)
As I see it, based on the problem statement above:
Dataframe with DateTime as index. Check. Access to rows by time range. Check. Possibility to manipulate values. Check.
Posts: 48
Threads: 17
Joined: Oct 2019
(Nov-18-2019, 08:06 AM)perfringo Wrote: My understanding has let me down. However, how should I interpret the problem:
Hi,
I was trying to do something like this. My bad if I expressed myself poorly!
import numpy as np
import pandas as pd
from datetime import datetime, timedelta

df = pd.DataFrame(columns=['SomeDatetime', 'x', 'y'], index=[0,1,2,3,4,5])
now = datetime.now()
df.loc[0, 'SomeDatetime'] = now + timedelta(minutes = 10)
df.loc[1, 'SomeDatetime'] = now - timedelta(days = 1)
df.loc[2, 'SomeDatetime'] = now + timedelta(minutes = 15)
df.loc[3, 'SomeDatetime'] = now + timedelta(minutes = 20)
df.loc[4, 'SomeDatetime'] = now + timedelta(minutes = 50)
df.loc[5, 'SomeDatetime'] = now - timedelta(days = 30*4) - timedelta(days = 3)
df['x'] = pd.Series(np.random.randn(6))
df['y'] = pd.Series(np.random.randn(6))
df.set_index('SomeDatetime', inplace=True)

print("Dataframe\n")
print(df)

print("\nDay Loop\n")
for date in df.index.to_series().dt.date.unique():
    print(date)
    day_value = df[df.index.to_series().dt.date == date]
    print(day_value)
    print('\n')
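The day loop above can probably also be written with groupby, which splits the rows by calendar day in one step. This is only a sketch under assumptions: the sample timestamps and values are invented, the index is assumed to be a real DatetimeIndex, and subtracting the daily mean is a stand-in for whatever operation is actually needed on df['x'].

```python
import pandas as pd

# Hypothetical sample data spanning two days; column names follow the post above.
index = pd.to_datetime(['2019-05-02 00:03:00', '2019-05-02 12:08:00',
                        '2019-05-03 00:13:00', '2019-05-03 23:18:00'])
df = pd.DataFrame({'x': [1.0, 2.0, 3.0, 4.0], 'y': [10.0, 20.0, 30.0, 40.0]},
                  index=index)

# normalize() sets every timestamp to midnight, so rows of the same day share a key.
days = df.index.normalize()

# Iterate day by day; day_rows is the sub-DataFrame for that day.
for day, day_rows in df.groupby(days):
    print(day.date())
    print(day_rows)

# Example per-day operation on 'x' only: subtract that day's mean.
df['x'] = df['x'] - df.groupby(days)['x'].transform('mean')
print(df)
```

Using transform keeps the result aligned with the original rows, so only column 'x' is overwritten and no extra column is added.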