Python Forum

Full Version: iterate over index and define each range as a day
Hi,
My last question didn't get an answer, so I tried another approach. I'd like to know whether it's possible to iterate over a column set as the index (a pandas DatetimeIndex in this format:
2019-05-02 00:03:00
2019-05-02 00:08:00
2019-05-02 00:13:00
2019-05-02 00:18:00
2019-05-02 00:23:00
2019-05-02 00:28:00
2019-05-02 00:33:00
...
), so that during the iteration I can treat the range from 00:03:00 to 23:59:00 as one day
(do something with it), then move on to the next day, and so on. I have trouble dealing with date objects in pandas.
Thanks for your help.
Do you want to add a new column based on time (with values like 'day', 'night' or whatever)? Or do you want to iterate and, if 'day time' is found, do something with the row?
(Nov-15-2019, 09:23 AM)perfringo Wrote: [ -> ]Do you want to add a new column based on time (with values like 'day', 'night' or whatever)? Or do you want to iterate and if 'day time' is found do something with row?

Thanks for replying. No, I don't want to add a new column based on time. What I'm trying to do is iterate over the index of the DataFrame and do some operations on another column of that DataFrame (my DataFrame has 3 columns, and I want to manipulate the df['x'] values that belong to each day range found, and likewise for every other day that is found). I hope that makes sense.
I don't know whether it addresses your problem:

import pandas as pd

data = ['2019-05-02 00:03:00', '2019-05-02 00:08:00', '2019-05-02 00:13:00', '2019-05-02 00:18:00',
        '2019-05-02 00:23:00', '2019-05-02 00:28:00', '2019-05-02 00:33:00']

df = pd.DataFrame({'num': range(len(data))}, index=pd.to_datetime(data))
It will create the following DataFrame with a DatetimeIndex:

Output:
                     num
2019-05-02 00:03:00    0
2019-05-02 00:08:00    1
2019-05-02 00:13:00    2
2019-05-02 00:18:00    3
2019-05-02 00:23:00    4
2019-05-02 00:28:00    5
2019-05-02 00:33:00    6
Now we can set the num value based on time:

df.iloc[df.index.indexer_between_time('00:13:00', '23:59:00')] = 20
Which will give:

Output:
                     num
2019-05-02 00:03:00    0
2019-05-02 00:08:00    1
2019-05-02 00:13:00   20
2019-05-02 00:18:00   20
2019-05-02 00:23:00   20
2019-05-02 00:28:00   20
2019-05-02 00:33:00   20
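If the DataFrame has several columns and only one of them should change, the same positions can be combined with a column position. This is only a sketch under my own assumptions: the columns 'x' and 'y' and the second date are made up for illustration, they are not from the original data.

```python
import pandas as pd

# Hypothetical sample: two days, two columns ('x' and 'y' are assumed names)
data = ['2019-05-02 00:03:00', '2019-05-02 00:08:00', '2019-05-02 00:13:00',
        '2019-05-03 00:03:00', '2019-05-03 00:18:00']

df = pd.DataFrame({'x': range(len(data)), 'y': range(len(data))},
                  index=pd.to_datetime(data))

# Integer positions of the rows whose time of day lies in the window.
# The window is checked against the time only, so it matches on every date.
rows = df.index.indexer_between_time('00:10:00', '23:59:00')

# Change column 'x' only; column 'y' keeps its values
df.iloc[rows, df.columns.get_loc('x')] = 20

print(df)
```

Note that indexer_between_time ignores the date part, so one call touches the matching rows of all days at once.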
I wrote a different answer earlier and noticed too late that the question had already been answered.
In case you need something that generates datetime intervals, e.g. for business days, here is an example with the dateutil package.
You should read the documentation for dateutil.rrule.

It can be useful together with pandas, but it also works without pandas.


import datetime as dt

# python3 -m pip install python-dateutil --user
# or install it in a venv
# this package is often already installed, because other packages (pandas, for example) depend on it

from dateutil.rrule import rrule, HOURLY


def hourly(start, interval, count):
    """
    Finite generator for hourly intervals
    count defines how many elements
    """
    for dt_interval in rrule(freq=HOURLY, interval=interval, dtstart=start, count=count):
        yield dt_interval
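A possible way to call the generator, as a sketch: the start time, the 6-hour step and the pairing into (begin, end) tuples are my own choices for illustration. The function is repeated so the snippet runs on its own.

```python
import datetime as dt

from dateutil.rrule import rrule, HOURLY


def hourly(start, interval, count):
    """Finite generator of datetimes, `interval` hours apart, `count` items."""
    yield from rrule(freq=HOURLY, interval=interval, dtstart=start, count=count)


# Assumed example values: start at midnight, step 6 hours, 4 timestamps
start = dt.datetime(2019, 5, 2, 0, 0)
stamps = list(hourly(start, interval=6, count=4))

# Pair each timestamp with the next one to get (begin, end) intervals
intervals = list(zip(stamps, stamps[1:]))

for begin, end in intervals:
    print(begin, '->', end)
```

Four timestamps give three intervals, because each interval needs a begin and an end.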
(Nov-15-2019, 11:48 AM)perfringo Wrote: [ -> ]I don't know whether it addresses your problem:

Thanks for your time, but it's not really what I want to do.
My understanding has let me down. However, how should I interpret the problem:

Quote:I wanted to know if it's possible to iterate over a column set as index(DateTime /.../ so that during the iteration I can specify that the range from 00:03:00 to 23:59:00 is a day
(do something) and so on

I don't want to add a new column on time but what I'm trying to do, is to iterate on the index column of the df and do some operations in another column of that Dataframe (my Dataframe has 3 columns and I want to manipulate the df['x'] corresponding to that day range found and so on for others days that would be found!)

As I see it, based on the problem statement above:

Dataframe with DateTime as index. Check. Access to rows by time range. Check. Possibility to manipulate values. Check.
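To make that checklist concrete for a whole day rather than a time window, here is a small sketch using a date string with .loc (partial string indexing on a DatetimeIndex). The sample values and the column names 'x' and 'y' are assumptions on my side, and the index is sorted.

```python
import pandas as pd

# Hypothetical data: two rows on each of two days
idx = pd.to_datetime(['2019-05-02 00:03:00', '2019-05-02 12:00:00',
                      '2019-05-03 00:03:00', '2019-05-03 23:58:00'])
df = pd.DataFrame({'x': [1.0, 2.0, 3.0, 4.0],
                   'y': [10.0, 20.0, 30.0, 40.0]}, index=idx)

# A bare date string selects every row of that calendar day.
# Here only column 'x' of 2019-05-02 is changed.
df.loc['2019-05-02', 'x'] = df.loc['2019-05-02', 'x'] * 100

print(df)
```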
(Nov-18-2019, 08:06 AM)perfringo Wrote: [ -> ]My understanding has let me down. However, how should I interpret the problem:

Hi,

I was trying to do something like this. Sorry if I explained myself badly!

import numpy as np
import pandas as pd
from datetime import datetime, timedelta

df = pd.DataFrame(columns=['SomeDatetime', 'x', 'y'], index=[0,1,2,3,4,5])
now = datetime.now()
df.loc[0, 'SomeDatetime'] = now + timedelta(minutes = 10)
df.loc[1, 'SomeDatetime'] = now - timedelta(days = 1)
df.loc[2, 'SomeDatetime'] = now + timedelta(minutes = 15)
df.loc[3, 'SomeDatetime'] = now + timedelta(minutes = 20)
df.loc[4, 'SomeDatetime'] = now + timedelta(minutes = 50)
df.loc[5, 'SomeDatetime'] = now - timedelta(days = 30*4)  - timedelta(days = 3)

df['x'] = pd.Series(np.random.randn(6))
df['y'] = pd.Series(np.random.randn(6))

df.set_index('SomeDatetime', inplace=True)

print("Dataframe\n")
print(df)
print("\nDay Loop\n")

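# Loop over each distinct calendar date found in the index
for date in df.index.to_series().dt.date.unique():
    print(date)
    # Keep only the rows whose timestamp falls on this date
    day_value = df[df.index.to_series().dt.date == date]
    print(day_value)
    print('\n')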
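
The same per-day loop can probably be written more compactly with groupby. This is a minimal sketch, not the only way to do it: the fixed sample data, the day_sums dict and the 'x_centered' column are made up for illustration.

```python
import pandas as pd

# Hypothetical data with a DatetimeIndex: two rows on each of two days
idx = pd.to_datetime(['2019-05-02 00:03:00', '2019-05-02 12:00:00',
                      '2019-05-03 00:03:00', '2019-05-03 23:58:00'])
df = pd.DataFrame({'x': [1.0, 2.0, 3.0, 4.0],
                   'y': [10.0, 20.0, 30.0, 40.0]}, index=idx)

# normalize() sets every timestamp to midnight, so rows of one day share a key
day_key = df.index.normalize()

# Explicit loop: one sub-DataFrame ("chunk") per day
day_sums = {}
for day, chunk in df.groupby(day_key):
    day_sums[day.date()] = chunk['x'].sum()

# Same grouping without a loop: subtract each day's mean from 'x'
df['x_centered'] = df['x'] - df.groupby(day_key)['x'].transform('mean')

print(day_sums)
print(df)
```

The loop is handy when each day needs custom handling. If the operation is a plain per-day aggregate or transform, the groupby version without the loop is usually shorter.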