Python Forum

Full Version: How To Sum Numbers Of Next 7 Days
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
I would like to sum up the seconds that each employee spoke on the phone over the next 7 days.

ID; Person_nr; Name; Datum; Datum_plus7; Call_As_Sekunde
3469; 1111; Musterman, Muster; 19.01.2020; 25.01.2020; 0
3470; 1111; Musterman, Muster; 20.01.2020; 26.01.2020; 30600
3471; 1111; Musterman, Muster; 21.01.2020; 27.01.2020; 30600
3472; 1111; Musterman, Muster; 22.01.2020; 28.01.2020; 30600
3473; 1111; Musterman, Muster; 23.01.2020; 29.01.2020; 30600
3474; 1111; Musterman, Muster; 24.01.2020; 30.01.2020; 18000
3475; 1111; Musterman, Muster; 25.01.2020; 31.01.2020; 0
3476; 1111; Musterman, Muster; 26.01.2020; 01.02.2020; 0
3477; 1111; Musterman, Muster; 27.01.2020; 02.02.2020; 30600
3478; 1111; Musterman, Muster; 28.01.2020; 03.02.2020; 30600
3479; 1111; Musterman, Muster; 29.01.2020; 04.02.2020; 30600
3480; 1111; Musterman, Muster; 30.01.2020; 05.02.2020; 30600
3481; 1111; Musterman, Muster; 31.01.2020; 06.02.2020; 18000
16698;1111; Musterman, Muster; 01.02.2020; 07.02.2020; 0
16699;1111; Musterman, Muster; 02.02.2020; 08.02.2020; 0
16700;1111; Musterman, Muster; 03.02.2020; 09.02.2020; 30600
16701;1111; Musterman, Muster; 04.02.2020; 10.02.2020; 30600
16702;1111; Musterman, Muster; 05.02.2020; 11.02.2020; 30600
16703;1111; Musterman, Muster; 06.02.2020; 12.02.2020; 30600
16704;1111; Musterman, Muster; 07.02.2020; 13.02.2020; 18000
16705;1111; Musterman, Muster; 08.02.2020; 14.02.2020; 0
16706;1111; Musterman, Muster; 09.02.2020; 15.02.2020; 0
I wrote this code:
i=0     
j=0     
Summe=0       
total_7days=[]
#Current Day in row                                                            
for i in range(0,24): 
      # the next 7 days                                                                                                                                                                              
      for j in range(0,7):
       if (df['Datum'][i+j] <= df['Datum_plus7'][i] and df['Personal_nr'][i+j] == df['Personal_nr'][i]  ):
           Summe +=df['Call_AS_Sekunde'][j]
           print(str(df['Personal_nr'][i])+ '---'+ str(df['Personal_nr'][i+j])+ '---' + df['Datum'][i+j] +'---' +df['Datum_plus7'][i]+'---'+ str(df['call_As_Sekunde'][i+j]))
   total_7days.append(Summe)
   Summe=0
   print(total_7days[i])
Output:
Person_nr Datum Datum_7_Days Call_As_Sekunde ... 1111---1111---25.01.2020---31.01.2020--- 0 1111---1111---26.01.2020---31.01.2020--- 0 1111---1111---27.01.2020---31.01.2020--- 30600 1111---1111---28.01.2020---31.01.2020--- 30600 1111---1111---29.01.2020---31.01.2020--- 30600 1111---1111---30.01.2020---31.01.2020--- 30600 1111---1111---31.01.2020---31.01.2020--- 18000 SUM of last seven days=140400 1111---1111---01.02.2020---01.02.2020--- 0 SUM of last seven days=30600
Error:
The code works correctly until 25.01.2020. From 26.01.2020 on it doesn't work properly because the 7th day is in the next month. I mean from 26.01.2020 to 31.01.2020 there is no calculation, because the next 7 days are in the next month! [b]26.01.2020 to 01.02.2020 ??? 27.01.2020 to 02.02.2020 ??? .. 31.01.2020 to 06.02.2020 ???[/b]
How can I solve this problem?

Thanks a lot for your help Smile

Best regards
It is usually good idea to take advantage of stuff which is built-in into Python for specific reason.

>>> from datetime import datetime, timedelta
>>> my_date = '25.01.2020'
>>> date = datetime.strptime(my_date, '%d.%m.%Y')
>>> date
datetime.datetime(2020, 1, 25, 0, 1)
>>> date + timedelta(7)
datetime.datetime(2020, 2, 1, 0, 1)
If data in tabular form you may look into Pandas.
Thanks a lot for your suggestion. But I have tried with this method. I became Error by datetime.strptime!



(Oct-19-2020, 11:42 AM)perfringo Wrote: [ -> ]It is usually good idea to take advantage of stuff which is built-in into Python for specific reason.

>>> from datetime import datetime, timedelta
>>> my_date = '25.01.2020'
>>> date = datetime.strptime(my_date, '%d.%M.%Y')
>>> date
datetime.datetime(2020, 1, 25, 0, 1)
>>> date + timedelta(7)
datetime.datetime(2020, 2, 1, 0, 1)
If data in tabular form you may look into Pandas.
by executing date = datetime.strptime(df['datum'], '%d.%M.%Y')become this Error:

Error:
TypeError: strptime() argument 1 must be str, not Series
My initial code consisted a typo which I fixed: instead of %M (which is minutes) should be %m (which is month). You can look up formatting codes from documentation: strftime and strptime format codes

This example was for string to datetime, not for dataframe column.
ok. thanks for your answer. do you know what is the problem with my code? How can I solive it?
Assuming that data you provided is in csv file named 'pandas_data_range.csv' then this should work:

- read data to DataFrame using following settings: sep - regex which uses ; as separator and clears all spaces before and after, engine - use Python instead of c (due to regex syntax), header - first row in file, parse_dates - parse values in col 'Datum_plus7' into dates; date_parser - use pd.to_datetime for parsing

- create report which filters 'Datum_plus7' column for specific date range, groups it by 'Name' column and uses 'Call_As_Sekunde' to sum values.

import pandas as pd

df = pd.read_csv('pandas_date_range.csv', sep="\s*[;]\s*", engine='python', header=[0],
                 parse_dates=['Datum_plus7'],
                 date_parser=pd.to_datetime)

report = df[df['Datum_plus7'].isin(pd.date_range('2020-01-25',
                                                 '2020-02-02'))].groupby(by=['Name'])['Call_As_Sekunde'].sum()

print(report)
Output of print should be somathing like:

Output:
Name Musterman, Muster 171000 Name: Call_As_Sekunde, dtype: int64
I have no idea on which dates etc report should be generated so I just hardcoded start and end in date_range.
I should calculate from every day the next 7 days. therfore you donot need to write it as hardcore!

for Example: my idae is:
...
3476; 1111; Musterman, Muster; 26.01.2020; 01.02.2020; 0
3477; 1111; Musterman, Muster; 27.01.2020; 02.02.2020; 30600
3478; 1111; Musterman, Muster; 28.01.2020; 03.02.2020; 30600
3479; 1111; Musterman, Muster; 29.01.2020; 04.02.2020; 30600
3480; 1111; Musterman, Muster; 30.01.2020; 05.02.2020; 30600
3481; 1111; Musterman, Muster; 31.01.2020; 06.02.2020; 18000
16698;1111; Musterman, Muster; 01.02.2020; 07.02.2020; 0
SUM= 140400
I mean, you take the current value from column date and for the next 7 days you sum up the call_as_second together.
For the next row:
3477; 1111; Musterman, Muster; 27.01.2020; 02.02.2020; 30600
3478; 1111; Musterman, Muster; 28.01.2020; 03.02.2020; 30600
3479; 1111; Musterman, Muster; 29.01.2020; 04.02.2020; 30600
3480; 1111; Musterman, Muster; 30.01.2020; 05.02.2020; 30600
3481; 1111; Musterman, Muster; 31.01.2020; 06.02.2020; 18000
16698;1111; Musterman, Muster; 01.02.2020; 07.02.2020; 0
16699;1111; Musterman, Muster; 02.02.2020; 08.02.2020; 0
Sum= 140400
and so on...