Mar-22-2020, 04:25 PM
There is quite a bit of info around summing up time formats in Python, but I'm not able to solve my use case.
I would like to sum up times based on certain criteria in my data frame. My data looks like this, where time is timedelta type:
User: 1 Flag: 0 Time: 04:00:03
User: 1 Flag: 0 Time: 00:25:00
User: 2 Flag: 1 Time: 04:20:00
User: 2 Flag: 1 Time: 01:26:00
User: 3 Flag: 2 Time: 1:00:01
User: 3 Flag: 2 Time: 14:00:02
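For anyone who wants to reproduce this, here is a minimal sketch of how the rows above could be built. The column names `user`, `flag` and `time` are taken from the groupby code in this post; building the frame from strings is an assumption, not necessarily how the real data is loaded.

```python
import pandas as pd

# Sample frame matching the rows listed above (assumed construction).
sample = pd.DataFrame({
    "user": [1, 1, 2, 2, 3, 3],
    "flag": [0, 0, 1, 1, 2, 2],
    "time": ["04:00:03", "00:25:00", "04:20:00",
             "01:26:00", "01:00:01", "14:00:02"],
})

# Convert the HH:MM:SS strings to a timedelta column.
sample["time"] = pd.to_timedelta(sample["time"])

print(sample.dtypes)
```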
The criterion is: sum Time per user when Flag is 0; otherwise just take the max Time for that user.
I want it to look like this:
User: 1 Flag: 0 Time: 04:25:03
User: 1 Flag: 0 Time: 04:25:03
User: 2 Flag: 1 Time: 04:20:00
User: 2 Flag: 1 Time: 04:20:00
User: 3 Flag: 2 Time: 14:00:02
User: 3 Flag: 2 Time: 14:00:02
However, I get an error when I run this code:
groups = sample.groupby('user')['time']
flag = sample.groupby('user')['flag'].transform('max')
sample['time_new'] = np.select([flag.eq(0), flag.isin([1,2])], [groups.transform('sum'), groups.transform('max')])
TypeError: Cannot cast scalar from dtype('<m8[ns]') to dtype('<m8') according to the rule 'same_kind'
The error comes up after using this code to convert the column:
sample.loc[:,'time'] = pd.to_timedelta(sample['time'])
Is there another way to sum time? What am I doing wrong? Thank you very much for any help.
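One possible alternative, offered only as a sketch: stay inside pandas and use `Series.where` instead of `np.select`, so the timedelta values never pass through NumPy's casting. This assumes the column names `user`, `flag` and `time` from the code above and a hand-built copy of the sample rows. It sidesteps the `np.select` call rather than explaining the TypeError itself.

```python
import pandas as pd

# Hand-built copy of the sample rows (assumed construction).
sample = pd.DataFrame({
    "user": [1, 1, 2, 2, 3, 3],
    "flag": [0, 0, 1, 1, 2, 2],
    "time": pd.to_timedelta([
        "04:00:03", "00:25:00", "04:20:00",
        "01:26:00", "01:00:01", "14:00:02",
    ]),
})

groups = sample.groupby("user")["time"]
flag = sample.groupby("user")["flag"].transform("max")

# Keep the per-user sum where flag is 0; fall back to the per-user max elsewhere.
sample["time_new"] = groups.transform("sum").where(flag.eq(0), groups.transform("max"))

print(sample)
```

With the rows above, `time_new` comes out as 04:25:03 for user 1, 04:20:00 for user 2 and 14:00:02 for user 3, which matches the desired output.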