python pandas: diff between 2 dates in a groupby

bluedragon · Mar-25-2020, 04:18 PM

Hi all,

I'm trying to implement this example:

import pandas as pd
import io
df = pd.read_csv(io.StringIO('''transactionid;event;datetime;info
1;START;2017-04-01 00:00:00;
1;END;2017-04-01 00:00:02;foo1
2;START;2017-04-01 00:00:02;
3;START;2017-04-01 00:00:02;
2;END;2017-04-01 00:00:03;foo2
4;START;2017-04-01 00:00:03;
3;END;2017-04-01 00:00:03;foo3
4;END;2017-04-01 00:00:04;foo4'''), sep=';', parse_dates=['datetime'])

df.datetime = pd.to_datetime(df.datetime)

funcs = {
    'datetime':{
        'start_date':   'min',
        'end_date':     'max',
        'duration':     lambda x: x.max() - x.min(),
    },
    'info':             'last'
}

df.groupby(by='transactionid')['datetime','info'].agg(funcs).reset_index()

print(df)

The expected output should be:

Output:
   transactionid           start_date             end_date  duration  info
0              1  2017-04-01 00:00:00  2017-04-01 00:00:02  00:00:02  foo1
1              2  2017-04-01 00:00:02  2017-04-01 00:00:03  00:00:01  foo2
2              3  2017-04-01 00:00:02  2017-04-01 00:00:03  00:00:01  foo3
3              4  2017-04-01 00:00:03  2017-04-01 00:00:04  00:00:01  foo4

I'm using Python 3.7 and I get the following warning and error:

Error:
python.py:24: FutureWarning: Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.
  df.groupby(by='transactionid')['datetime','info'].agg(funcs).reset_index()
Traceback (most recent call last):
  File "python.py", line 24, in <module>
    df.groupby(by='transactionid')['datetime','info'].agg(funcs).reset_index()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/groupby/generic.py", line 928, in aggregate
    result, how = self._aggregate(func, *args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/base.py", line 342, in _aggregate
    raise SpecificationError("nested renamer is not supported")
pandas.core.base.SpecificationError: nested renamer is not supported

Any idea how to solve this issue?

Thanks a lot
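
For reference, one possible way around the error, sketched under the assumption that pandas 0.25 or later is installed: the nested-dict renaming form of agg() was removed in pandas 1.0, and named aggregation (keyword = (column, function)) is the usual replacement. The variable names csv_text and result, and the choice to compute the duration by subtracting the two aggregated columns instead of using a lambda, are illustrative choices and not part of the original example. Selecting columns with a list (or not selecting at all, as here) also avoids the FutureWarning about tuple indexing.

```python
import io

import pandas as pd

csv_text = '''transactionid;event;datetime;info
1;START;2017-04-01 00:00:00;
1;END;2017-04-01 00:00:02;foo1
2;START;2017-04-01 00:00:02;
3;START;2017-04-01 00:00:02;
2;END;2017-04-01 00:00:03;foo2
4;START;2017-04-01 00:00:03;
3;END;2017-04-01 00:00:03;foo3
4;END;2017-04-01 00:00:04;foo4'''

# parse_dates already converts the column, so no extra pd.to_datetime call is needed
df = pd.read_csv(io.StringIO(csv_text), sep=';', parse_dates=['datetime'])

# Named aggregation: output_column=(input_column, aggregation)
result = (
    df.groupby('transactionid')
      .agg(
          start_date=('datetime', 'min'),
          end_date=('datetime', 'max'),
          info=('info', 'last'),  # 'last' picks the last non-null value per group
      )
      .reset_index()
)

# Duration as the difference of the two aggregated timestamps (a Timedelta column)
result['duration'] = result['end_date'] - result['start_date']

# Reorder columns to match the expected output
result = result[['transactionid', 'start_date', 'end_date', 'duration', 'info']]

# Print the aggregated frame, not the original df
print(result)
```

Note that the original script printed df, which is the unaggregated input; the aggregated frame has to be assigned to a variable (result here) before printing.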
