Mar-07-2017, 07:15 PM
Yes, there are functions to do it "more directly". Common way is to convert given dataframe to a "narrow" format, where column names become one variable, after that do some transformation and convert it back to a "wide" format, where content of one (or more) column gives new column names. It has some similarity with Excel pivot tables.
There is
Unfortunately even with those functions it could look rather ugly:
There is
pandas.melt()
, that can be used to convert dataframe from wide to narrow format, and .pivot()
or.unstack()
can be used to convert narrow to wide (there are other functions too).Unfortunately even with those functions it could look rather ugly:
Output:In [214]: data = {"state":["New York", "California"], "region":["New York", "Los Angeles"], "2001-01":[123,345], "2001-02":[343,132], "2001-03":[63,423], "2001-04":[393,42]}
In [215]: df = pd.DataFrame(data, columns=["state", "region", "2001-01", "2001-02", "2001-03", "2001-04"])
In [216]: df
Out[216]:
state region 2001-01 2001-02 2001-03 2001-04
0 New York New York 123 343 63 393
1 California Los Angeles 345 132 423 42
In [217]: wide = df.drop(['state', 'region'], axis=1).reset_index()
...: melted = pd.melt(wide, id_vars='index')
...: melted.variable = melted.variable.apply(lambda x : "{}q{}".format(x[:4], (int(x[5:]) - 1) // 3 + 1))
...: grouped = melted.groupby(['index','variable']).sum().unstack()
...: grouped.columns = grouped.columns.get_level_values(1)
...: df[['state', 'region']].join(grouped)
...:
Out[217]:
state region 2001q1 2001q2
0 New York New York 529 393
1 California Los Angeles 900 42
In reality it is not so bad:- cut part to modify
- convert it to narrow
- aggregate
- convert to wide
- join with remaining part