Thread: slice per group (/thread-19903.html), Python Forum (https://python-forum.io), Forum: Data Science
slice per group - Progressive - Jul-19-2019

Hi, I would like to extract the first 50 data points of each group (each level of a factor column) in a data frame. So far, I have stumbled over:

    grouped = df.groupby('factor').first()

which extracts the first data point of each group (also when it is not a time format as stated in the documentation),

    grouped = df.groupby('factor').nth(n)

which extracts the nth data point of each group, so a single row instead of a list, and

    grouped = df.groupby('factor').apply(lambda x: x.loc[0:50])

which does extract the first 50 rows, but only for the first group instead of for all groups.

Can someone please shed some light on this? Thank you!

I got it. You have to use ".iloc" instead of ".loc":

    grouped = df.groupby('factor').apply(lambda x: x.iloc[0:50])

RE: slice per group - scidam - Jul-19-2019

You need to decide where these groups will be stored: in a list, or do you want to concatenate them into a new data frame?

    import numpy as np
    import pandas as pd

    # generate sample data: 1000 rows, 5 groups
    df = pd.DataFrame({'factor': np.random.choice(range(5), 1000),
                       'value': np.random.rand(1000)})

    # groups: one data frame per group, holding that group's first 50 rows
    dfs = [df.loc[v[:50]] for g, v in df.groupby('factor').groups.items()]

RE: slice per group - Progressive - Jul-19-2019

They are stored in the grouped dataframe? What do g and v do in

    dfs = [df.loc[v[:50]] for g, v in df.groupby('factor').groups.items()]

?

RE: slice per group - scidam - Jul-20-2019

(Jul-19-2019, 01:50 PM)Progressive Wrote: They are stored in the grouped dataframe?

dfs is a list of data frames, one per group, each of length 50. g is the group name: g = 0, 1, 2, 3, 4.
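
To make the roles of g and v in the list comprehension concrete, here is a minimal sketch under stated assumptions. The seed, the sample sizes, and the names `groups` and `first50` are chosen only for illustration and do not come from the thread. It also shows `groupby(...).head(50)`, an alternative not mentioned in the thread, which keeps the first 50 rows of every group in a single data frame.

```python
import numpy as np
import pandas as pd

# Sample data in the spirit of the thread (seed and sizes are assumptions).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "factor": rng.integers(0, 5, size=1000),
    "value": rng.random(1000),
})

# .groups maps each group name to the Index of row labels in that group.
groups = df.groupby("factor").groups

# g is the group name; v is the Index of row labels belonging to group g,
# so v[:50] selects the labels of the first 50 rows of that group.
dfs = {g: df.loc[v[:50]] for g, v in groups.items()}

# Alternative (not from the thread): head(50) returns the first 50 rows of
# every group as one data frame, in the original row order.
first50 = df.groupby("factor").head(50)
```

Keeping `dfs` as a dictionary keyed by g (rather than a list) is only a convenience for looking up one group by name; `list(dfs.values())` gives the list form used in the thread.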