Python Forum
slice per group



slice per group - Progressive - Jul-19-2019

Hi,

I would like to extract the first 50 data points of each group (defined by the column 'factor') in a data frame.

So far, I have come across:

grouped = df.groupby('factor').first()
which extracts the first data point of each group (this also works when the data is not in a time format, unlike what the documentation states)

grouped = df.groupby('factor').nth(n)
which extracts the nth data point of each group, so a single row instead of a range

grouped = df.groupby('factor').apply(lambda x: x.loc[0:50])
which does extract the first 50 rows, but only for the first group instead of for all groups.

Can someone please shed some light on this?
Thank you!

I got it. You have to use ".iloc" instead of ".loc", because ".loc" slices by index label while ".iloc" slices by position:

grouped = df.groupby('factor').apply(lambda x: x.iloc[0:50])
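For reference, a minimal sketch of the difference on a single group. The sample data, the seed, and the names group0, by_label, by_position and first_50 are assumptions made up for illustration. It also shows GroupBy.head(50), which should select the same rows as the .iloc version without needing apply.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 1000 rows, 5 groups, default integer index 0..999
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'factor': rng.integers(0, 5, size=1000),
    'value': rng.random(1000),
})

# Look at one group in isolation
group0 = df[df['factor'] == 0]

# .loc slices by index label: only rows whose label lies between 0 and 50
by_label = group0.loc[0:50]

# .iloc slices by position: the first 50 rows of the group, whatever their labels
by_position = group0.iloc[0:50]

# First 50 rows of every group at once, kept in the original row order
first_50 = df.groupby('factor').head(50)
```

With a default integer index, by_label usually holds far fewer than 50 rows, because only a fraction of the labels 0 to 50 belong to any one group.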



RE: slice per group - scidam - Jul-19-2019

You need to decide how these groups should be stored: in a list, or concatenated into a new data frame?

import numpy as np
import pandas as pd

# generate sample data: 1000 rows spread over 5 groups
df = pd.DataFrame({'factor': np.random.choice(range(5), 1000), 'value': np.random.rand(1000)})

# one data frame per group, holding the first 50 rows of that group
dfs = [df.loc[v[:50]] for g, v in df.groupby('factor').groups.items()]
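If a single data frame is preferred over a list, the pieces can be stacked with pd.concat. This is only a sketch under the same kind of made-up sample data; the seed and the names labels and combined are assumptions, not part of the code above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 1000 rows spread over 5 groups
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'factor': rng.integers(0, 5, size=1000),
    'value': rng.random(1000),
})

# One data frame per group, each holding that group's first 50 rows
dfs = [df.loc[labels[:50]] for name, labels in df.groupby('factor').groups.items()]

# Stack the per-group pieces into one data frame, group after group
combined = pd.concat(dfs)
```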



RE: slice per group - Progressive - Jul-19-2019

They are stored in the grouped dataframe?

What do g and v do in
dfs = [df.loc[v[:50]] for g, v in df.groupby('factor').groups.items()]
?


RE: slice per group - scidam - Jul-20-2019

(Jul-19-2019, 01:50 PM)Progressive Wrote: They are stored in the grouped dataframe?

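dfs is a list of data frames, one per group, each holding the first 50 rows of that group. g is the group name (g = 0, 1, 2, 3, 4), and v is the index of row labels that belong to that group, so v[:50] picks the first 50 of them.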
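To see this concretely, here is a small sketch on a tiny, made-up data frame (the values and the cut-off of 2 rows instead of 50 are assumptions chosen to keep the output short):

```python
import pandas as pd

# Tiny hypothetical example: two groups, 'a' and 'b'
df = pd.DataFrame({
    'factor': ['a', 'b', 'a', 'b', 'a'],
    'value': [10, 20, 30, 40, 50],
})

# .groups maps each group name to the Index of row labels in that group
groups = df.groupby('factor').groups
for g, v in groups.items():
    print(g, v.tolist())  # prints "a [0, 2, 4]" and then "b [1, 3]"

# Keep the first 2 rows of each group (2 instead of 50 for the tiny sample)
dfs = [df.loc[v[:2]] for g, v in groups.items()]
```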