 slice per group

I would like to extract the first 50 data points of each group factor in a data frame.

So far, I stumbled over:

grouped = df.groupby('factor').first()
which extracts the first data point (also when it is not a time format as stated in the documentation)

grouped = df.groupby('factor').nth(n)
which extracts the nth data point, so a single one instead of a list

grouped = df.groupby('factor').apply(lambda x: x.loc[0:50])
which does extract the first 50 rows, but only for the first group instead of for all groups.

Can someone please shed some light on this?
Thank you!

I got it. You have to use ".iloc" instead of ".loc"

grouped = df.groupby('factor').apply(lambda x: x.iloc[0:50])
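
A minimal sketch of why the label-based slice misbehaves while the positional one works. It assumes a default integer index and rows sorted by factor; the sample frame and the names by_label, by_position and via_head are made up for illustration. GroupBy.head(50) is not mentioned in this thread, but it is a built-in pandas method and is shown as an alternative that avoids apply.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: 3 groups of 100 rows each, sorted by factor,
# with the default integer index 0..299.
df = pd.DataFrame({
    'factor': np.repeat([0, 1, 2], 100),
    'value': np.arange(300),
})

# .loc slices by index label (both ends inclusive), so only rows
# labelled 0..50 match; here those all sit in the first group.
by_label = df.groupby('factor', group_keys=False).apply(lambda x: x.loc[0:50])

# .iloc slices by position inside each group, so every group
# contributes its own first 50 rows.
by_position = df.groupby('factor', group_keys=False).apply(lambda x: x.iloc[0:50])

# GroupBy.head(n) returns the first n rows of each group without apply.
via_head = df.groupby('factor').head(50)

print(len(by_label), len(by_position), len(via_head))
```

With this sample, by_label only holds rows from the first group, while by_position and via_head hold 50 rows from each of the three groups.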
You need to decide where these groups will be stored: in a list, or do you want to concatenate them into a new data frame?

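import numpy as np
import pandas as pd
# generate sample data: 1000 rows, 5 groups (the random generators are assumed)
df = pd.DataFrame({'factor': np.random.randint(0, 5, 1000), 'value': np.random.rand(1000)})

# keep the first 50 row labels of each group and look those rows up
dfs = [df.loc[v[:50]] for g, v in df.groupby('factor').groups.items()]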
They are stored in the grouped dataframe?

What do g and v do in
dfs = [df.loc[v[:50]] for g, v in df.groupby('factor').groups.items()]
(Jul-19-2019, 01:50 PM)Progressive Wrote: They are stored in the grouped dataframe?

dfs is a list of data frames, one per group, each holding the first 50 rows of that group. g is the group name (g = 0, 1, 2, 3, 4), and v is the index of row labels that belong to that group.
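
To make g and v concrete, here is a small sketch under assumed sample data (1000 random rows over 5 groups; the names rng, groups and first_50 are made up for illustration). It prints each group name with a few of its row labels, builds the list, and then concatenates the pieces into one data frame for the case where a single frame is wanted.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 1000 rows spread over 5 groups (0..4).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'factor': rng.integers(0, 5, size=1000),
    'value': rng.random(1000),
})

# .groups maps each group name (g) to the Index of row labels (v)
# that belong to that group.
groups = df.groupby('factor').groups

for g, v in groups.items():
    # Show the group name, its size, and its first three row labels.
    print(g, len(v), list(v[:3]))

# v[:50] keeps the first 50 labels; df.loc[...] looks those rows up.
dfs = [df.loc[v[:50]] for g, v in groups.items()]

# Optionally glue the per-group pieces back into a single data frame.
first_50 = pd.concat(dfs)
```

Since g is never used inside the list comprehension, iterating over groups.values() would work just as well there.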
