 slice per group
#1
Hi,

I would like to extract the first 50 data points of each group (grouped by a factor column) in a data frame.

So far, I stumbled over:

grouped = df.groupby('factor').first()
which extracts the first data point (also when it is not a time format as stated in the documentation)

grouped = df.groupby('factor').nth(n)
which extracts the nth data point of each group, so a single one instead of a list

grouped = df.groupby('factor').apply(lambda x: x.loc[0:50])
which does extract the first 50 rows, but only for the first group instead of for all groups.

Can someone please shed some light on this?
Thank you!

I got it. You have to use ".iloc" instead of ".loc": ".loc" slices by index label, while ".iloc" slices by position within each group.

grouped = df.groupby('factor').apply(lambda x: x.iloc[0:50])
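
For anyone reading this later, here is a minimal sketch of why ".iloc" works where ".loc" does not, and of the built-in shortcut groupby(...).head(n). The sample data, the seed, and the variable names are assumptions made up for illustration; only the 'factor' column name comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 1000 rows spread over 5 groups.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "factor": rng.integers(0, 5, size=1000),
    "value": rng.random(1000),
})

# Look at one group: its index labels are scattered over 0..999.
one_group = df[df["factor"] == 0]
by_label = one_group.loc[0:50]      # rows whose index LABEL lies between 0 and 50
by_position = one_group.iloc[0:50]  # the first 50 rows of the group by POSITION

# Positional slice applied to every group, then glued back together.
via_concat = pd.concat(g.iloc[:50] for _, g in df.groupby("factor"))

# Built-in shortcut: first 50 rows of each group, in the original row order.
via_head = df.groupby("factor").head(50)
```

Both via_concat and via_head should contain the same rows; head(50) keeps the original row order, while the concatenated version is ordered group by group.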
#2
You need to decide where these groups will be stored: do you want them in a list, or do you want to concatenate them into a new data frame?

import numpy as np
import pandas as pd

# generate sample data: 1000 rows spread over 5 groups
df = pd.DataFrame({'factor': np.random.choice(range(5), 1000), 'value': np.random.rand(1000)})

# one data frame per group, holding that group's first 50 rows
dfs = [df.loc[v[:50]] for g, v in df.groupby('factor').groups.items()]
#3
They are stored in the grouped dataframe?

What do g and v do in
dfs = [df.loc[v[:50]] for g, v in df.groupby('factor').groups.items()]
?
#4
(Jul-19-2019, 01:50 PM)Progressive Wrote: They are stored in the grouped dataframe?

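dfs is a list of data frames, one per group, each holding that group's first 50 rows. g is the group name (here g = 0, 1, 2, 3, 4) and v is the index of row labels that belong to that group, so v[:50] picks the first 50 labels and df.loc[...] looks those rows up.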
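
A small sketch of what the loop sees, in case it helps. The sample data and seed are made up for illustration, and a dict is used here instead of a list only so that each result can be looked up by its group name.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 1000 rows spread over 5 groups.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "factor": rng.integers(0, 5, size=1000),
    "value": rng.random(1000),
})

# .groups maps each group name to the Index of row labels in that group.
groups = df.groupby("factor").groups

dfs = {}
for g, v in groups.items():
    # g: the group name (a value of the 'factor' column)
    # v: pandas Index holding the row labels of that group
    dfs[g] = df.loc[v[:50]]  # look up the first 50 labels of the group

# If a single data frame is preferred over separate ones:
combined = pd.concat(dfs.values())
```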