Python Forum
[SOLVED on SO] Downsizing non-representative data in DataFrame
#1
Hi, folks,
I occasionally dabble in pandas, but I cannot claim deep knowledge. Today I had to filter out some rows from a DataFrame based on the number of occurrences of a value in a certain column, as in this example:
Output:
In [57]: table = pd.DataFrame([[2, 'a'], [3, 'b'], [2, 'c'], [4, 'd'], [4, 'e'], [5, 'f']],
    ...:                      columns=('group', 'letter'))
    ...: print(table)
   group letter
0      2      a
1      3      b
2      2      c
3      4      d
4      4      e
5      5      f
I want to remove all rows where a value in the group column appears only once.

I hacked around the problem with this inelegant solution:
Output:
In [58]: pd.concat(df for _, df in table.groupby(by=['group']) if len(df) > 1)
Out[58]:
   group letter
0      2      a
2      2      c
3      4      d
4      4      e
But I bet there are proper ways to achieve the same goal.

Can anyone suggest a more pandaic solution?

Thanks in advance
Test everything in a Python shell (iPython, Azure Notebook, etc.)
  • Someone gave you advice you liked? Test it - maybe the advice was actually bad.
  • Someone gave you advice you think is bad? Test it before arguing - maybe it was good.
  • You posted a claim that something you did not test works? Be prepared to eat your hat.
#2
If anyone is interested - with trepidation, I posted this question on SO (those in the know will understand my reluctance), and got an answer without being hassled, for a whole half an hour and counting.
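
The SO answer itself is not quoted in this thread, so the following is only a sketch of three common pandas idioms for this kind of filtering, not necessarily what was posted there. It reuses the `table` frame from post #1; the names `by_duplicated`, `group_sizes`, `by_transform` and `by_filter` are made up for illustration.

```python
import pandas as pd

table = pd.DataFrame(
    [[2, 'a'], [3, 'b'], [2, 'c'], [4, 'd'], [4, 'e'], [5, 'f']],
    columns=['group', 'letter'],
)

# Option 1: keep=False marks every occurrence of a repeated value as a
# duplicate, so the mask is True exactly where 'group' appears more than once.
by_duplicated = table[table['group'].duplicated(keep=False)]

# Option 2: broadcast each group's size back onto its rows, then mask.
# Handy when the threshold is something other than "more than once".
group_sizes = table.groupby('group')['group'].transform('size')
by_transform = table[group_sizes > 1]

# Option 3: drop whole groups that fail a predicate. Readable, but the
# lambda is called once per group, which can be slow with many groups.
by_filter = table.groupby('group').filter(lambda g: len(g) > 1)

print(by_duplicated)
```

All three keep the rows with index 0, 2, 3 and 4, the same result as the `pd.concat` workaround in post #1. In keeping with the signature above: test them on your own data before trusting any of them.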