Python Forum
Grouping in pandas/multi-index data frame
#1
I am trying to group a data frame loaded from a CSV file. It is a long data frame with multiple countries under the column COUNTRY and the corresponding party names under the column PTYNAME.



Output:
COUNTRY  CTRYID  YEAR  PTYNAME
Finland     7.0  2017  Centre Party
Finland     7.0  2017  Finns Party
Finland     7.0  2017  National Coalition Party
Finland     7.0  2017  Social Democratic Party of Finland
Finland     7.0  2017  Green League
What I'd like to do is create a multi-index data frame where the different party names are shown under one country name, something like this:

Output:
COUNTRY  PTYNAME
         Centre Party
         Finns Party
Finland  National Coalition Party
         Social Democratic Party of Finland
         Green League
I used the method below:

df1 = df.groupby(['COUNTRY'])['PTYNAME'].sum()

But as a result, all the party names get packed against each other in a single row.

I was wondering if anyone has any ideas. Let me know if I need to clarify anything.
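
A likely reason for the packing described above: on a text column, sum() adds the strings together, which is plain concatenation with no separator. The sketch below reproduces this on a small hypothetical sample (the data frame and the names packed and as_lists are made up for illustration, and the columns are forced to object dtype as an assumption about how the CSV was read). Collecting the names into a list is shown as one possible alternative that keeps them separate.

```python
import pandas as pd

# Hypothetical sample mirroring a few rows from the question.
df = pd.DataFrame(
    {
        "COUNTRY": ["Finland", "Finland", "Finland"],
        "PTYNAME": ["Centre Party", "Finns Party", "Green League"],
    },
    dtype=object,  # assumption: text columns are stored as plain Python strings
)

# sum() on strings is repeated "+", so the names are glued together.
packed = df.groupby("COUNTRY")["PTYNAME"].sum()
print(packed["Finland"])

# One alternative: keep the names separate by collecting them into a list.
as_lists = df.groupby("COUNTRY")["PTYNAME"].agg(list)
print(as_lists["Finland"])
```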
#2
You can do something like this:
import pandas as pd

df = pd.read_csv("parties.csv")
# Count rows per (COUNTRY, PTYNAME) pair; the result is a Series with a two-level index.
df = df.groupby(['COUNTRY', 'PTYNAME'])['YEAR'].count()
print(df)
And get something like this:
Output:
COUNTRY  PTYNAME
Finland  Finns Party                           1
         National Coalition Party              1
         Social Democratic Party of Finland    1
Norway   Centre Party                          1
         Green League                          1
But this is no longer a dataframe; it is a multi-indexed series.
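
If a data frame is needed again, the grouped series can be converted back. The sketch below uses a small hypothetical sample instead of the CSV file, and the column name N_ROWS is made up for illustration. It shows two options: to_frame() keeps the two-level index, while reset_index() turns the index levels back into ordinary columns.

```python
import pandas as pd

# Hypothetical sample standing in for the CSV file.
df = pd.DataFrame(
    {
        "COUNTRY": ["Finland", "Finland", "Norway"],
        "PTYNAME": ["Finns Party", "National Coalition Party", "Centre Party"],
        "YEAR": [2017, 2017, 2017],
    }
)

# Same grouping as above: a Series indexed by (COUNTRY, PTYNAME).
counts = df.groupby(["COUNTRY", "PTYNAME"])["YEAR"].count()

# Option 1: a one-column DataFrame that keeps the MultiIndex.
as_frame = counts.to_frame(name="N_ROWS")

# Option 2: a flat DataFrame with COUNTRY and PTYNAME as regular columns.
flat = counts.reset_index(name="N_ROWS")

print(as_frame)
print(flat)
```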
#3
(Jan-05-2024, 06:43 AM)deanhystad Wrote: You can do something like this: [...] But this is no longer a dataframe; it is a multi-indexed series.

Thanks so much. I'm pretty new to Python. I am going to use the output with another data frame, matching on the COUNTRY columns, to do certain analyses afterwards. Based on what you said, will a multi-indexed series be limited in terms of manipulation (say, selecting columns or rows)? Anyway, thank you again for your answer. I got exactly what I needed, after spending more than a week trying to figure it out <3
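
On the selection question, a multi-indexed series can still be sliced with .loc, and it can be flattened before matching on COUNTRY. The sketch below is only an illustration under assumptions: the sample rows, the second data frame other, and the column names N_ROWS and ISO2 are all hypothetical.

```python
import pandas as pd

# Hypothetical sample standing in for the party data.
parties = pd.DataFrame(
    {
        "COUNTRY": ["Finland", "Finland", "Norway"],
        "PTYNAME": ["Finns Party", "National Coalition Party", "Centre Party"],
    }
)
counts = parties.groupby(["COUNTRY", "PTYNAME"]).size()

# Select by the outer level: a Series indexed by PTYNAME only.
finland_parties = counts.loc["Finland"]

# Select a single entry with a full (COUNTRY, PTYNAME) key.
one_count = counts.loc[("Finland", "Finns Party")]

# Hypothetical second data frame keyed by COUNTRY.
other = pd.DataFrame({"COUNTRY": ["Finland", "Norway"], "ISO2": ["FI", "NO"]})

# Flatten the index into columns, then match rows on COUNTRY.
merged = counts.reset_index(name="N_ROWS").merge(other, on="COUNTRY", how="left")
print(merged)
```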
#4
I think you want to do multi-indexing instead of grouping.
import pandas as pd

df = pd.read_csv("registers.csv")
df = df.set_index(['COUNTRY', 'PTYNAME'])
print(df)
Output:
                                            CTRYID  YEAR
COUNTRY PTYNAME
Norway  Centre Party                           8.0  2017
Finland Finns Party                            7.0  2017
        National Coalition Party               7.0  2017
        Social Democratic Party of Finland     7.0  2017
Norway  Green League                           8.0  2017
For a better look, sort the data before making the index.
import pandas as pd

df = pd.read_csv("registers.csv").sort_values(by=["COUNTRY", "PTYNAME"])
df = df.set_index(['COUNTRY', 'PTYNAME'])
print(df)
Output:
                                            CTRYID  YEAR
COUNTRY PTYNAME
Finland Finns Party                            7.0  2017
        National Coalition Party               7.0  2017
        Social Democratic Party of Finland     7.0  2017
Norway  Centre Party                           8.0  2017
        Green League                           8.0  2017
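
Once the data frame has the two-level index, rows can be selected by either level. The sketch below rebuilds the sample shown in the output above in memory instead of reading the CSV file (the variable names raw, indexed, finland and green are made up for illustration). It sorts with sort_index() after setting the index, which is one alternative to sorting the columns first.

```python
import pandas as pd

# Sample values copied from the output above, in the unsorted order.
raw = pd.DataFrame(
    {
        "COUNTRY": ["Norway", "Finland", "Finland", "Finland", "Norway"],
        "CTRYID": [8.0, 7.0, 7.0, 7.0, 8.0],
        "YEAR": [2017, 2017, 2017, 2017, 2017],
        "PTYNAME": [
            "Centre Party",
            "Finns Party",
            "National Coalition Party",
            "Social Democratic Party of Finland",
            "Green League",
        ],
    }
)

# Build the two-level index, then sort by it so each country is grouped together.
indexed = raw.set_index(["COUNTRY", "PTYNAME"]).sort_index()

# All rows for one country: a DataFrame indexed by PTYNAME.
finland = indexed.loc["Finland"]

# All rows for one party name, selected on the inner level.
green = indexed.xs("Green League", level="PTYNAME")

print(finland)
print(green)
```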