Python Forum
Simple pandas dataframe question


Simple pandas dataframe question - popohoma - Dec-30-2018

Hi all,

I am new to pandas and have a simple question for all of you; an answer would make my day-to-day presentations more intuitive.

I am trying to merge repeated key values into a single cell (please see the picture below).
[Image: Screen-Shot-2018-12-30-at-9-59-18-PM.png]

This is similar to the Excel function "Merge and Centre", where values in several cells are combined into one single cell. I know how to do this when concatenating multiple DataFrames, but in my case there is only a single DataFrame.

Do you have any clues?

Thanks,

Allen


RE: Simple pandas dataframe question - ashlardev - Jan-03-2019

The code below should do what you are after.

To explain it: I created three Series and stored them in a dict called before_data.

I then used before_data as the data for a DataFrame called before.

I grouped before by Country and Index and assigned the result to after. A groupby needs an aggregation, so I used .sum(), but since every Country/Index pair occurs only once, nothing is actually summed and the Shares values stay as they are.

I then sorted by the Shares column, because groupby orders the keys alphabetically and the sort puts the rows back in the same order as the original DataFrame.
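
As a quick check of the .sum() point before the full listing, here is a minimal sketch on a three-row frame. The names df, has_repeats and grouped are only for illustration; the rows are a subset of the data used in the listing. It assumes, as in the question, that no Country/Index pair repeats.

```python
import pandas as pd

# Three rows taken from the thread's data, kept short for the check.
df = pd.DataFrame({
    'Country': ['Australia', 'Australia', 'Japan'],
    'Index': ['ASX 200', 'MSCI Australia', 'N225'],
    'Shares': [10, 20, 30],
})

# True if any (Country, Index) pair occurs more than once.
has_repeats = df.duplicated(subset=['Country', 'Index']).any()

# With one row per group, sum() hands back the original Shares values.
grouped = df.groupby(['Country', 'Index']).sum()

print(has_repeats)
print(grouped)
```

If has_repeats were True, .sum() would really add the duplicate rows together, which may or may not be what you want.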

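import pandas as pd

# Three columns built as Series; all of them share the index 0-8.
before_data = {'Country' : pd.Series(['Australia','Australia', 'Japan', 'Japan','Japan','Japan', 'Hong Kong', 'Hong Kong', 'Hong Kong'], index=range(9)),
               'Index' : pd.Series(["ASX 200","MSCI Australia","N225","Topix","Mother","MSCI Japan", "HSI", "HSCEI", "MSCI Hong Kong"]),
               'Shares' : pd.Series([10,20,30,40,50,60,70,80,90])}

before = pd.DataFrame(before_data)
print(before)

# Each (Country, Index) pair is unique, so sum() leaves Shares unchanged;
# the grouping only moves the two key columns into a MultiIndex.
after = before.groupby(['Country','Index']).sum()

# groupby sorts the keys alphabetically; sorting by Shares restores the original row order.
after = after.sort_values(by=['Shares'])
print(after)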
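
A possible alternative, not part of the original reply: since nothing needs to be aggregated, set_index can move the two key columns into a MultiIndex directly. This is only a sketch under the assumption that the goal is the "merged cell" look when printing. With pandas' default display settings, a repeated outer label (here the country) is shown once per group, and the original row order is kept, so no sort is needed.

```python
import pandas as pd

# Same data as in the listing above, written as plain lists.
before = pd.DataFrame({
    'Country': ['Australia', 'Australia', 'Japan', 'Japan', 'Japan', 'Japan',
                'Hong Kong', 'Hong Kong', 'Hong Kong'],
    'Index': ['ASX 200', 'MSCI Australia', 'N225', 'Topix', 'Mother',
              'MSCI Japan', 'HSI', 'HSCEI', 'MSCI Hong Kong'],
    'Shares': [10, 20, 30, 40, 50, 60, 70, 80, 90],
})

# Move both key columns into a MultiIndex; no aggregation, row order is kept.
after = before.set_index(['Country', 'Index'])
print(after)
```

The groupby version is still the one to use if a Country/Index pair can occur more than once and the shares should be added up.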