Python Forum
Matrix Search and Calculation Pandas
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Matrix Search and Calculation Pandas
#1
Dear all,

I have problems with a matrix search and calculation. This is my problem.
I have a df1 with different year values for each indicator and country.
I have a df2 with combinations of countries and yearly values.
The desired output is df3 where the product of the indicators fo the countires per year is created and calcualted.

These are the Dataframe:

df1 =pd.DataFrame({'Country':['Armenia','Azerbaidjan','Belarus','Armenia','Azerbaidjan','Belarus'],\
             'Indictaor':['G','G','G','H', 'H', 'H'],'2005':[3,4,5,6,7,4],'2006':[6,3,1,3,5,6]})
 	

[python]
df2 = pd.DataFrame({'Year':[2005,2006,2005,2006],
                    'Country1':['Armenia','Armenia','Azerbaidjan','Azerbaidjan'],
                    'Country2': ['Belarus','Belarus','Belarus','Belarus']})
df3 = pd.DataFrame({'Year':[2005,2006,2005,2006],                   
                    'Country2': ['Belarus','Belarus','Belarus','Belarus'],
                    'Country1':['Armenia','Armenia','Azerbaidjan','Azerbaidjan'],
                     'IndictaorGProduct':[15,6,35,5],
                      'IndictaorHProduct':[24,18,28,30]})
I am extremly thankful if someone could help with this problem as this is really confusing.

Thank you.

Kidn regards
Reply
#2
Got it started, the challenge left to you is getting the cell values for the respective countries, G or H, and year and multiplying putting the entries into the last columns in df4, the table that eventually should look like df3.
import pandas as pd
df1 =pd.DataFrame({'Country':['Armenia','Azerbaidjan','Belarus','Armenia','Azerbaidjan','Belarus'],\
             'Indictaor':['G','G','G','H', 'H', 'H'],'2005':[3,4,5,6,7,4],'2006':[6,3,1,3,5,6]})
print(df1)
df2 = pd.DataFrame({'Year':[2005,2006,2005,2006],
                    'Country1':['Armenia','Armenia','Azerbaidjan','Azerbaidjan'],
                    'Country2': ['Belarus','Belarus','Belarus','Belarus']})
print(df2)
df3 = pd.DataFrame({'Year':[2005,2006,2005,2006],                   
                    'Country2': ['Belarus','Belarus','Belarus','Belarus'],
                    'Country1':['Armenia','Armenia','Azerbaidjan','Azerbaidjan'],
                     'IndictaorGProduct':[15,6,35,5],
                      'IndictaorHProduct':[24,18,28,30]})
print(df3)
cols = ['Year','Country2','Country1','GProduct','HProduct']
df4 = pd.DataFrame(columns=cols)
df4['Year'] = df2['Year']
df4['Country1'] = df2['Country1']
df4['Country2'] = df2['Country2']
for row in df4 :
    #Get the G and H parameters for the countries and years and multiply
    pass

print(df4)
Output:
Country Indictaor 2005 2006 0 Armenia G 3 6 1 Azerbaidjan G 4 3 2 Belarus G 5 1 3 Armenia H 6 3 4 Azerbaidjan H 7 5 5 Belarus H 4 6 Year Country1 Country2 0 2005 Armenia Belarus 1 2006 Armenia Belarus 2 2005 Azerbaidjan Belarus 3 2006 Azerbaidjan Belarus Year Country2 Country1 IndictaorGProduct IndictaorHProduct 0 2005 Belarus Armenia 15 24 1 2006 Belarus Armenia 6 18 2 2005 Belarus Azerbaidjan 35 28 3 2006 Belarus Azerbaidjan 5 30 Year Country2 Country1 GProduct HProduct 0 2005 Belarus Armenia NaN NaN 1 2006 Belarus Armenia NaN NaN 2 2005 Belarus Azerbaidjan NaN NaN 3 2006 Belarus Azerbaidjan NaN NaN
Reply
#3
Thank you. Creating the columns is the easy part. I can do that also. The calcualtion is the difficult part.
The main Problem is that tha columns heading can not be identified i get an key error.
Reply
#4
There are probably (definitely) better ways to do this, but this works to do the G product anyway. I leave the H to you - should be able to copy the G code in the for loop to do similar for the H data.
import pandas as pd
df1 =pd.DataFrame({'Country':['Armenia','Azerbaidjan','Belarus','Armenia','Azerbaidjan','Belarus'],\
             'Indictaor':['G','G','G','H', 'H', 'H'],'2005':[3,4,5,6,7,4],'2006':[6,3,1,3,5,6]})
df2 = pd.DataFrame({'Year':[2005,2006,2005,2006],
                    'Country1':['Armenia','Armenia','Azerbaidjan','Azerbaidjan'],
                    'Country2': ['Belarus','Belarus','Belarus','Belarus']})
df3 = pd.DataFrame({'Year':[2005,2006,2005,2006],                   
                    'Country2': ['Belarus','Belarus','Belarus','Belarus'],
                    'Country1':['Armenia','Armenia','Azerbaidjan','Azerbaidjan'],
                     'IndictaorGProduct':[15,6,35,5],
                      'IndictaorHProduct':[24,18,28,30]})
cols = ['Year','Country2','Country1','GProduct','HProduct']
df4 = pd.DataFrame(columns=cols)
df4['Year'] = df2['Year']
df4['Country1'] = df2['Country1']
df4['Country2'] = df2['Country2']
gprod = []
for row in df4.iterrows() :
    c1 = row[1][2]
    c2 = row[1][1]
    yr = str(row[1][0])
    g1 = df1.loc[(df1['Country']==c1)&(df1['Indictaor']=='G')]
    g1val = g1[yr].values[0]
    g2 = df1.loc[(df1['Country']==c2)&(df1['Indictaor']=='G')]
    g2val = g2[yr].values[0]
    print(g1val, g2val, g1val*g2val)
    gprod.append(g1val*g2val)
df4['GProduct'] = gprod
print(df4)
Output:
3 5 15 6 1 6 4 5 20 3 1 3 Year Country2 Country1 GProduct HProduct 0 2005 Belarus Armenia 15 NaN 1 2006 Belarus Armenia 6 NaN 2 2005 Belarus Azerbaidjan 20 NaN 3 2006 Belarus Azerbaidjan 3 NaN
Reply
#5
This is excellent stuff. Thank you very much for this help. I had another solution but was still struggling with the date /time finding in the heading. I am no trying that all Indicators (I have many) a new column is added automatically. Any idea on how to make that that dynamic?
Reply
#6
In lines 22 and 24 above the indicator is compared to 'G'. Make that a variable, and you should be able to repeat that process for any indicator.
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  How to search for specific string in Pandas dataframe Coding_Jam 1 2,415 Nov-02-2020, 09:35 AM
Last Post: PsyPy
  Trying to Pass date to pandas search from input prompt curranjohn46 1 2,073 Oct-10-2019, 10:01 AM
Last Post: curranjohn46
  Calculation using group by and pandas sarabi1005 0 2,148 Aug-29-2017, 08:13 PM
Last Post: sarabi1005

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020