![]() |
Matrix Search and Calculation Pandas - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: Data Science (https://python-forum.io/forum-44.html) +--- Thread: Matrix Search and Calculation Pandas (/thread-23866.html) |
Matrix Search and Calculation Pandas - luxlambo227 - Jan-21-2020 Dear all, I have problems with a matrix search and calculation. This is my problem. I have a df1 with different year values for each indicator and country. I have a df2 with combinations of countries and yearly values. The desired output is df3 where the product of the indicators fo the countires per year is created and calcualted. These are the Dataframe: df1 =pd.DataFrame({'Country':['Armenia','Azerbaidjan','Belarus','Armenia','Azerbaidjan','Belarus'],\ 'Indictaor':['G','G','G','H', 'H', 'H'],'2005':[3,4,5,6,7,4],'2006':[6,3,1,3,5,6]}) [python] df2 = pd.DataFrame({'Year':[2005,2006,2005,2006], 'Country1':['Armenia','Armenia','Azerbaidjan','Azerbaidjan'], 'Country2': ['Belarus','Belarus','Belarus','Belarus']}) df3 = pd.DataFrame({'Year':[2005,2006,2005,2006], 'Country2': ['Belarus','Belarus','Belarus','Belarus'], 'Country1':['Armenia','Armenia','Azerbaidjan','Azerbaidjan'], 'IndictaorGProduct':[15,6,35,5], 'IndictaorHProduct':[24,18,28,30]})I am extremly thankful if someone could help with this problem as this is really confusing. Thank you. Kidn regards RE: Matrix Search and Calculation Pandas - jefsummers - Jan-21-2020 Got it started, the challenge left to you is getting the cell values for the respective countries, G or H, and year and multiplying putting the entries into the last columns in df4, the table that eventually should look like df3. import pandas as pd df1 =pd.DataFrame({'Country':['Armenia','Azerbaidjan','Belarus','Armenia','Azerbaidjan','Belarus'],\ 'Indictaor':['G','G','G','H', 'H', 'H'],'2005':[3,4,5,6,7,4],'2006':[6,3,1,3,5,6]}) print(df1) df2 = pd.DataFrame({'Year':[2005,2006,2005,2006], 'Country1':['Armenia','Armenia','Azerbaidjan','Azerbaidjan'], 'Country2': ['Belarus','Belarus','Belarus','Belarus']}) print(df2) df3 = pd.DataFrame({'Year':[2005,2006,2005,2006], 'Country2': ['Belarus','Belarus','Belarus','Belarus'], 'Country1':['Armenia','Armenia','Azerbaidjan','Azerbaidjan'], 'IndictaorGProduct':[15,6,35,5], 'IndictaorHProduct':[24,18,28,30]}) print(df3) cols = ['Year','Country2','Country1','GProduct','HProduct'] df4 = pd.DataFrame(columns=cols) df4['Year'] = df2['Year'] df4['Country1'] = df2['Country1'] df4['Country2'] = df2['Country2'] for row in df4 : #Get the G and H parameters for the countries and years and multiply pass print(df4)
RE: Matrix Search and Calculation Pandas - luxlambo227 - Jan-22-2020 Thank you. Creating the columns is the easy part. I can do that also. The calcualtion is the difficult part. The main Problem is that tha columns heading can not be identified i get an key error. RE: Matrix Search and Calculation Pandas - jefsummers - Jan-22-2020 There are probably (definitely) better ways to do this, but this works to do the G product anyway. I leave the H to you - should be able to copy the G code in the for loop to do similar for the H data. import pandas as pd df1 =pd.DataFrame({'Country':['Armenia','Azerbaidjan','Belarus','Armenia','Azerbaidjan','Belarus'],\ 'Indictaor':['G','G','G','H', 'H', 'H'],'2005':[3,4,5,6,7,4],'2006':[6,3,1,3,5,6]}) df2 = pd.DataFrame({'Year':[2005,2006,2005,2006], 'Country1':['Armenia','Armenia','Azerbaidjan','Azerbaidjan'], 'Country2': ['Belarus','Belarus','Belarus','Belarus']}) df3 = pd.DataFrame({'Year':[2005,2006,2005,2006], 'Country2': ['Belarus','Belarus','Belarus','Belarus'], 'Country1':['Armenia','Armenia','Azerbaidjan','Azerbaidjan'], 'IndictaorGProduct':[15,6,35,5], 'IndictaorHProduct':[24,18,28,30]}) cols = ['Year','Country2','Country1','GProduct','HProduct'] df4 = pd.DataFrame(columns=cols) df4['Year'] = df2['Year'] df4['Country1'] = df2['Country1'] df4['Country2'] = df2['Country2'] gprod = [] for row in df4.iterrows() : c1 = row[1][2] c2 = row[1][1] yr = str(row[1][0]) g1 = df1.loc[(df1['Country']==c1)&(df1['Indictaor']=='G')] g1val = g1[yr].values[0] g2 = df1.loc[(df1['Country']==c2)&(df1['Indictaor']=='G')] g2val = g2[yr].values[0] print(g1val, g2val, g1val*g2val) gprod.append(g1val*g2val) df4['GProduct'] = gprod print(df4)
RE: Matrix Search and Calculation Pandas - luxlambo227 - Jan-27-2020 This is excellent stuff. Thank you very much for this help. I had another solution but was still struggling with the date /time finding in the heading. I am no trying that all Indicators (I have many) a new column is added automatically. Any idea on how to make that that dynamic? RE: Matrix Search and Calculation Pandas - jefsummers - Jan-27-2020 In lines 22 and 24 above the indicator is compared to 'G'. Make that a variable, and you should be able to repeat that process for any indicator. |