Sum product multiple Dataframes based on column headers. - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: Data Science (https://python-forum.io/forum-44.html) +--- Thread: Sum product multiple Dataframes based on column headers. (/thread-18534.html) |
Sum product multiple Dataframes based on column headers. - Lastwizzle - May-21-2019 I have a data frame with inputs where each group represents multiple inputs values that correspond to that groups respective premium ( script for test data below) Each value in that row is a key to a table which contains factors for multiple perils (e.g fire, Water Weather). What I want to do is create a premium column which is the sum product of each perils factors. Example output in premium_output_df. input_df is an example of what my start data looks like albeit much smaller and with less column fields. amount_of_insurance_df is a representation of the table which contains the peril factors with each groups corresponding Amount of Insurance value. territory df is a representation of the table which contains the peril territory factors. To run through an example, the Premium for group 1 will be the sum product of the perils. (0.1*0.94)+(0.2*0.3)=0.154=Total Premium group1 To give you an idea of the process I used to collect the data. Each column in input_df represents a table that gets merged to come collect the factors for each peril, please see below for an example of a merge being done for the actual ( not raw) data set. amount_of_insurance=pd.merge(data_for_rater["Enter Amount of Insurance"],df['Amount of Insurance'],how='left',left_on='Enter Amount of Insurance' , right_on='Amount Of Insurance') My original thought was to concat each side by side, but it was proving to be ineffective due to the sheet amount of tables ( 47). Thoughts on a better way to proceed? Test Data input_df=pd.DataFrame(input_data) amount_of_insurance_df=pd.DataFrame(data_amount_of_insurance) territory_df=pd.DataFrame(data_territory) premium_output_data={'Group':[1,2,3,4,5],'Amount of Insurance':[28000,28000,28000,28000,28000],'Territory':[701,702,703,704,705],'Fire Premium':[0.094,0.19,0.288,0.388,0.49],'Water Weather Premium':[0.06,0.12,0.2,0.3,0.42],'Total Premium':[0.154,0.31,0.488,0.688,0.91]} premium_output_df=pd.DataFrame(premium_output_data) |