The Excel you posted was what I needed!
myApp() below will get what you want, but it is not flexible, because, for Banking , the first step uses SUM($D$2:$D$3)/SUM($E$2:$E$3) 2 times. What if, in a real table you have many rows with Banking?
If Level_3 = Financial_Services we have: SUM($D$4:$D$6)/SUM($E$4:$E$6), 3 times.
Its hard to see the repetitive nature in a large table. Do you have a bigger Excel?
As it stands, this is only useful for these data. But then, I am no expert with Pandas!
I think you could shorten this by using .group_by and you don't really need columns Step1 and Step2, that is just for testing.
import pandas as pd
def myApp():
cols = ['Security', 'CompRating', 'Level_3', 'Level_2', 'parent_weight', 'child_weight']
# revised data
datastring = """A A1 Banking Financial 4.486 3.522
B BBB3 Banking Financial 0.445 0.559
C A1 Financial_Services Financial 0.821 0.994
D A3 Financial_Services Financial 2.744 3.445
E A2 Financial_Services Financial 1.919 NULL
F A1 Insurance Financial 0.393 NULL"""
lists = [s.split() for s in datastring.split('\n')]
df = pd.DataFrame(lists, columns=cols)
# set the number of columns to show
pd.set_option('display.max_columns', 10)
# add a column with a default value of 0 or set all values to 0 if df['Totals'] exists
df['Step1'] = 0
df['Step2'] = 0
df['Totals'] = 0
# the numbers are strings, make them floats
df['parent_weight'] = pd.to_numeric(df.parent_weight, errors='coerce')
df['child_weight'] = pd.to_numeric(df.child_weight, errors='coerce')
for index in df.index:
print('Index is:', index, 'Level_3 is', df['Level_3'][index])
# step 1 make sums for Banking
if df['Level_3'][index] == 'Banking' and pd.notna(df['child_weight'][index]):
sumparent = df['parent_weight'][0:2].sum()
sumchild = df['child_weight'][0:2].sum()
step1 = df['child_weight'][index] * (sumparent/sumchild)
print(f'Level_3 is Banking, Step1 is {step1}')
df.at[index, 'Step1'] = step1
elif df['Level_3'][index] == 'Financial_Services' and pd.notna(df['child_weight'][index]):
sumparent = df['parent_weight'][2:5].sum()
sumchild = df['child_weight'][2:5].sum()
step1 = df['child_weight'][index] * (sumparent/sumchild)
print(f'Level_3 is Financial_Services, Step1 is {step1}')
df.at[index, 'Step1'] = step1
# step 2 case Insurance
elif df['Level_3'][index] == 'Insurance':
sumchild = df['child_weight'].sum()
for jindex in df.index:
if pd.notna(df['child_weight'][jindex]):
step2 = df['child_weight'][jindex] * (df['parent_weight'][5]/sumchild)
print(f'Level_3 is Insurance, Step2 is {step2}')
df.at[jindex, 'Step2'] = step2
for index in df.index:
total = df['Step1'][index] + df['Step2'][index]
df.at[index, 'Totals'] = total
print(f'Sum of parent_weight = {df["parent_weight"].sum()}')
print(f'Sum of Totals = {df["Totals"].sum()}')
myapp() gives this df:
Output:
Security CompRating Level_3 Level_2 parent_weight \
0 A A1 Banking Financial 4.486
1 B BBB3 Banking Financial 0.445
2 C A1 Financial_Services Financial 0.821
3 D A3 Financial_Services Financial 2.744
4 E A2 Financial_Services Financial 1.919
5 F A1 Insurance Financial 0.393
child_weight Step1 Step2 Totals
0 3.522 4.255570 0.162458 4.418029
1 0.559 0.675430 0.025785 0.701215
2 0.994 1.228001 0.045850 1.273851
3 3.445 4.255999 0.158907 4.414906
4 NaN NaN 0.000000 NaN
5 NaN 0.000000 NaN NaN