Python Forum

Full Version: pandas Dataframe as "confidence table" for matchmaking?
Hi,

I wonder if I am on the right track and would like to get your input on my problem:

Goal:
Find a link between two rows in two tables based on a number of criteria. 

Approach:
I want to work with a "match score" or "confidence level" to determine, based on all my match criteria, which row in table 2 is most likely related to a given row in table 1.
To keep track of the "match score", I figured a dataframe with the unique row identifiers of table 1 as index and those of table 2 as columns would let me apply all my match criteria and keep updating the corresponding "match score" in the dataframe.

Question:

The problem I am having is that my updates to the dataframe are not being saved.

I made a simple example to test this. In the example below I select the cell holding the "match score" that needs to be updated and update the score, but on the next match the score is again updated from the original value of 0, giving me an end result of 10 instead of my desired 50.

import pandas as pd
import numpy as np

table_1 = ('s1','s2','s3','s4','s5')
table_2 = ('i1','i2','i3','i4','i5')


df = pd.DataFrame(index = table_1, columns = table_2)
df = df.fillna(0)

for s in table_1:
    df2= df.loc['s3','i4'] =+ 10

print(df)
Output:
    i1  i2  i3  i4  i5
s1   0   0   0   0   0
s2   0   0   0   0   0
s3   0   0   0  10   0
s4   0   0   0   0   0
s5   0   0   0   0   0
Do you know how I can save my change to the dataframe?

Also, if you have any other conceptual suggestions on how to approach my goal, I would be happy to hear them.
I found an answer to the question I posted:

for s in table_1:
    df1= df.loc['s3','i4']
    df2 = df.set_value('s3', 'i4', df1 +10, takeable=False)
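A note on why the first loop ended at 10: `=+ 10` is not an augmented assignment. Python reads it as `= +10`, so every iteration simply assigns the constant 10 to the cell. The operator that adds to the existing value is `+=`. Also, `DataFrame.set_value` was deprecated in pandas 0.21 and removed in pandas 1.0, so the snippet above only runs on older versions. Below is a minimal sketch of the same update using the `.at` accessor, assuming a reasonably recent pandas; it reuses the names from the example and is only one possible way to write it.

```python
import pandas as pd

table_1 = ('s1', 's2', 's3', 's4', 's5')
table_2 = ('i1', 'i2', 'i3', 'i4', 'i5')

# Create the score table already filled with integer zeros,
# instead of creating it empty and calling fillna(0) afterwards.
df = pd.DataFrame(0, index=list(table_1), columns=list(table_2))

for s in table_1:
    # "+=" adds 10 to the stored value on every pass;
    # "=+ 10" would assign the constant +10 each time.
    df.at['s3', 'i4'] += 10

print(df.at['s3', 'i4'])  # 50 after five iterations
```

`.at` reads and writes a single cell by label, which is all this update needs; `df.loc['s3', 'i4'] += 10` should behave the same way here.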
If anybody has suggestions regarding the approach I am taking, I would be happy to hear them.
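On the approach itself, one possible direction is to fill the score table one criterion at a time for all row pairs at once, rather than updating single cells in a loop. The sketch below is only an illustration under assumptions: the tables `t1` and `t2`, the columns `name` and `city`, and the weights are all made up, and each criterion is a plain equality check. A real matcher would likely need fuzzier comparisons.

```python
import pandas as pd

# Hypothetical example tables; column names and values are assumptions.
t1 = pd.DataFrame({'name': ['ann', 'bob', 'cy'],
                   'city': ['Oslo', 'Rome', 'Bern']},
                  index=['s1', 's2', 's3'])
t2 = pd.DataFrame({'name': ['bob', 'ann', 'cy'],
                   'city': ['Rome', 'Oslo', 'Kyiv']},
                  index=['i1', 'i2', 'i3'])

# Hypothetical weight per criterion: points awarded when the values agree.
weights = {'name': 30, 'city': 20}

# Score table: rows are table 1 identifiers, columns are table 2 identifiers.
scores = pd.DataFrame(0, index=t1.index, columns=t2.index)

for col, weight in weights.items():
    # Broadcasting compares every t1 value with every t2 value,
    # producing a boolean grid of shape (len(t1), len(t2)).
    matches = t1[col].to_numpy()[:, None] == t2[col].to_numpy()[None, :]
    scores = scores + matches * weight

# For each row of table 1, the table 2 identifier with the highest score.
best_match = scores.idxmax(axis=1)

print(scores)
print(best_match)
```

Note that the full grid has len(t1) × len(t2) cells, so for large tables it may be worth narrowing the candidate pairs first (for example with a merge on one exact key) before scoring the rest.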