Jul-13-2018, 01:38 AM
I tried this on a small test dataset. I am wondering why a particular choice word (i.e. "mango 2") is not shown in the fuzzy match result. It may even be due to an error in my for-loops. Below is my code.
import pandas as pd
from pandas import DataFrame
from fuzzywuzzy import fuzz
from fuzzywuzzy import process

# Search key
key = pd.DataFrame(data=['Apple A', 'apple', 'AP', 'Mango', 'mango'], columns=['Key'])

# List of string choices to compare against search key
choices = pd.DataFrame(data=['Apple 1', 'apple', 'mango 2', 'Mango', 'mgo'], columns=['Choice'])

if __name__ == "__main__":
    ## Create lookup dictionary by parsing the choices
    data = {}
    for row in choices.loc[:, 'Choice']:
        data[row[0]] = row

    ## For each row in the lookup compute the partial ratio
    Match_Results1 = pd.DataFrame({'score': [], 'row': [], 'found': []})
    for row in key.loc[:, 'Key']:
        for found, score, matchrow in process.extract(row, data, scorer=fuzz.token_set_ratio, limit=100):
            if score >= 0:
                print('%d%% partial match: "%s" with "%s" ' % (score, row, found))
                Match_Results1 = Match_Results1.append({'score': score, 'row': row, 'found': found}, ignore_index=True)
    print(Match_Results1)

The result I'm getting is as follows; it does not include the "mango 2" choice word.
      found      row  score
0     apple  Apple A  100.0
1   Apple 1  Apple A   86.0
2     Mango  Apple A   17.0
3       mgo  Apple A    0.0
4     apple    apple  100.0
5   Apple 1    apple  100.0
6     Mango    apple   20.0
7       mgo    apple    0.0
8     apple       AP   57.0
9   Apple 1       AP   44.0
10    Mango       AP   29.0
11      mgo       AP    0.0
12    Mango    Mango  100.0
13      mgo    Mango   75.0
14    apple    Mango   20.0
15  Apple 1    Mango   17.0
16    Mango    mango  100.0
17      mgo    mango   75.0
18    apple    mango   20.0
19  Apple 1    mango   17.0
I'd appreciate it if someone can help.
Thanks
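One possible cause, offered as a guess rather than a confirmed diagnosis: the lookup dictionary is keyed by `row[0]`, which is only the first character of each choice. "mango 2" and "mgo" both start with "m", so the later one may be overwriting the earlier one. The sketch below reproduces just the dictionary step in plain Python; `by_first_char` and `by_index` are hypothetical names used for illustration, and keying by position is an assumption about what was intended.

```python
choices = ['Apple 1', 'apple', 'mango 2', 'Mango', 'mgo']

# Same construction as in the post: the key is only the first character,
# so two choices with the same first letter collide.
by_first_char = {}
for row in choices:
    by_first_char[row[0]] = row

# Assumed alternative: key each choice by its position so nothing is overwritten.
by_index = dict(enumerate(choices))

print(by_first_char)  # only 4 entries; 'mango 2' has been replaced by 'mgo'
print(by_index)       # all 5 choices are kept
```

If this is indeed the cause, passing the plain list of choices (or a dictionary with unique keys such as `by_index`) to `process.extract` should bring "mango 2" back into the results.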