Result is doubtful - fuzzywuzzy process.extract method - klllmmm - Jul-13-2018

I tried this on a small test dataset. I am wondering why one particular choice word (i.e. "mango 2") is not shown in the fuzzy match result. It may even be due to an error in my for-loops.

Below is my code:

    import pandas as pd
    from pandas import DataFrame
    from fuzzywuzzy import fuzz
    from fuzzywuzzy import process

    # Search key
    key = pd.DataFrame(data=['Apple A','apple','AP','Mango','mango'], columns=['Key'])

    # List of string choices to compare against search key
    choices = pd.DataFrame(data=['Apple 1', 'apple','mango 2', 'Mango','mgo'], columns=['Choice'])

    if __name__ == "__main__":
        ## Create lookup dictionary by parsing the choices
        data = {}
        for row in choices.loc[:,'Choice']:
            data[row[0]] = row

        ## For each row in the lookup compute the partial ratio
        Match_Results1 = pd.DataFrame({'score': [], 'row': [], 'found': []})
        for row in key.loc[:,'Key']:
            for found, score, matchrow in process.extract(row, data,scorer=fuzz.token_set_ratio, limit=100):
                if score >=0:
                    print('%d%% partial match: "%s" with "%s" ' % (score, row, found))
                    Match_Results1 = Match_Results1.append({'score': score, 'row': row, 'found': found}, ignore_index=True)
        print(Match_Results1)

The result I'm getting is as follows; it does not include the "mango 2" choice word at all.

          found      row  score
    0     apple  Apple A  100.0
    1   Apple 1  Apple A   86.0
    2     Mango  Apple A   17.0
    3       mgo  Apple A    0.0
    4     apple    apple  100.0
    5   Apple 1    apple  100.0
    6     Mango    apple   20.0
    7       mgo    apple    0.0
    8     apple       AP   57.0
    9   Apple 1       AP   44.0
    10    Mango       AP   29.0
    11      mgo       AP    0.0
    12    Mango    Mango  100.0
    13      mgo    Mango   75.0
    14    apple    Mango   20.0
    15  Apple 1    Mango   17.0
    16    Mango    mango  100.0
    17      mgo    mango   75.0
    18    apple    mango   20.0
    19  Apple 1    mango   17.0

I would appreciate it if someone could help. Thanks.
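
A likely cause, offered as a hedged sketch rather than a confirmed diagnosis: the lookup dictionary in the post is keyed by `row[0]`, which is only the first character of each choice. "mango 2" and "mgo" both start with "m", so the later one would replace the earlier one before `process.extract` ever sees it. That would also explain why only four matches come back per search key. The snippet below reproduces just the dictionary-building step in plain Python; the names `first_char_lookup`, `index_lookup` and `dropped` are illustrative and not from the original code.

```python
# Choices copied from the post above.
choices = ['Apple 1', 'apple', 'mango 2', 'Mango', 'mgo']

# Lookup built the same way as in the post: the key is only the FIRST
# character of each choice, so choices that share a first character
# overwrite one another.
first_char_lookup = {}
for choice in choices:
    first_char_lookup[choice[0]] = choice

# Alternative sketch (an assumption, not the poster's code): key each
# choice by its position in the list so that every key is unique.
index_lookup = {i: choice for i, choice in enumerate(choices)}

# Choices that did not survive in the first-character lookup.
dropped = [c for c in choices if c not in first_char_lookup.values()]

print(first_char_lookup)  # 'm' maps to 'mgo'; 'mango 2' was overwritten
print(index_lookup)       # all five choices are kept
print(dropped)            # ['mango 2']
```

If this is indeed the cause, passing either `index_lookup` or the plain list of choices to `process.extract` should let "mango 2" appear in the results, since both a list and a dictionary-like object are accepted as choices.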