Python Forum
matching with SequenceMatcher ratio two dataframe
#1
Hello,

I use the SequenceMatcher ratio to match the rows of two dataframes by their best ratio.
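For context, here is a minimal sketch of what `ratio()` returns for a few of the strings used in this thread. The helper name `similar_text` is hypothetical and not part of the original attempt; it only illustrates one way to cope with cells that are not strings, which `SequenceMatcher` cannot compare directly.

```python
from difflib import SequenceMatcher


def similar(a, b):
    # ratio() is 2 * M / T, where M is the number of matching characters
    # and T is the total length of both strings.
    return SequenceMatcher(None, a, b).ratio()


def similar_text(a, b):
    # Hypothetical variant: convert both values to text first, so that
    # numbers such as 3 can be compared with strings such as "3".
    return SequenceMatcher(None, str(a), str(b)).ratio()


print(similar("pizza", "pizza"))  # identical strings
print(similar("pizza", "za"))     # partial overlap ("za")

try:
    similar("ze", 3)              # an int is not a sequence
except TypeError as exc:
    print("TypeError:", exc)

print(similar_text("3", 3))       # both sides converted to text
```

Identical strings give 1.0, and "pizza" against "za" gives 4/7 because two characters match out of seven in total.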

I want to check first whether the score between A and AA is good, then whether the score between B and BB is good, then whether the score between C and CC is good. If so, I add the line.

I have two dataframes. df1:
     A      B   C
0    pizza  ze  3
1    polo   fe  5
2    ninja  fi  NaN
and df2:
     AA     BB  CC
0    za     ze  NaN
1    po     ka  8
2    fe     fe  6
3    pizza  fi  3
4    polo   ko  5
5    ninja  3   pizza
I tried this function, but it doesn't work:
from difflib import SequenceMatcher

import numpy as np
import pandas as pd

def similar(a, b):
    # Similarity ratio between 0.0 and 1.0
    return SequenceMatcher(None, a, b).ratio()

order = []
score = []  # never filled in this attempt
for index, row in df1.iterrows():
    # Best A-vs-AA ratio over all rows of df2
    maxima = [similar(row['A'], j) for j in df2['AA']]
    best_ratio = max(maxima)
    if best_ratio > 0.9:
        maxima2 = [similar(row['B'], j) for j in df2['BB']]
        best_ratio2 = max(maxima2)
        if best_ratio2 > 0.9:
            maxima3 = [similar(row['C'], j) for j in df2['CC']]
            best_ratio = max(maxima3)
            # Only the C-vs-CC comparison decides which df2 row is kept
            best_row = np.argmax(maxima3)
            order.append(best_row)
df2 = df2.iloc[order].reset_index()
merge = pd.concat([df1, df2], axis=1)
I want a dataframe like this:
     A      B   C    AA     BB  CC     score
0    pizza  ze  3    pizza  ze  3      100
1    polo   fe  5    polo   ko  5      75
2    ninja  fi  NaN  ninja  3   pizza  30
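A minimal sketch of one possible reading of the goal, under several assumptions that the post does not confirm: every cell is held as text, missing values are compared as empty strings, the df2 row is chosen by the best A-vs-AA ratio alone, and `score` is the mean of the three ratios scaled to 0-100. The name `THRESHOLD` and the scoring rule are hypothetical, so the scores will not reproduce the 100 / 75 / 30 shown above.

```python
from difflib import SequenceMatcher

import numpy as np
import pandas as pd


def similar(a, b):
    # Assumption: missing values are compared as empty strings,
    # and everything else is compared as text.
    a = "" if pd.isna(a) else str(a)
    b = "" if pd.isna(b) else str(b)
    return SequenceMatcher(None, a, b).ratio()


# Sample data from the post, with all cells stored as text (assumption).
df1 = pd.DataFrame({"A": ["pizza", "polo", "ninja"],
                    "B": ["ze", "fe", "fi"],
                    "C": ["3", "5", None]})
df2 = pd.DataFrame({"AA": ["za", "po", "fe", "pizza", "polo", "ninja"],
                    "BB": ["ze", "ka", "fe", "fi", "ko", "3"],
                    "CC": [None, "8", "6", "3", "5", "pizza"]})

THRESHOLD = 0.9  # hypothetical cut-off, taken from the original attempt

rows = []
for _, row in df1.iterrows():
    # Pick the df2 row whose AA value is closest to A.
    ratios_a = [similar(row["A"], value) for value in df2["AA"]]
    best = int(np.argmax(ratios_a))
    if ratios_a[best] <= THRESHOLD:
        continue  # no sufficiently close AA value, skip this df1 row
    match = df2.iloc[best]
    # Compare the remaining columns against that same df2 row.
    ratios = [ratios_a[best],
              similar(row["B"], match["BB"]),
              similar(row["C"], match["CC"])]
    score = round(100 * sum(ratios) / len(ratios))
    rows.append({**row.to_dict(), **match.to_dict(), "score": score})

merged = pd.DataFrame(rows)
print(merged)
```

Building the result row by row keeps each df1 row next to its own match, which avoids the misalignment that `pd.concat(..., axis=1)` can cause when some df1 rows have no match.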
#2
Quote:I tried this function, but it doesn't work

What about it doesn't work? Are there errors? What are they?
Or is the output not what you expected? What was the output, and what did you expect?
#3
(Feb-11-2021, 09:58 PM)nilamo Wrote:
Quote:I tried this function, but it doesn't work

What about it doesn't work? Are there errors? What are they?
Or is the output not what you expected? What was the output, and what did you expect?
👌🆗️