Python Forum
sort search results by similarity of characters - Printable Version




sort search results by similarity of characters - jacksfrustration - Feb-16-2024

I have a stock market app that I'm building. The following code is part of a feature that lets the user search for a company's official name and its corresponding symbol, in order to view news about the company or its stock market value over the last few days.


    search_results=[data["name"] for data in DATA_DICT if search_term in data["name"].lower()]
I would like to sort these results by similarity. For instance, if the input was spam and the company names were ham, spam, and spammy, it would sort the list as spam, spammy, ham.


RE: sort search results by similarity of characters - menator01 - Feb-16-2024

You can have a look at sort and sorted.


RE: sort search results by similarity of characters - jacksfrustration - Feb-16-2024

(Feb-16-2024, 08:48 PM)menator01 Wrote: You can have a look at sort and sorted.

Won't that return the list sorted alphabetically? I don't need it to be alphabetical; I need it sorted by similarity to the company name. Maybe I can add points for each matching character in the string and create a list of dictionaries holding each name and its count of matching characters? But then how would I sort them based on those values? Maybe some kind of sorcery using the insert method?
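For the "how would I sort them based on those values" part, here is a minimal sketch of the list-of-dictionaries idea. Both sort and sorted accept a key argument, so the order does not have to be alphabetical and no insert tricks are needed. The names list and the scoring rule (the fraction of the name covered by the search term) are assumptions made up for illustration, not something from the original app.

```python
from operator import itemgetter

search_term = "spam"
names = ["spammy", "I love spam", "spam"]  # hypothetical search results

# Hypothetical score: how much of the name the search term covers.
# A name that is exactly the search term scores 1.0.
scored = [{"name": n, "score": len(search_term) / len(n)} for n in names]

# Sort by the "score" value of each dictionary, highest first.
scored.sort(key=itemgetter("score"), reverse=True)

sorted_names = [entry["name"] for entry in scored]
print(sorted_names)
```

Any other scoring function could be dropped in; the sorting step stays the same.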


RE: sort search results by similarity of characters - menator01 - Feb-16-2024

Can you give an example of the data structure?


RE: sort search results by similarity of characters - jacksfrustration - Feb-16-2024

(Feb-16-2024, 09:04 PM)menator01 Wrote: Can you give an example of the data structure?
I have a DATA_DICT created at the top of my script, which holds the corresponding names and symbols of all the companies from a JSON file. I then search it with a list comprehension like


    results=[data["name"] for data in DATA_DICT if search_term in data["name"].lower()]
I then want to sort the resulting list by similarity to the company name. Using a for loop and message boxes from tkinter, I pop up the search results one at a time and change the company_name Entry box text to the appropriate company name when the user clicks OK. I iterate through the list until I run out of search results.


RE: sort search results by similarity of characters - deanhystad - Feb-16-2024

I think sort is the wrong term. I think you want to filter your results based on similarity.

The easiest way to do this is to include only values that have a matching substring in the name. From your first example:
Quote: if the input was spam and the company names were ham,spam, spammy it would sort the list as spam, spammy, ham
Entering spam would match spam and spammy, but not ham. It would also match "I love spam" and "I am not a spammer".
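To illustrate the substring matching described above, here is a small sketch using the list comprehension from the first post. The shape of DATA_DICT (a list of dictionaries with "name" and "symbol" keys) and the company entries are assumptions, since the real data was not posted.

```python
# Hypothetical stand-in for the real DATA_DICT loaded from the JSON file.
DATA_DICT = [
    {"name": "Ham", "symbol": "HAM"},
    {"name": "Spam", "symbol": "SPM"},
    {"name": "Spammy", "symbol": "SPMY"},
    {"name": "I love spam", "symbol": "ILS"},
    {"name": "I am not a spammer", "symbol": "IANS"},
]

search_term = "spam"

# Keep only the names that contain the search term, ignoring case.
search_results = [
    data["name"] for data in DATA_DICT if search_term in data["name"].lower()
]
print(search_results)
```

Note that "Ham" is filtered out entirely, so it can never appear in a sorted version of this list.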

Going from looking for names that contain a substring to finding similar words is a huge jump in complexity. You should start by looking at the standard library module difflib, in particular the SequenceMatcher class.
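As a rough sketch of what that could look like, SequenceMatcher.ratio() returns a similarity value between 0 and 1 that can be used as a sort key. The similarity helper and the names list below are made up for illustration, and whether this ranking is good enough for real company names is something you would have to test.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


search_term = "spam"
names = ["ham", "spam", "spammy"]  # hypothetical company names

# Most similar names first.
ranked = sorted(names, key=lambda name: similarity(search_term, name), reverse=True)
print(ranked)
```

Unlike the substring test, this scores every name, so ham stays in the list and simply ranks last. The same module also has get_close_matches, which returns only the best few matches above a cutoff.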

If difflib doesn't provide results you like, you might want to look at fuzzywuzzy, which does a different kind of processing.