Python Forum
Check text contains words similar to themes/topics (thesaurus) - Printable Version


Check text contains words similar to themes/topics (thesaurus) - Bec - Jul-28-2020

I'm a beginner in Python and need some help (either code or a pointer in the right direction).

I have some free-text fields that I need to assess against a completion benchmark. For example, salespeople have to complete a free-text field based on the customer's needs, and I need to flag each entry as "pass" or "not pass" for training purposes. "Not pass" is where people are lazy and enter N/A or junk text, or where they don't mention anything about the required topic. A "pass" needs to contain relevant information relating to the field's topic/theme.

I'm getting some examples of what are considered 'good' completions. I'm thinking of basing the first version on length and the presence of keywords. Checking that the text is long enough (a minimum of 5 words) is fine; the part I need help with is working out whether any keywords in the text match the topic/theme.

For example, for the field based on the customer's current situation, I might look for words related to situation, current, help, doing, needs, etc. For the field based on next steps, I might look for words related to going, decision, recommend, choose.
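A rough sketch of that first version, using only the standard library, might look like the following. The field names (`current_situation`, `next_steps`) and the function names are made up for illustration; the keyword lists and the 5-word minimum come from the description above.

```python
import re

# Hypothetical field names; the keywords are the examples listed in the post.
TOPIC_KEYWORDS = {
    "current_situation": {"situation", "current", "help", "doing", "needs"},
    "next_steps": {"going", "decision", "recommend", "choose"},
}

MIN_WORDS = 5  # minimum length mentioned in the post


def tokenize(text):
    """Lowercase the text and split it into words (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def assess(text, topic):
    """Return "pass" if the text is long enough and mentions a topic keyword."""
    words = tokenize(text)
    if len(words) < MIN_WORDS:
        return "not pass"
    # Exact word match only: "need" would not match the keyword "needs".
    if not TOPIC_KEYWORDS[topic].intersection(words):
        return "not pass"
    return "pass"


print(assess("N/A", "current_situation"))
print(assess("Customer needs help with their current supplier", "current_situation"))
```

Because this only matches exact words, different word forms and synonyms are missed, which is the gap a thesaurus-style lookup would need to fill.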

Any ideas/help would be appreciated!


RE: Check text contains words similar to themes/topics (thesaurus) - Larz60+ - Jul-28-2020

One of these packages will probably do the trick: https://pypi.org/search/?q=synonym&o=
I'll leave it up to you to choose.

Also, depending on how far you want to get into the subject, there is a package that does this as a small part of what it's capable of: NLTK, the Natural Language Toolkit.
Learning the entire package is not trivial, but using it as a simple synonym producer might not be so bad; see: https://www.w3resource.com/python-exercises/nltk/nltk-corpus-exercise-7.php

The NLTK package itself is here: https://www.nltk.org/
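As a minimal sketch of the synonym idea, something like the following could expand a keyword list through NLTK's WordNet corpus. It assumes `nltk` is installed and that `nltk.download("wordnet")` has been run once; the function names `wordnet_synonyms` and `expand_keywords` are made up for illustration and are not part of NLTK.

```python
def wordnet_synonyms(word):
    """Look up synonyms of a word in WordNet via NLTK.

    Assumes nltk is installed and nltk.download("wordnet") has been run once.
    """
    # Imported here so the rest of the module works without NLTK installed.
    from nltk.corpus import wordnet

    synonyms = set()
    for synset in wordnet.synsets(word):
        for lemma in synset.lemmas():
            # WordNet joins multi-word lemmas with underscores.
            synonyms.add(lemma.name().replace("_", " ").lower())
    return synonyms


def expand_keywords(keywords, lookup=wordnet_synonyms):
    """Return the keywords plus every synonym the lookup function finds."""
    expanded = set()
    for word in keywords:
        expanded.add(word.lower())
        expanded.update(lookup(word))
    return expanded
```

Calling `expand_keywords(["recommend", "choose"])` would return a larger set of words to check the free text against. WordNet returns synonyms for every sense of a word, so the expanded list may need manual pruning before it is used for pass/not pass decisions.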