Python Forum

Full Version: Generalize the topic from a feature
Hi,

I have two columns (Description and Type). For each type, I want to know what general topic it indicates.
For example, type A indicates musical instruments.

Given the sample below, if there are over a thousand records, how can I use Python to identify the topic?

import pandas as pd
df = pd.DataFrame({'Description': ['Violin','Dog','UK','Cat','Piano','Guitar','USA'],'Type': ['A','C','B','C','A','A','B']})
Thanks.
You could keep a mapping (i.e. a dictionary) from the type values to descriptive strings, or replace the type values with those more descriptive strings. If what you're asking is how to generate that mapping automatically, I suspect that falls within the realm of natural language processing (NLP). Perhaps NLP libraries include word categorisations like this, but I don't know.
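A minimal sketch of the dictionary approach, using the sample data from the question. The topic labels in `type_to_topic` and the `Topic` column name are assumptions made up for illustration; they are written by hand after looking at the grouped descriptions, not derived automatically.

```python
import pandas as pd

df = pd.DataFrame({
    'Description': ['Violin', 'Dog', 'UK', 'Cat', 'Piano', 'Guitar', 'USA'],
    'Type': ['A', 'C', 'B', 'C', 'A', 'A', 'B'],
})

# Collect the descriptions seen for each type so they can be reviewed by eye.
examples_by_type = df.groupby('Type')['Description'].agg(list).to_dict()

# Hypothetical hand-written mapping, chosen after reading the examples above.
type_to_topic = {
    'A': 'musical instruments',
    'B': 'countries',
    'C': 'animals',
}

# Attach the topic to every row via the Type column.
df['Topic'] = df['Type'].map(type_to_topic)

print(examples_by_type)
print(df)
```

With thousands of records the grouping step still gives one short list per type to inspect, so the manual part only scales with the number of distinct types, not the number of rows.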