(Jun-10-2018, 03:19 PM)Larz60+ Wrote: Use collections:
from collections import Counter

data = 'Genres\nAction|Adventure\nAction|Adventure|Animation\nAction|Adventure|Animation|Comedy|Drama\n' \
       'Action|Adventure|Animation|Comedy|Family\nAction|Adventure|Animation|Drama|Family\n' \
       'Action|Adventure|Animation|Family\nAction|Adventure|Animation|Family|Fantasy\n' \
       'Action|Adventure|Animation|Family|Mystery\nAction|Adventure|Animation|Family|Science Fiction\n' \
       'Action|Adventure|Animation|Fantasy\nAction|Adventure|Animation|Fantasy|Horror\n' \
       'Action|Adventure|Animation|Fantasy|Science Fiction\n'
data_list = data.strip().split('|')
print(Counter(data_list).most_common(1))

Results:
Output: [('Adventure', 11)]
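With \n in the middle of the string, the Counter produces a wrong result: splitting on '|' alone leaves tokens such as 'Adventure\nAction' that span two rows. Replace the newlines with '|' before splitting:

data_list = data.strip().replace('\n', '|').split('|')
print(Counter(data_list).most_common(1))

The right result:
Output: [('Action', 12)]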
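Two details are easy to miss in the corrected version: the header word 'Genres' is still counted as if it were a genre, and 'Action' and 'Adventure' are actually tied (both appear in every row), so most_common(1) only shows whichever was seen first. A minimal sketch on a shortened sample in the same format (the variable names rows and genre_counts are my own, not from the quoted post):

```python
from collections import Counter

# Shortened sample in the same layout as the quoted data:
# a header line, then one '|'-separated row per line.
data = (
    'Genres\n'
    'Action|Adventure\n'
    'Action|Adventure|Animation\n'
    'Action|Adventure|Animation|Comedy|Drama\n'
)

# Drop the header line, then split every remaining row on '|'.
rows = data.strip().splitlines()[1:]
genre_counts = Counter(genre for row in rows for genre in row.split('|'))

# Ask for more than one entry so a tie at the top is visible.
print(genre_counts.most_common(2))
```

Asking for most_common(2) instead of most_common(1) makes the tie visible rather than hiding it behind insertion order.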
But I have a nagging suspicion that the column in the OP is in a DataFrame.
Oops, missed the post above.
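If the column really is in a DataFrame, something along these lines might work. This is only a sketch: the column name 'Genres' and the sample rows are assumptions based on the string above, not the OP's actual data, and Series.explode needs pandas 0.25 or newer.

```python
import pandas as pd

# Hypothetical frame; the real one in the OP may look different.
df = pd.DataFrame({
    'Genres': [
        'Action|Adventure',
        'Action|Adventure|Animation',
        'Action|Comedy',
    ]
})

# Split each cell on '|', put every genre on its own row, then count.
genre_counts = df['Genres'].str.split('|').explode().value_counts()

# Most frequent genre and how many times it occurs.
print(genre_counts.idxmax(), genre_counts.max())
```

Because the header is the column name here and not a data row, there is no 'Genres' token to filter out.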
Test everything in a Python shell (IPython, Azure Notebook, etc.)
- Someone gave you advice you liked? Test it - maybe the advice was actually bad.
- Someone gave you advice you think is bad? Test it before arguing - maybe it was good.
- You posted a claim that something you did not test works? Be prepared to eat your hat.