Python Forum
Separating Names & Counting - Printable Version


Separating Names & Counting - Steven5055 - Nov-03-2022

Hi,
I'm doing a Uni course & I'm a bit stuck & would appreciate some help please.

I have a column in a data frame that contains data as per below (no spacing)
ATTENDEES
John,Jan,Paul
Kylie,Paul,Scott,Jason
Jan,Scott,John

I'm trying to separate the data so I can show who is attending in a list like
ATTENDEES
John
Jan
Paul
Kylie
Scott
Jason

Then be able to count how many are attending
ATTENDEES
John 2
Jan 2
Paul 2
Kylie 1
Scott 2
Jason 1

I've tried splitting the names using
names=masterlist['attendees'].str.split(pat = ',', expand = True)

and counting by using
CountNames=names.groupby(0).count()

but it does not include all the data, only the 1st new column from the split,
and the counting is across all columns.

Thanks in advance
Steve


RE: Separating Names & Counting - Larz60+ - Nov-03-2022

Please show what you have tried, working or not.


RE: Separating Names & Counting - Steven5055 - Nov-03-2022

(Nov-03-2022, 07:32 AM)Larz60+ Wrote: Please show what you have tried, working or not.

import pandas as pd
import os
party = pd.read_csv('party.csv')
CountPeople=party['ATTENDEES'].str.split(pat = ',', expand = True)
CountPeople=CountPeople.groupby(0).count()
party
CountPeople
I've also included an attachment, so you can see the outputs.
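
For readers without the attachment, here is a small sketch of what this code produces. It assumes an in-memory frame built from the sample rows in the first post instead of party.csv, and the names wide and first_name_groups are only illustrative.

```python
import pandas as pd

# Stand-in for party.csv, built from the sample rows in the first post.
party = pd.DataFrame({"ATTENDEES": ["John,Jan,Paul",
                                    "Kylie,Paul,Scott,Jason",
                                    "Jan,Scott,John"]})

# expand=True spreads each row's names across columns 0, 1, 2, 3.
wide = party["ATTENDEES"].str.split(pat=",", expand=True)
print(wide)

# groupby(0) groups rows by the FIRST name only, and count() then reports
# how many non-missing values each remaining column has per group.
first_name_groups = wide.groupby(0).count()
print(first_name_groups)
```

Names that never appear first in a row, such as Paul or Scott, never become a group, which would explain why the result looks incomplete.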


RE: Separating Names & Counting - DeaD_EyE - Nov-03-2022

Post deleted. It's in homework. Providing the solution is not so good.


RE: Separating Names & Counting - Steven5055 - Nov-04-2022

OK, so on further reading I've found this post.

how often does a word occur in this column?
https://python-forum.io/thread-10857.html?highlight=genres

If I copy & paste my data into python as per the example it works, but when I try & read it from the CSV file it does not work.

import pandas as pd
import os
from collections import Counter
party = pd.read_csv('party.csv')
data = party['ATTENDEES']
data_list = data.replace('\n', ',')
data_list = data_list.strip().split(',')
print(Counter(data_list).most_common(15))
Could someone please explain why it works when I use this:
data = 'John,Jan,Paul,Kylie,Paul,Scott,Jason,Jan,Scott,John'
but does not work when I use this:
party = pd.read_csv('party.csv')
data = party['ATTENDEES']
I'd assume it has something to do with multiple rows or with not adding a comma after each row.


RE: Separating Names & Counting - deanhystad - Nov-04-2022

You are trying to treat data_list like a str. It is not a str, it is a pandas Series, so str methods such as strip() and split() are not available on it directly.

Maybe you could join the ATTENDEES together and make a str.
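
A minimal sketch of the difference, assuming an in-memory Series as a stand-in for party['ATTENDEES'] (the sample rows come from the first post, and the variable names are only illustrative):

```python
import pandas as pd

# Stand-in for party['ATTENDEES'] (assumed sample data from the first post).
data = pd.Series(["John,Jan,Paul", "Kylie,Paul,Scott,Jason", "Jan,Scott,John"],
                 name="ATTENDEES")

print(type(data).__name__)      # Series, not str
print(hasattr(data, "strip"))   # False: strip() is a str method, not a Series method

# Joining the rows with a comma yields one ordinary str,
# after which str methods such as split() apply.
joined = ",".join(data)
print(type(joined).__name__)    # str
names = joined.split(",")
```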


RE: Separating Names & Counting - Steven5055 - Nov-06-2022

It worked, thanks for the guidance.
Was there another way I could have done this?

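import pandas as pd
from collections import Counter
party = pd.read_csv('party.csv')
data = party['ATTENDEES']
# Join all rows into one comma-separated str, then split it into names.
# (Named "names" rather than "list" to avoid shadowing the built-in.)
names = data.str.cat(sep=',').strip().split(',')
print(Counter(names).most_common())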
Output:
[('John', 2), ('Jan', 2), ('Paul', 2), ('Scott', 2), ('Kylie', 1), ('Jason', 1)]

Also, is there any way of listing the names up/down rather than left/right?
Thanks


RE: Separating Names & Counting - deanhystad - Nov-06-2022

I was thinking of using str.join().
import pandas as pd
df = pd.read_csv("party.csv")
names = ",".join(df["ATTENDEES"]).split(",")
print(names)
Output:
['John', 'Jan', 'Paul', 'Kylie', 'Paul', 'Scott', 'Jason', 'Jan', 'Scott', 'John']
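
Another way, staying inside pandas, might be to split each row into a list, explode the lists into one name per row, and count with value_counts(). This is a sketch rather than part of the original answer, and the in-memory frame standing in for party.csv is an assumption.

```python
import pandas as pd

# Stand-in for party.csv, built from the sample rows in the first post.
df = pd.DataFrame({"ATTENDEES": ["John,Jan,Paul",
                                 "Kylie,Paul,Scott,Jason",
                                 "Jan,Scott,John"]})

# Without expand=True, str.split() gives one list of names per row.
# explode() then turns every list element into its own row.
names = df["ATTENDEES"].str.split(",").explode()

# value_counts() tallies how often each name occurs.
counts = names.value_counts()
print(counts)
```

Because value_counts() returns a Series, the result also prints one name per line.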
Quote: Also is there any way of listing the names up/down rather than left/right?
Of course, but you'll need to control the printing yourself instead of using the default str representation of a list.

You could use join().
print("\n".join(names))
Output:
John
Jan
Paul
Kylie
Paul
Scott
Jason
Jan
Scott
John
Or you could print a name at a time in a for loop.
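
A sketch of the loop version, with the names list from the output above typed in directly so the snippet runs on its own. The second loop, which prints each name with its count via collections.Counter, goes slightly beyond the reply and is included only as an illustration.

```python
from collections import Counter

names = ['John', 'Jan', 'Paul', 'Kylie', 'Paul',
         'Scott', 'Jason', 'Jan', 'Scott', 'John']

# One name per line.
for name in names:
    print(name)

# One "name count" pair per line, most frequent first.
for name, count in Counter(names).most_common():
    print(name, count)
```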