Posts: 6
Threads: 2
Joined: Nov 2022
Nov-03-2022, 06:00 AM
(This post was last modified: Nov-03-2022, 06:00 AM by Steven5055.)
Hi,
I'm doing a Uni course & I'm a bit stuck & would appreciate some help please.
I have a column in a data frame that contains data as per below (no spacing)
ATTENDEES
John,Jan,Paul
Kylie,Paul,Scott,Jason
Jan,Scott,John
I'm trying to separate the data so I can show who is attending in a list like
ATTENDEES
John
Jan
Paul
Kylie
Scott
Jason
Then be able to count how many are attending
ATTENDEES
John 2
Jan 2
Paul 2
Kylie 1
Scott 2
Jason 1
I've tried splitting the names using
names=masterlist['attendees'].str.split(pat = ',', expand = True)
and counting by using
CountNames=names.groupby(0).count()
but it does not include all the data: it only groups on the 1st new column from the split,
and the counting is across all columns.
Thanks in advance
Steve
Posts: 12,031
Threads: 485
Joined: Sep 2016
Please show what you have tried, working or not.
Posts: 6
Threads: 2
Joined: Nov 2022
Nov-03-2022, 08:20 AM
(This post was last modified: Nov-03-2022, 09:16 PM by Steven5055.)
(Nov-03-2022, 07:32 AM)Larz60+ Wrote: Please show what you have tried, working or not.
import pandas as pd
import os
party = pd.read_csv('party.csv')
CountPeople=party['ATTENDEES'].str.split(pat = ',', expand = True)
CountPeople=CountPeople.groupby(0).count()
party
CountPeople
I've also included an attachment, so you can see the outputs.
Attached Files
party.csv (Size: 105 bytes / Downloads: 103)
Posts: 2,126
Threads: 11
Joined: May 2017
Nov-03-2022, 10:08 AM
(This post was last modified: Nov-03-2022, 10:08 AM by DeaD_EyE.)
Post deleted. It's in homework. Providing the solution is not so good.
Posts: 6
Threads: 2
Joined: Nov 2022
OK, so on further reading I've found this post.
how often does a word occur in this column?
https://python-forum.io/thread-10857.htm...ght=genres
If I copy & paste my data into python as per the example it works, but when I try & read it from the CSV file it does not work.
import pandas as pd
import os
party = pd.read_csv('party.csv')
data = party['ATTENDEES']
data_list = data.replace('\n', ',')
data_list = data_list.strip().split(',')
print(Counter(data_list).most_common(15))
Could someone please explain why this works when I use
data = 'John,Jan,Paul,Kylie,Paul,Scott,Jason,Jan,Scott,John'
but does not work when I use
party = pd.read_csv('party.csv')
data = party['ATTENDEES']
I'd assume it has something to do with multiple rows or not adding a comma after each row.
Posts: 6,800
Threads: 20
Joined: Feb 2020
You are trying to treat data_list like a str. It is not a str, it is a pandas Series, so str methods such as strip() and split() cannot be called on it directly.
Maybe you could join the ATTENDEES together and make a str.
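To see the difference concretely, here is a minimal sketch. It builds the column inline instead of reading party.csv (an assumption, so the snippet runs on its own) and shows that the column is a Series without a strip() method, while joining its rows gives an ordinary str.

```python
import pandas as pd

# Inline stand-in for party.csv so the sketch is self-contained (assumption).
party = pd.DataFrame({"ATTENDEES": ["John,Jan,Paul",
                                    "Kylie,Paul,Scott,Jason",
                                    "Jan,Scott,John"]})

data = party["ATTENDEES"]
print(type(data).__name__)      # Series, not str
print(hasattr(data, "strip"))   # False: str methods are not available directly

# Joining the rows with a comma produces one plain str.
joined = ",".join(data)
print(type(joined).__name__)    # str
print(joined)
```

Once the rows are joined into one str, the split-and-count approach from the linked thread applies unchanged.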
Posts: 6
Threads: 2
Joined: Nov 2022
It worked, thanks for the guidance.
Was there another way I could have done this?
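import pandas as pd
from collections import Counter
party = pd.read_csv('party.csv')
data = party['ATTENDEES']
# Join all rows into one comma-separated str, then split it into names.
# (Named "names" rather than "list" to avoid shadowing the built-in.)
names = data.str.cat(sep=',').strip().split(',')
print(Counter(names).most_common())
Output: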
[('John', 2), ('Jan', 2), ('Paul', 2), ('Scott', 2), ('Kylie', 1), ('Jason', 1)]
Also, is there any way of listing the names up/down rather than left/right?
Thanks
Posts: 6,800
Threads: 20
Joined: Feb 2020
Nov-06-2022, 12:52 PM
(This post was last modified: Nov-06-2022, 12:52 PM by deanhystad.)
I was thinking of using str.join().
import pandas as pd
df = pd.read_csv("party.csv")
names = ",".join(df["ATTENDEES"]).split(",")
print(names)
Output:
['John', 'Jan', 'Paul', 'Kylie', 'Paul', 'Scott', 'Jason', 'Jan', 'Scott', 'John']
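Another route that stays inside pandas, offered as a sketch rather than the one right answer: split each row into a list (without expand=True), give every name its own row with explode(), and let value_counts() do the counting. The inline DataFrame is an assumption standing in for party.csv.

```python
import pandas as pd

# Inline stand-in for party.csv so the sketch is self-contained (assumption).
party = pd.DataFrame({"ATTENDEES": ["John,Jan,Paul",
                                    "Kylie,Paul,Scott,Jason",
                                    "Jan,Scott,John"]})

# Split each row into a list of names, then give every name its own row.
names = party["ATTENDEES"].str.split(",").explode()

# Count how often each name appears; a Series prints one entry per line.
counts = names.value_counts()
print(counts)
```

Because the result is a Series, printing it lists the names vertically without any extra formatting.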
Quote: Also, is there any way of listing the names up/down rather than left/right?
Of course, but you'll need to control the printing yourself instead of using the default str representation of a list.
You could use join().
print("\n".join(names))
Output:
John
Jan
Paul
Kylie
Paul
Scott
Jason
Jan
Scott
John
Or you could print a name at a time in a for loop.
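A minimal sketch of the loop version, with the names written inline so it runs by itself. The second loop is an extra illustration, not something from the posts above: it applies the same idea to the Counter result so each name and its count land on their own line.

```python
from collections import Counter

# Same names as in the printed list above, written inline for a standalone run.
names = ['John', 'Jan', 'Paul', 'Kylie', 'Paul',
         'Scott', 'Jason', 'Jan', 'Scott', 'John']

# One name per line.
for name in names:
    print(name)

# The same idea for the counts: one "name count" pair per line.
counts = Counter(names)
for name, count in counts.most_common():
    print(name, count)
```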