Python Forum
How to search for specific string in Pandas dataframe
#1
Hi,
I'm trying to extract rows from my dataframe using Pandas, based on a specific column named Equipe_Junior. So far I have been able to extract my data when asking for the complete string, for example: Quebec Remparts [QMJHL]. But I would like to go through my dataframe for all [QMJHL] or [OHL] or any junior league so I can work stats with that, without having to ask for a specific junior team, just the league.

This is my code and results. Thanks for your help.

import pandas as pd
data = pd.read_csv(r'C:\Users\ben\PycharmProjects\draft2020\hockey_draft2012_click_test.csv')
df = pd.DataFrame(data, columns=['Ronde', 'Equipe', 'Nom', 'Equipe_Junior', 'MJ'])  # keep only these columns from the csv
df = df.fillna(0)  # replace NaN with 0
select = df.loc[df['Equipe_Junior'] == 'Quebec Remparts [QMJHL]']  # select players from that team only
print(select)
Output:
     Ronde   Equipe                 Nom            Equipe_Junior     MJ
11       1  Buffalo  Mikhail Grigorenko  Quebec Remparts [QMJHL]  217.0
123      5  Calgary         Ryan Culkin  Quebec Remparts [QMJHL]    0.0
165      6   Ottawa   Francois Brassard  Quebec Remparts [QMJHL]    0.0
Larz60+ wrote Nov-02-2020, 11:35 AM:
Please post all code, output and errors (in their entirety) between their respective tags. Refer to the BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.

Fixed for you this time, please use bbcode tags on your future posts. Thank you.
#2
(Oct-22-2020, 07:19 PM)Coding_Jam Wrote: I would like to go through my dataframe for all [QMJHL] or [OHL] or any junior league so I can work stats with that, without having to ask for a specific junior team, just the league.

Hey! Maybe you could use loc and str.contains? Something like this would select the rows containing either "QMJHL" or "OHL":

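df.loc[df['Equipe_Junior'].str.contains(r'QMJHL|OHL', na=False)]
In the code above, str.contains builds a boolean mask that is True for the rows whose Equipe_Junior contains either league, and loc selects the rows of the dataframe where the mask is True. The na=False argument makes entries that are not strings (such as the 0 values left by fillna(0)) count as non-matches instead of NaN, which loc cannot use in a mask.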
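
To make the mask idea concrete, here is a minimal sketch on a small made-up dataframe. The rows (apart from the Quebec Remparts one from the question), the placeholder player names, and the Ligue column name are assumptions for illustration, not your real data. It also shows one possible way to pull the league tag into its own column with str.extract, in case that helps with per-league stats.

```python
import pandas as pd

# Hypothetical sample rows standing in for the CSV from the question.
df = pd.DataFrame(
    {
        "Nom": ["Mikhail Grigorenko", "Player A", "Player B", "Player C"],
        "Equipe_Junior": [
            "Quebec Remparts [QMJHL]",
            "London Knights [OHL]",
            "Portland Winterhawks [WHL]",
            0,  # what a missing team looks like after df.fillna(0)
        ],
    }
)

# Boolean mask: True where the team string contains either league tag.
# The brackets are escaped so only the "[LEAGUE]" suffix is matched.
# na=False turns non-string entries (the 0 placeholder) into False.
mask = df["Equipe_Junior"].str.contains(r"\[(?:QMJHL|OHL)\]", na=False)
selected = df.loc[mask]
print(selected)

# Optional: capture whatever sits between the brackets into a new column
# ("Ligue" is a made-up name), then count players per league.
df["Ligue"] = df["Equipe_Junior"].str.extract(r"\[([^\]]+)\]", expand=False)
print(df["Ligue"].value_counts())
```

With a league column like this you could also filter with df.loc[df["Ligue"].isin(["QMJHL", "OHL"])] or group by league, without writing a regex for each query.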

Hope it works!

Best,

E