Python Forum
Partial Matching Rows In Pandas DataFrame Query
#1
Hi there,

I have the following part of a Python script, which deletes rows from the DataFrame built from the last three URLs if there is matching data in the DataFrame built from the first three website URLs. However, some rows only match partially, e.g. in the LOCATION column: a row containing Texel plus other words in the string won't be deleted, because the LOCATION value in the other DataFrame is just Texel.

I know I can simply drop those rows, which I have already done, or remove the commas from the DataFrame rows earlier in the code, which would make filtering the data simpler.

However, I was wondering how to do two things: how to ignore the comma after the word, and how to delete rows from the chosen DataFrame when the first 4 letters (or, if I wanted, the last 4 letters) of the first word in its LOCATION string match the first word of a row in the other DataFrame?
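To make the question concrete, here is a minimal sketch of the kind of key being asked about: the first word of a LOCATION string with any trailing comma stripped, cut down to its first or last 4 letters. The helper name `location_key`, its parameters, and the sample string "Texel, Den Helder" are made up for illustration and are not part of the original script.

```python
def location_key(location, n=4, from_end=False):
    """Return the first n (or last n) letters of the first word of a string.

    A trailing comma on the first word is ignored.
    The function name and parameters are hypothetical.
    """
    words = str(location).split()
    if not words:
        return ""
    first_word = words[0].rstrip(",")  # "Texel," -> "Texel"
    return first_word[-n:] if from_end else first_word[:n]


# Made-up sample values, only to show the behaviour
print(location_key("Texel, Den Helder"))                 # Texe
print(location_key("Texel"))                             # Texe
print(location_key("Texel, Den Helder", from_end=True))  # exel
```

Two rows would then count as a match when their keys (and dates) are equal, rather than when one full string happens to contain the other.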

Here is the part of the code I mentioned:

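import pandas as pd

# Collect the index labels of rows in final_df_ that duplicate a row in final_df
remove = []
for i, row1 in final_df_.iterrows():
    loc = row1.iloc[0]   # first column (LOCATION)
    date = row1.iloc[1]  # second column (date)
    for j, row2 in final_df.iterrows():
        # substring match on the location, exact match on the date
        if loc in row2.iloc[0] and date == row2.iloc[1]:
            remove.append(i)
            break  # one match is enough; avoids duplicate labels

# iterrows() yields index labels, so drop by label rather than by position
final_df_ = final_df_.drop(index=remove).reset_index(drop=True)
full_df = pd.concat([final_df, final_df_], axis=0)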
Any help anyone could give me would be much appreciated.

Best Regards

Eddie Winch
#2
I have modified that part of the code to the following:

import pandas as pd

remove = []
for i, row1 in final_df_.iterrows():
    loc = row1.iloc[0][:4]     # first 4 characters of the LOCATION string
    # loc = row1.iloc[0][-4:]  # last 4 characters of the LOCATION string instead
    date = row1.iloc[1]
    for j, row2 in final_df.iterrows():
        # `loc in row2.iloc[0]` already covers `loc in row2.iloc[0][:4]`,
        # so the second half of the original `or` condition was redundant
        if loc in row2.iloc[0] and date == row2.iloc[1]:
            remove.append(i)
            break  # one match is enough; avoids duplicate labels

# iterrows() yields index labels, so drop by label rather than by position
final_df_ = final_df_.drop(index=remove).reset_index(drop=True)
full_df = pd.concat([final_df, final_df_], axis=0)
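This takes the first 4 characters of each LOCATION value and matches rows in the other DataFrame whose LOCATION contains those characters on the same date, so rows whose first word starts with the same 4 letters are matched.

The commented-out line does the same with the last 4 characters. Note that it slices the whole LOCATION string, so it only gives the last 4 letters of the first word when the string is a single word.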

Does anyone have any ideas on how to ignore the comma at the end of the first word in the rows?
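One possible approach, as a rough sketch only: build a comparison key from the first word of each LOCATION value with its trailing comma stripped, then drop the rows whose (key, date) pair also appears in the other DataFrame. The column names LOCATION and DATE, the helper `first_word_key`, and all the sample rows are assumptions made for illustration. Unlike the loop above, this compares keys for equality instead of testing whether one string contains the other.

```python
import pandas as pd

# Made-up sample data; the real DataFrames come from the scraped URLs.
final_df = pd.DataFrame({
    "LOCATION": ["Texel", "Leeuwarden"],
    "DATE": ["2019-06-01", "2019-06-02"],
})
final_df_ = pd.DataFrame({
    "LOCATION": ["Texel, Den Helder", "Volkel", "Leeuwarden Air Base"],
    "DATE": ["2019-06-01", "2019-06-01", "2019-06-03"],
})


def first_word_key(locations, n=4):
    """First n letters of the first word, ignoring a trailing comma (hypothetical helper)."""
    return locations.str.split().str[0].str.rstrip(",").str[:n]


# Pair each key with its date so both have to match
keys = pd.MultiIndex.from_arrays(
    [first_word_key(final_df["LOCATION"]), final_df["DATE"]]
)
candidates = pd.MultiIndex.from_arrays(
    [first_word_key(final_df_["LOCATION"]), final_df_["DATE"]]
)

# True where a row of final_df_ has the same (key, date) as some row of final_df
is_duplicate = candidates.isin(keys)

filtered = final_df_[~is_duplicate].reset_index(drop=True)
full_df = pd.concat([final_df, filtered], axis=0)
print(filtered)
```

With this sample data only the "Texel, Den Helder" row is dropped: "Leeuwarden Air Base" shares its key with a row in `final_df` but has a different date. For the last 4 letters of the first word, the final `.str[:n]` could be swapped for `.str[-n:]`.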

Regards

Eddie Winch