Python Forum
The code I have written removes the desired number of rows, but wrong rows
#1
The Python code I have written removes the intended number of rows from a data frame, but they are not the rows I wish to remove. I am using Python 3.9 on a Windows 10 64-bit OS. I have examined my code intensively and conducted extensive searches of Google and Stack Overflow with no success.

I have attached a copy of my Jupyter Notebook script in the form of screenshots and will reference the lines of code and their corresponding screenshot files throughout this post.

Here is the School Quality Reports dataset from which I am trying to remove the rows, containing 1,238 rows in total (In [4], screenshot_2):

https://data.cityofnewyork.us/Education/.../9cz6-8qpz

I used the following code (In [5], screenshot_3) to generate a subset containing all rows which have null values for the 'Quality Review Rating' column. This code outputs 292 rows with unique values for 'DBN', the column of interest at index position 0:

school_quality_2013_2014_nulls = school_quality_2013_2014[school_quality_2013_2014['Quality Review Rating'].isnull()].copy()
school_quality_2013_2014_nulls

My next step was to determine which of these 292 'DBN' values were located in the two primary datasets I am using - one describing Mathematics examination scores and the other describing English Language Arts (ELA) examination scores:

https://data.cityofnewyork.us/Education/.../gcvr-n8qw (In [2], screenshot_1)
https://data.cityofnewyork.us/Education/.../jk35-yh5p (In [3], screenshot_1)

This determination was made by 1) merging unique 'DBN' values from the two primary datasets into a single data frame and 2) merging this new data frame with the 292-row subset. I used the following code (In [6], screenshot_3):

math_unique_DBN = pd.DataFrame({'DBN':math_exam_2013_2015['DBN'].unique()})
ela_unique_DBN = pd.DataFrame({'DBN':ela_exam_2013_2015['DBN'].unique()})
merged_exam_DBN = pd.merge(math_unique_DBN, ela_unique_DBN, on=['DBN'], how='inner')
merged_exam_nulls = pd.merge(merged_exam_DBN, school_quality_2013_2014_nulls, on=['DBN'], how='inner')
merged_exam_nulls

The code above outputs 152 rows, indicating that 152 of the DBN values from the School Quality Reports dataset are also contained in the two primary datasets. This leaves 140 rows to be removed from the School Quality Reports data frame. To retrieve these 140 rows, I used the following code (In [7], screenshot_3):

school_quality_2013_2014_nulls = school_quality_2013_2014_nulls.reset_index(drop=True)
school_quality_2013_2014_nulls.drop(merged_exam_nulls.index, inplace=True)
school_quality_2013_2014_nulls

The last two rows of output are shown in screenshot_4 (In [8]).
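
A possible reason the retrieved rows are not the intended ones: pd.merge returns a frame with a fresh 0..n-1 index, so merged_exam_nulls.index holds positions in the merge result, not the labels of the matching rows in school_quality_2013_2014_nulls. The sketch below illustrates this on small made-up frames. The names nulls, exam_dbn and merged and all DBN values are illustrative assumptions, not taken from the datasets.

```python
import pandas as pd

# Made-up stand-in for the null-rating subset (illustrative DBN values).
nulls = pd.DataFrame({"DBN": ["01M015", "84X001", "02M047", "84K002"]})
# Made-up stand-in for the DBN values shared by the two exam datasets.
exam_dbn = pd.DataFrame({"DBN": ["02M047", "01M015"]})

merged = pd.merge(exam_dbn, nulls, on="DBN", how="inner")

# The merge result is indexed 0..n-1, wherever the matches sat in `nulls`.
print(list(merged.index))

# Dropping by that index removes the rows labelled 0 and 1 from `nulls`,
# which keeps a matched school (02M047) and discards an unmatched one (84X001).
by_index = nulls.drop(merged.index)
print(by_index["DBN"].tolist())

# Selecting by DBN value keeps exactly the rows absent from the exam data.
by_dbn = nulls[~nulls["DBN"].isin(merged["DBN"])]
print(by_dbn["DBN"].tolist())
```

In this toy case both approaches return two rows, so the row count alone does not reveal the problem; only the DBN values do.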

I used the following code (In [9], screenshot_5) to remove the 140 rows:

school_quality_2013_2014.drop(school_quality_2013_2014_nulls.index, inplace=True)
school_quality_2013_2014

The above code reduced the School Quality Reports data frame to 1,098 rows, indicating that 140 rows were removed, but it did not remove the intended rows. For instance, the charter schools in the last rows of the data frame -- DBN beginning with '84' -- should have been removed, but they were not (In [9, 10], screenshot_5).
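
One way to remove rows by value rather than by index label is a boolean mask on 'DBN'. This matters here because reset_index(drop=True) on the subset discards the labels that tied it back to the full data frame. The following is only a minimal sketch on made-up data: quality and exam_dbn are hypothetical stand-ins for school_quality_2013_2014 and the 'DBN' column of merged_exam_DBN, and the DBN values and ratings are invented.

```python
import pandas as pd

# Hypothetical stand-in for school_quality_2013_2014 (values are made up).
quality = pd.DataFrame({
    "DBN": ["01M015", "01M019", "02M047", "84X001", "84K002"],
    "Quality Review Rating": [None, "Proficient", None, None, None],
})
# Hypothetical stand-in for the DBN values present in both exam datasets.
exam_dbn = pd.Series(["01M015", "01M019", "02M047"], name="DBN")

# Rows to remove: rating is null AND the school is absent from the exam data.
to_remove = (
    quality["Quality Review Rating"].isnull()
    & ~quality["DBN"].isin(exam_dbn)
)

# Keep everything else; the original index labels are left untouched.
cleaned = quality[~to_remove]
print(cleaned["DBN"].tolist())
print(int(to_remove.sum()))
```

If the counts in this post hold, the same mask built on the real data frames would be expected to flag 140 rows, but that has not been verified here.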

Please advise and thank you in advance.
Larz60+ wrote Dec-08-2021, 09:16 AM:
Please post all code, output and errors (in their entirety) between their respective tags. Refer to the BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.
Code as an image cannot be cut and pasted, and will dissuade many users from responding.


