Python Forum

Full Version: Problem with number of rows containing null values
After cleaning all the rows that have null values, why do the columns still have uneven row counts?

The data set and my code can be found here: https://drive.google.com/drive/folders/1...sp=sharing

These lines:

cursor.execute('SELECT * FROM VG_sale')

query = cursor.fetchall()

rows = pd.DataFrame(query, columns=(['RANK','NAME','PLATFORM','YEAR','GENRE','PUBLISHER','NA_SALES','EU_SALES','JP_SALES','OTHER_SALES','GLOBAL_SALES']))

print(rows.count())
showed me that the columns Year, NA_Sales, JP_Sales and EU_Sales don't have 16598 rows...
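For context, `DataFrame.count()` reports the number of non-null cells in each column, not the number of rows, so a column with a lower count still contains nulls. A minimal sketch on made-up toy data (the values below are hypothetical and not taken from the VG_sale table):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real table
rows = pd.DataFrame({
    "NAME": ["A", "B", "C", "D"],
    "YEAR": [2006.0, np.nan, 2008.0, np.nan],
    "NA_SALES": [41.49, 29.08, np.nan, 15.75],
})

non_null = rows.count()      # non-null cells per column
missing = rows.isna().sum()  # null cells per column
total_rows = len(rows)       # actual number of rows

print(non_null)
print(missing)
print(total_rows)
```

Here `non_null + missing` equals `total_rows` for every column, so `rows.isna().sum()` is a quick way to see which columns still hold nulls.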
How did you clean the rows that have null values?
I replaced null values with the median. I also replaced "0" with "0.01":

median = df['Other_Sales'].median()
#print(median) 
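# Assign the result back; inplace=True on a column selection may not update df in newer pandas versions
df['Other_Sales'] = df['Other_Sales'].fillna(median)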
'SELECT * FROM VG_sale' is a database query, isn't it? That SELECT will retrieve the same number of rows every time unless you DELETE rows (or INSERT rows, but you're not talking about that). An UPDATE changes column values but does not change the number of rows.
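To illustrate the point, here is a rough sketch using an in-memory SQLite database. It assumes the cleaning was done on a DataFrame `df` and never written back to the table, which may or may not match the original code. The table contents and the `load` helper are hypothetical.

```python
import sqlite3
import pandas as pd

# Hypothetical in-memory stand-in for the real database
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE VG_sale (NAME TEXT, OTHER_SALES REAL)")
cursor.executemany(
    "INSERT INTO VG_sale VALUES (?, ?)",
    [("A", 8.46), ("B", None), ("C", 3.31)],
)
conn.commit()

def load():
    """Hypothetical helper: re-run the SELECT and build a DataFrame."""
    cursor.execute("SELECT * FROM VG_sale")
    return pd.DataFrame(cursor.fetchall(), columns=["NAME", "OTHER_SALES"])

# Filling nulls in the DataFrame only changes the in-memory copy
df = load()
median = float(df["OTHER_SALES"].median())
df["OTHER_SALES"] = df["OTHER_SALES"].fillna(median)
filled_count = int(df["OTHER_SALES"].count())

# A fresh SELECT still returns the NULL stored in the table
rows = load()
reloaded_count = int(rows["OTHER_SALES"].count())

# An UPDATE changes the stored values but not the number of rows
cursor.execute(
    "UPDATE VG_sale SET OTHER_SALES = ? WHERE OTHER_SALES IS NULL",
    (median,),
)
conn.commit()
after_update = load()
updated_count = int(after_update["OTHER_SALES"].count())

print(filled_count, reloaded_count, updated_count, len(after_update))
```

In this sketch the re-queried `rows` still shows a lower count than the filled `df` until the UPDATE is written to the table, and the number of rows stays the same throughout.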