Python Forum
Drop a row if it contains a certain value in pandas - Printable Version


Drop a row if it contains a certain value in pandas - deltanov - Nov-07-2018

I have this text file:

LEU,LEID,PPP,YYY,LEO
'1','2','3','4','5'
'2','1','2','3','4'
'2','AA','','',''

I want to delete the rows where LEID='1'.

I wrote this code:
import pandas as pd

#import numpy
import os

originalFile=os.path.abspath("D:\\python\\test\\OriginalFile.csv")
df = pd.read_csv(originalFile)
#df.drop(df.LEID == '1')


df=df[df.LEID != '1']

df.to_csv('D:\\python\\test\\CorrectedFile.csv')
print (df)
Why is the row with LEID='1' not deleted?


RE: Drop a row if it contains a certain value in pandas - ichabod801 - Nov-08-2018

I would check to see if pd.read_csv is doing a type conversion such that LEID is an int rather than a string. If you want to keep it as a string, you can specify that with the dtype parameter. Otherwise you need to test for an int: df = df[df.LEID != 1].
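A minimal sketch of that check, assuming a cut-down, unquoted version of the data held in memory (csv_text is a hypothetical stand-in for the real file). It prints the inferred dtypes, then shows both options: comparing against an int, or forcing strings with dtype.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file, with the quotes left out.
csv_text = "LEU,LEID,PPP,YYY,LEO\n1,2,3,4,5\n2,1,2,3,4\n"

df = pd.read_csv(io.StringIO(csv_text))
print(df.dtypes)  # shows which type pandas inferred for each column

# LEID was inferred as an integer column, so comparing it with the
# string '1' matches nothing and no row is dropped.
unchanged = df[df.LEID != '1']

# Option 1: compare against an int.
by_int = df[df.LEID != 1]

# Option 2: read every column as a string, then compare against '1'.
df_str = pd.read_csv(io.StringIO(csv_text), dtype=str)
by_str = df_str[df_str.LEID != '1']

print(by_int)
print(by_str)
```

Either option drops the second data row. Which one fits depends on whether LEID is meant to hold numbers or arbitrary labels such as 'AA'.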


RE: Drop a row if it contains a certain value in pandas - deltanov - Nov-08-2018

Thanks @ichabod801. Apparently the problem is with the quotes.
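For anyone finding this later, here is a sketch of what the quotes do, assuming the data is exactly as posted in the first message (csv_text below is an in-memory copy, not the real file). By default read_csv treats only the double quote as a quote character, so the single quotes stay inside the values. Passing quotechar="'" is one way to strip them; dtype=str is an extra assumption here, to keep LEID as text.

```python
import io

import pandas as pd

# In-memory copy of the data from the first post (assumption: same content as the file).
csv_text = (
    "LEU,LEID,PPP,YYY,LEO\n"
    "'1','2','3','4','5'\n"
    "'2','1','2','3','4'\n"
    "'2','AA','','',''\n"
)

# Default settings: the single quotes are part of each value,
# so LEID holds "'1'" (with quotes) and never equals '1'.
raw = pd.read_csv(io.StringIO(csv_text))
print(raw.LEID.tolist())

# Tell pandas that ' is the quote character and keep the columns as strings.
fixed = pd.read_csv(io.StringIO(csv_text), quotechar="'", dtype=str)

# Now the comparison against '1' removes the intended row.
cleaned = fixed[fixed.LEID != '1']
print(cleaned)
```

Without changing how the file is read, comparing against the quoted value, df[df.LEID != "'1'"], should also work, but the quotes would then be carried into the output file.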


RE: Drop a row if it contains a certain value in pandas - Daredevil - Nov-08-2018

import os

import pandas as pd

originalFile = os.path.abspath("C:\\Users\\monis\\Desktop\\Document1.csv")
df = pd.read_csv(originalFile)
# Keep only the rows whose LEID is not '1'
df = df[df.LEID != '1']

df.to_csv('C:\\Users\\monis\\Desktop\\correctedfile.csv')
print(df)

Output:
   LEU  LEID  PPP  YYY  LEO
0    1     2  3.0  4.0  5.0
2    2  'AA'  NaN  NaN  NaN


RE: Drop a row if it contains a certain value in pandas - Daredevil - Nov-10-2018

import pandas as pd
import os
import csv

originalFile=os.path.abspath("C:\\Users\\mo\\Desktop\\Document1.csv")
df = pd.read_csv(originalFile)
df = df[df.LEID != '1']

df.to_csv('C:\\Users\\mo\\Desktop\\correctedfile.csv')
print (df)

Output:
   LEU  LEID  PPP  YYY  LEO
0    1     2  3.0  4.0  5.0
2    2  'AA'  NaN  NaN  NaN