Python Forum

Reading a csv file
When I run some Python 3.8.2 code, I get the following error:

Error:
KeyError                                  Traceback (most recent call last)
<ipython-input-2-bfad634aefbd> in <module>
      1 from sklearn.model_selection import train_test_split
----> 2 X= wine.drop('quality', axis = 1)
      3 y = wine['quality']
      4 X_train, X_test, y_train, y_test = train_test_split(X, y,test_size=0.2,randowm_state=42)

~\miniconda3\lib\site-packages\pandas\core\frame.py in drop(self, labels, axis, index, columns, level, inplace, errors)
   4303                 weight  1.0     0.8
   4304         """
-> 4305         return super().drop(
   4306             labels=labels,
   4307             axis=axis,

~\miniconda3\lib\site-packages\pandas\core\generic.py in drop(self, labels, axis, index, columns, level, inplace, errors)
   4148         for axis, labels in axes.items():
   4149             if labels is not None:
-> 4150                 obj = obj._drop_axis(labels, axis, level=level, errors=errors)
   4151
   4152         if inplace:

~\miniconda3\lib\site-packages\pandas\core\generic.py in _drop_axis(self, labels, axis, level, errors)
   4183                 new_axis = axis.drop(labels, level=level, errors=errors)
   4184             else:
-> 4185                 new_axis = axis.drop(labels, errors=errors)
   4186             result = self.reindex(**{axis_name: new_axis})
   4187

~\miniconda3\lib\site-packages\pandas\core\indexes\base.py in drop(self, labels, errors)
   5589         if mask.any():
   5590             if errors != "ignore":
-> 5591                 raise KeyError(f"{labels[mask]} not found in axis")
   5592             indexer = indexer[~mask]
   5593         return self.delete(indexer)

KeyError: "['quality'] not found in axis"
The code that I am using is

import numpy as np
import pandas as pd
wine = pd.read_csv('wine.csv')
wine.head()



from sklearn.model_selection  import train_test_split
X= wine.drop('quality', axis = 1)
y = wine['quality']
X_train, X_test, y_train, y_test = train_test_split(X, y,test_size=0.2,randowm_state=42)
The output is

Output:
               # CSV-File created with merge-csv.com
0  # --------------------------------------------...
1  fixed acidity;"volatile acidity";"citric acid"...
2   7.4;0.7;0;1.9;0.076;11;34;0.9978;3.51;0.56;9.4;5
3  7.8;0.88;0;2.6;0.098;25;67;0.9968;3.2;0.68;9.8;5
4      7.8;0.76;0.04;2.3;0.092;15;54;0.997;3.26;0.6
The quality heading is there; it is the last one, furthest to the right. For some reason, reading the csv file does not find it. It is set up to skip the first row because that is the row for the headings.

Since the quality heading is there, why is the csv reader missing it and giving me the error you see?

Respectfully,

LZ
Hi Led_Zeppelin,

Could you upload the wine.csv file you are using?

Then I can try the code out myself and try to get the answer you require.

Best Regards

Eddie Winch
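Try passing sep=';' to pd.read_csv(). By default it splits on commas, so each semicolon-separated line ends up in a single column.
You will also need to skip the first line, or maybe the first two, and maybe specify the column names.
Ideally, provide a sample file.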
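
A minimal sketch of that suggestion, assuming the file really does start with two '#' comment lines followed by a semicolon-separated header, as the output above suggests. The in-memory sample below is a hypothetical stand-in for wine.csv, trimmed to the first three columns plus quality; skiprows=2 is a guess based on the two comment lines shown.

```python
import io

import pandas as pd

# Hypothetical stand-in for wine.csv (not the real file): two '#' comment
# lines, then a semicolon-separated header and two data rows, trimmed to
# four columns for brevity.
sample = io.StringIO(
    '# CSV-File created with merge-csv.com\n'
    '# ----------------------------------\n'
    'fixed acidity;"volatile acidity";"citric acid";quality\n'
    '7.4;0.7;0;5\n'
    '7.8;0.88;0;5\n'
)

# sep=';' splits fields on semicolons instead of the default comma;
# skiprows=2 skips the two comment lines so the third line becomes the header.
wine = pd.read_csv(sample, sep=';', skiprows=2)

# With real column names in place, 'quality' can be dropped and selected.
X = wine.drop('quality', axis=1)
y = wine['quality']

print(wine.columns.tolist())
```

With the real file, the call would presumably be pd.read_csv('wine.csv', sep=';', skiprows=2). Once the KeyError is gone, note that the keyword in the train_test_split call is spelled random_state, not randowm_state.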