Python Forum

Full Version: How to implement OneHotEncoder for Multiple Categorical Columns?
Hi, I am trying to prepare the Car Evaluation dataset from the UCI repository for a KNN algorithm, so I first need to convert the categorical data into numerical values. I know how to convert one column, but I am having difficulty converting multiple columns. My code snippet is below. (I am very new to Python, so it may look messy. The results are not what I expected: not all the columns were encoded correctly, and I am not sure whether I am doing this the right way.)

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

#importing the dataset
attributes = ['buying', 'maint', 'doors', 'persons', 'lug_boot', 'safety']
target = ['acceptability']
dataset = pd.read_csv('car.data',names = attributes+target)
X = dataset.iloc[:,:-1].values
y = dataset.iloc[:,6].values

#handling categorical data: label-encode each feature column in turn
labelencoder_X = LabelEncoder()
for col in range(X.shape[1]):
    X[:,col]=labelencoder_X.fit_transform(X[:,col])

#perform dummy encoding to put the features into a standardized format
onehotencoder = OneHotEncoder(categorical_features=[0])
X = onehotencoder.fit_transform(X).toarray()
onehotencoder = OneHotEncoder(categorical_features=[1])
X = onehotencoder.fit_transform(X).toarray()
onehotencoder = OneHotEncoder(categorical_features=[2])
X = onehotencoder.fit_transform(X).toarray()
onehotencoder = OneHotEncoder(categorical_features=[3])
X = onehotencoder.fit_transform(X).toarray()
onehotencoder = OneHotEncoder(categorical_features=[4])
X = onehotencoder.fit_transform(X).toarray()
onehotencoder = OneHotEncoder(categorical_features=[5])
X = onehotencoder.fit_transform(X).toarray()
Any help on this will be much appreciated.
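
A possible explanation and a minimal sketch of one alternative follow. Two things in the snippet above probably cause the unexpected result. First, the `categorical_features` argument was deprecated in scikit-learn 0.20 and removed in 0.22, so on a current install those calls raise an error. Second, on the older versions that did accept it, the encoded columns were placed to the left of the remaining ones, so after the first `fit_transform` the original column 1 is no longer at index 1 and the later calls end up encoding dummy columns instead. Since scikit-learn 0.20, `OneHotEncoder` accepts string columns directly and encodes every column it is given, so the `LabelEncoder` step and the repeated calls are not needed. In the sketch below, the four rows are made-up stand-ins for `car.data` (an assumption, so that the example runs without the file); the column names are taken from the post.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

attributes = ['buying', 'maint', 'doors', 'persons', 'lug_boot', 'safety']
target = ['acceptability']

# Hypothetical sample rows standing in for car.data; with the real file,
# use pd.read_csv('car.data', names=attributes + target) instead.
rows = [
    ['vhigh', 'vhigh', '2',     '2',    'small', 'low',  'unacc'],
    ['high',  'med',   '4',     '4',    'big',   'high', 'acc'],
    ['low',   'low',   '5more', 'more', 'med',   'med',  'good'],
    ['med',   'high',  '3',     '4',    'small', 'high', 'unacc'],
]
dataset = pd.DataFrame(rows, columns=attributes + target)

X_raw = dataset[attributes]            # all six categorical feature columns
y = dataset['acceptability'].values    # target labels, left as strings

# One encoder handles every column in a single pass.
# handle_unknown='ignore' encodes categories unseen during fit as all zeros.
encoder = OneHotEncoder(handle_unknown='ignore')
X = encoder.fit_transform(X_raw).toarray()  # fit_transform returns a sparse matrix by default

# encoder.categories_ lists the categories found in each input column
n_categories = [len(c) for c in encoder.categories_]
print(X.shape)
print(n_categories)
```

The resulting `X` has one 0/1 column per category of each feature and can be passed to a classifier such as `KNeighborsClassifier`. The string labels in `y` do not need encoding for that, because scikit-learn classifiers accept string targets.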