Python Forum
error - Printable Version


giving error could not convert string to float - lovedeep - May-19-2018

import sklearn
import pandas as pd
import numpy as np
from sklearn.utils import shuffle
from sklearn.linear_model import LinearRegression


# read the data set
def read_dataset():
    df = pd.read_csv("C:\\Users\\BADSHAH\\PycharmProjects\\edureka1.csv")
    x = df[df.columns[0:4]].values
    y = df[df.columns[4]]
    # to convert categorical data to numerical
    obj_df = df.select_dtypes(include=['object']).copy()
    obj_df["State"].value_counts()
    cleanup_nums = {"State": {'California': 2, 'New York': 1, 'Florida': 3}}
    obj_df.replace(cleanup_nums, inplace=True)

    print(x.shape)
    return(x,y)


# read data set
x,y = read_dataset()

# shuffle the dataset
x,y = shuffle(x,y, random_state=1)

#break data into test and train part
x_train,x_test,y_train,y_test = sklearn.model_selection.train_test_split(x, y, test_size=0.20, random_state=5)

print(x_train.shape)
print(x_test.shape)
print(y_train.shape)
print(y_test.shape)

lm = LinearRegression()

lm.fit(x_train,y_train)

y_train_predict = lm.predict(x_train)
y_test_predict = lm.predict(x_test)
cf=pd.DataFrame(y_test_predict,x_test)
print(cf)



RE: giving error could not convert string to float - j.crater - May-19-2018

The error message should give you enough information to identify the problematic line of code and what to change.
Anyhow, if you'd like our help, post the full error traceback in error tags.


error - lovedeep - May-19-2018

Error:
C:\Users\BADSHAH\PycharmProjects\thesis\venv\Scripts\python.exe C:/Users/BADSHAH/PycharmProjects/thesis/krish
(50, 4)
(40, 4)
(10, 4)
(40,)
(10,)
Traceback (most recent call last):
  File "C:/Users/BADSHAH/PycharmProjects/thesis/krish", line 41, in <module>
    lm.fit(x_train,y_train)
  File "C:\Users\BADSHAH\PycharmProjects\thesis\venv\lib\site-packages\sklearn\linear_model\base.py", line 482, in fit
    y_numeric=True, multi_output=True)
  File "C:\Users\BADSHAH\PycharmProjects\thesis\venv\lib\site-packages\sklearn\utils\validation.py", line 573, in check_X_y
    ensure_min_features, warn_on_dtype, estimator)
  File "C:\Users\BADSHAH\PycharmProjects\thesis\venv\lib\site-packages\sklearn\utils\validation.py", line 433, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)
ValueError: could not convert string to float: 'California'

Process finished with exit code 1



RE: error - j.crater - May-19-2018

ValueError: could not convert string to float: 'California'

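This line pretty much says it all. You are using the string "California" where the code expects a float value. In read_dataset(), x is taken from df before the State column is converted, and the replacement is applied only to the copy obj_df, which is never used afterwards, so the strings are still in x when it reaches lm.fit().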
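
One way to avoid the error is to convert the State column on df itself, before the feature array is sliced out. The following is only a minimal sketch: the small in-memory DataFrame stands in for edureka1.csv, and every column name except "State", as well as all the numbers, are made up for illustration. The state-to-number mapping is the one from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for edureka1.csv: three numeric columns, a text
# "State" column, and a numeric target. Only "State" and its three values
# come from the original post; everything else is invented for the example.
df = pd.DataFrame({
    "feature_a": [1.0, 2.0, 3.0, 4.0],
    "feature_b": [10.0, 20.0, 30.0, 40.0],
    "feature_c": [0.5, 0.1, 0.9, 0.3],
    "State": ["California", "New York", "Florida", "California"],
    "target": [100.0, 200.0, 300.0, 400.0],
})

# Same mapping as cleanup_nums["State"] in the original post.
state_codes = {"California": 2, "New York": 1, "Florida": 3}

# Convert the column on df itself, not on a separate copy.
df["State"] = df["State"].map(state_codes)

# map() produces NaN for any value missing from the mapping; fail early.
if df["State"].isna().any():
    raise ValueError("unmapped State value found")

# Slice the features only after the conversion, so no strings are left.
x = df[df.columns[0:4]].values.astype(float)
y = df[df.columns[4]].values

print(x.dtype, x.shape)
```

With x built this way, the rest of the original script (shuffle, train_test_split, lm.fit) receives a purely numeric array. Note that the codes 1, 2, 3 impose an arbitrary order on the states; one-hot encoding, for example with pd.get_dummies(df, columns=["State"]), is a common alternative for a linear model.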