Oct-27-2017, 12:09 PM
(This post was last modified: Oct-27-2017, 12:09 PM by BlackHeart.)
Honestly, I don't even understand what the issue is here... Could you guys help me out please?
It may be referring to one of the columns in my dataset.csv file, which is named 'Close'.
Error message:
File "/home/b/pycharm-community-2017.2.3/helpers/pydev/pydevd.py", line 1599, in <module> globals = debugger.run(setup['file'], None, None, is_module) File "/home/b/pycharm-community-2017.2.3/helpers/pydev/pydevd.py", line 1026, in run pydev_imports.execfile(file, globals, locals) # execute the script File "/home/b/PycharmProjects/ANN1a/ANN2-Keras1a", line 41, in <module> results = cross_val_score(pipeline, X, Y, cv=kfold) File "/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_validation.py", line 342, in cross_val_score pre_dispatch=pre_dispatch) File "/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_validation.py", line 206, in cross_validate for train, test in cv.split(X, y, groups)) File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 779, in __call__ while self.dispatch_one_batch(iterator): File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 625, in dispatch_one_batch self._dispatch(tasks) File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 588, in _dispatch job = self._backend.apply_async(batch, callback=cb) File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/_parallel_backends.py", line 111, in apply_async result = ImmediateResult(func) File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/_parallel_backends.py", line 332, in __init__ self.results = batch() File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 131, in __call__ return [func(*args, **kwargs) for func, args, kwargs in self.items] File "/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_validation.py", line 488, in _fit_and_score test_scores = _score(estimator, X_test, y_test, scorer, is_multimetric) File "/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_validation.py", line 523, in _score return _multimetric_score(estimator, X_test, y_test, scorer) File 
"/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_validation.py", line 553, in _multimetric_score score = scorer(estimator, X_test, y_test) File "/usr/local/lib/python2.7/dist-packages/sklearn/metrics/scorer.py", line 244, in _passthrough_scorer return estimator.score(*args, **kwargs) File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/metaestimators.py", line 115, in <lambda> out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs) File "/usr/local/lib/python2.7/dist-packages/sklearn/pipeline.py", line 486, in score Xt = transform.transform(Xt) File "/usr/local/lib/python2.7/dist-packages/sklearn/preprocessing/data.py", line 681, in transform estimator=self, dtype=FLOAT_DTYPES) File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 433, in check_array array = np.array(array, dtype=dtype, order=order, copy=copy) ValueError: could not convert string to float: Close

Here is my code in its entirety:
import numpy import pandas from keras.models import Sequential from keras.layers import Dense from keras.wrappers.scikit_learn import KerasRegressor from sklearn.model_selection import cross_val_score from sklearn.model_selection import KFold from sklearn.preprocessing import StandardScaler from sklearn.pipeline import Pipeline # load dataset dataframe = pandas.read_csv("PTNprice.csv", delim_whitespace=True, header=None, usecols=[1,2,3,4]) dataset = dataframe.values # split into input (X) and output (Y) variables X = dataset[:,0:4] Y = dataset[:,1] # define the model def larger_model(): # create model model = Sequential() model.add(Dense(100, input_dim=4, kernel_initializer='normal', activation='relu')) model.add(Dense(50, kernel_initializer='normal', activation='relu')) model.add(Dense(1, kernel_initializer='normal')) # Compile model model.compile(loss='mean_squared_error', optimizer='adam') return model # fix random seed for reproducibility seed = 7 numpy.random.seed(seed) # evaluate model with standardized dataset numpy.random.seed(seed) estimators = [] estimators.append(('standardize', StandardScaler())) estimators.append(('mlp', KerasRegressor(build_fn=larger_model, epochs=50, batch_size=5, verbose=0))) pipeline = Pipeline(estimators) kfold = KFold(n_splits=10, random_state=seed) results = cross_val_score(pipeline, X, Y, cv=kfold) print("Standardized: %.2f (%.2f) MSE" % (results.mean(), results.std()))