Python Forum
ValueError: could not convert string to float: Close??
#1
Honestly, I don't even understand what the issue is here... Could you guys help me out please?

It may be referring to one of my columns in my dataset.csv file named 'Close'

Error message:

File "/home/b/pycharm-community-2017.2.3/helpers/pydev/pydevd.py", line 1599, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/b/pycharm-community-2017.2.3/helpers/pydev/pydevd.py", line 1026, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/b/PycharmProjects/ANN1a/ANN2-Keras1a", line 41, in <module>
    results = cross_val_score(pipeline, X, Y, cv=kfold)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_validation.py", line 342, in cross_val_score
    pre_dispatch=pre_dispatch)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_validation.py", line 206, in cross_validate
    for train, test in cv.split(X, y, groups))
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 779, in __call__
    while self.dispatch_one_batch(iterator):
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 625, in dispatch_one_batch
    self._dispatch(tasks)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 588, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/_parallel_backends.py", line 111, in apply_async
    result = ImmediateResult(func)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/_parallel_backends.py", line 332, in __init__
    self.results = batch()
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 131, in __call__
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_validation.py", line 488, in _fit_and_score
    test_scores = _score(estimator, X_test, y_test, scorer, is_multimetric)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_validation.py", line 523, in _score
    return _multimetric_score(estimator, X_test, y_test, scorer)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_validation.py", line 553, in _multimetric_score
    score = scorer(estimator, X_test, y_test)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/metrics/scorer.py", line 244, in _passthrough_scorer
    return estimator.score(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/metaestimators.py", line 115, in <lambda>
    out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/pipeline.py", line 486, in score
    Xt = transform.transform(Xt)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/preprocessing/data.py", line 681, in transform
    estimator=self, dtype=FLOAT_DTYPES)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 433, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)
ValueError: could not convert string to float: Close
Here is my code in its entirety:

import numpy
import pandas
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline


# load dataset
dataframe = pandas.read_csv("PTNprice.csv", delim_whitespace=True, header=None, usecols=[1,2,3,4])
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:4]
Y = dataset[:,1]

# define the model
def larger_model():
	# create model
	model = Sequential()
	model.add(Dense(100, input_dim=4, kernel_initializer='normal', activation='relu'))
	model.add(Dense(50, kernel_initializer='normal', activation='relu'))
	model.add(Dense(1, kernel_initializer='normal'))
	# Compile model
	model.compile(loss='mean_squared_error', optimizer='adam')
	return model

# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)

# evaluate model with standardized dataset
numpy.random.seed(seed)
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(build_fn=larger_model, epochs=50, batch_size=5, verbose=0)))
pipeline = Pipeline(estimators)
kfold = KFold(n_splits=10, random_state=seed)
results = cross_val_score(pipeline, X, Y, cv=kfold)
print("Standardized: %.2f (%.2f) MSE" % (results.mean(), results.std()))
#2
what is the name of your program?
The error was generated in validation.py, line 433, but what caused it will be the last line mentioned with your program name on it.
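The traceback above ends in the `np.array(array, dtype=dtype)` call inside `check_array`, which suggests that at least one cell in the data is text rather than a number. A minimal sketch of one way to locate such cells before handing the data to scikit-learn; `find_non_numeric` is a hypothetical helper and the sample rows are invented for illustration, not taken from PTNprice.csv:

```python
def find_non_numeric(rows):
    """Return (row_index, col_index, value) for every cell float() rejects."""
    bad = []
    for r, row in enumerate(rows):
        for c, cell in enumerate(row):
            try:
                float(cell)
            except (TypeError, ValueError):
                bad.append((r, c, cell))
    return bad


# Invented sample: a header line that was read in as if it were data.
rows = [
    ["Open", "High", "Low", "Close"],
    ["0.78", "0.80", "0.77", "0.79"],
    ["0.79", "0.82", "0.78", "0.81"],
]

bad_cells = find_non_numeric(rows)
print(bad_cells)
```

If every reported cell sits in row 0, the likely culprit is a header line being treated as data. With a real DataFrame, something like `find_non_numeric(dataframe.values.tolist())` should work the same way.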
#3
(Oct-27-2017, 04:45 PM)Larz60+ Wrote: what is the name of your program?
The error was generated in validation.py, line 433, but what caused it will be the last line mentioned with your program name on it.

Well, I've been making a new file and renaming it each time I make a change to the code, so that if something doesn't work I can always go back to where I was. I keep changing the name: ann1a, ann1b, ann1c, ann1-keras1a, etc. I have it all stored in the same project folder.

I actually think I may have gotten it to work last night! I changed header from header=None to header=1 and it seemed to realize that 'Close' was part of the column headers.

before:
# load dataset
dataframe = pandas.read_csv("PTNprice.csv", delim_whitespace=True, header=None, usecols=[1,2,3,4])
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:4]
Y = dataset[:,1]
after:

# load dataset
dataframe = pandas.read_csv("PTNprice.csv", delim_whitespace=True, header=1, usecols=[1,2,3,4])
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:4]
Y = dataset[:,1]
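For reference, a small sketch of how the `header` argument changes what `read_csv` returns. The sample text is invented and assumes the column names sit on the first line, which may not match the real PTNprice.csv. It is written for Python 3 and a recent pandas, and uses `sep=r"\s+"` as an equivalent of `delim_whitespace=True`:

```python
import io

import pandas as pd

# Invented sample; assumes the column names are on the first line.
SAMPLE = """Date Open High Low Close
2017-10-23 0.78 0.80 0.77 0.79
2017-10-24 0.79 0.82 0.78 0.81
2017-10-25 0.81 0.83 0.80 0.82
"""


def load(header):
    """Read the sample with the given header setting, keeping columns 1 to 4."""
    return pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+", header=header,
                       usecols=[1, 2, 3, 4])


raw = load(None)    # no header row: the names are kept as a row of data
named = load(0)     # line 0 supplies the column names
skipped = load(1)   # line 1 supplies the names, line 0 is discarded

print(raw.values.dtype, raw.shape)
print(named.values.dtype, named.shape)
print(skipped.values.dtype, skipped.shape)
```

With `header=None` the names end up inside `dataframe.values` as strings, which is the kind of input that makes `StandardScaler` fail. Note that `header=1` only gives the right result if the names really are on the second line of the file; if they are on the first line, `header=0` avoids silently dropping a row of prices.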
I'm starting to reach my limits with this code though, since I've only been coding with Python for a few days now. I don't think I'm getting the output to come out correctly, and I don't understand it enough to fix it. I think it's returning a 0.00% accuracy and I don't understand its predictions. I'm trying to get it to crunch 4 columns (Open, High, Low, Close) and then predict the next Close number.

output:
Larger: 0.00 (0.00) MSE
[ 0.78021598  0.79241288  0.81000006 ...,  3.64232779  3.59621549
  3.79605269]
Here's my code in its entirety:

import numpy
import pandas
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline


# load dataset
dataframe = pandas.read_csv("PTNprice.csv", delim_whitespace=True, header=1, usecols=[1,2,3,4])
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:4]
Y = dataset[:,1]

# define the model
def larger_model():
	# create model
	model = Sequential()
	model.add(Dense(100, input_dim=4, kernel_initializer='normal', activation='relu'))
	model.add(Dense(50, kernel_initializer='normal', activation='relu'))
	model.add(Dense(1, kernel_initializer='normal'))
	# Compile model
	model.compile(loss='mean_squared_error', optimizer='sgd')
	return model

# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)

numpy.random.seed(seed)
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasRegressor(build_fn=larger_model, epochs=100, batch_size=5, verbose=0)))
pipeline = Pipeline(estimators)
kfold = KFold(n_splits=10, random_state=seed)
results = cross_val_score(pipeline, X, Y, cv=kfold)
print("Larger: %.2f (%.2f) MSE" % (results.mean(), results.std()))

pipeline.fit(X, Y)
prediction = pipeline.predict(X)
print prediction
quick edit:

Maybe the prediction is supposed to be for Y instead of X, since Y is the output layer?

pipeline.fit(X, Y)
prediction = pipeline.predict(X)
print prediction
pipeline.fit(X, Y)
prediction = pipeline.predict(Y)
print prediction
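On the question of predicting the next Close: in the code as posted, `Y = dataset[:,1]` is one of the four columns already in `X`, so the network is asked to reproduce a value it was given as input. A minimal sketch of one common alternative is to pair each row with the following row's Close. The numbers are invented, and the position of the Close column (`CLOSE = 3`) is an assumption about the column order:

```python
import numpy as np

# Invented Open, High, Low, Close rows, oldest first.
dataset = np.array([
    [0.78, 0.80, 0.77, 0.79],
    [0.79, 0.82, 0.78, 0.81],
    [0.81, 0.83, 0.80, 0.82],
    [0.82, 0.85, 0.81, 0.84],
])

CLOSE = 3  # assumed index of the Close column

X = dataset[:-1, :]      # every row except the last one
Y = dataset[1:, CLOSE]   # the Close value of the row that follows

print(X.shape, Y.shape)
```

The last row has no "next" value, so it is left out of `X`. As for `predict`, it expects rows with the same four feature columns that were passed to `fit`, so `pipeline.predict(X)` is the call to keep; `pipeline.predict(Y)` passes targets where features are expected.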
#4
A better way to keep backups is to keep the same program name.
  • Put all source into a directory named src
  • Create another directory at the same level named backup.
  • Before making major changes, create a new directory in backup with a name similar to src_backup_MMDDYY_time
  • Copy the full src directory into the newly created backup directory
This way you can go back as far as you need to in order to restore to a point.

It will make life a lot easier in the long run.
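The steps above can be sketched with the standard library. `backup_src` is a hypothetical helper, and the `%H%M%S` time format is just one possible reading of "time" in the directory name:

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_src(src="src", backup_root="backup"):
    """Copy the whole src directory into backup/src_backup_MMDDYY_HHMMSS."""
    stamp = datetime.now().strftime("%m%d%y_%H%M%S")
    target = Path(backup_root) / ("src_backup_" + stamp)
    # copytree creates the target (and backup_root, if missing) itself
    # and raises an error if the target already exists.
    shutil.copytree(str(src), str(target))
    return target


# Usage, run from the project folder before a major change:
# backup_src()
```

Restoring is then a matter of copying the chosen `src_backup_...` directory back over `src`.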