ValueError: could not convert string to float: Close??
Python Forum (https://python-forum.io), Forum: Data Science (https://python-forum.io/forum-44.html), Thread: /thread-5900.html
ValueError: could not convert string to float: Close?? - BlackHeart - Oct-27-2017

Honestly, I don't even understand what the issue is here... Could you guys help me out, please? It may be referring to one of the columns in my dataset.csv file, named 'Close'.

Error message:

    File "/home/b/pycharm-community-2017.2.3/helpers/pydev/pydevd.py", line 1599, in <module>
      globals = debugger.run(setup['file'], None, None, is_module)
    File "/home/b/pycharm-community-2017.2.3/helpers/pydev/pydevd.py", line 1026, in run
      pydev_imports.execfile(file, globals, locals)  # execute the script
    File "/home/b/PycharmProjects/ANN1a/ANN2-Keras1a", line 41, in <module>
      results = cross_val_score(pipeline, X, Y, cv=kfold)
    File "/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_validation.py", line 342, in cross_val_score
      pre_dispatch=pre_dispatch)
    File "/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_validation.py", line 206, in cross_validate
      for train, test in cv.split(X, y, groups))
    File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 779, in __call__
      while self.dispatch_one_batch(iterator):
    File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 625, in dispatch_one_batch
      self._dispatch(tasks)
    File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 588, in _dispatch
      job = self._backend.apply_async(batch, callback=cb)
    File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/_parallel_backends.py", line 111, in apply_async
      result = ImmediateResult(func)
    File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/_parallel_backends.py", line 332, in __init__
      self.results = batch()
    File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 131, in __call__
      return [func(*args, **kwargs) for func, args, kwargs in self.items]
    File "/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_validation.py", line 488, in _fit_and_score
      test_scores = _score(estimator, X_test, y_test, scorer, is_multimetric)
    File "/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_validation.py", line 523, in _score
      return _multimetric_score(estimator, X_test, y_test, scorer)
    File "/usr/local/lib/python2.7/dist-packages/sklearn/model_selection/_validation.py", line 553, in _multimetric_score
      score = scorer(estimator, X_test, y_test)
    File "/usr/local/lib/python2.7/dist-packages/sklearn/metrics/scorer.py", line 244, in _passthrough_scorer
      return estimator.score(*args, **kwargs)
    File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/metaestimators.py", line 115, in <lambda>
      out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
    File "/usr/local/lib/python2.7/dist-packages/sklearn/pipeline.py", line 486, in score
      Xt = transform.transform(Xt)
    File "/usr/local/lib/python2.7/dist-packages/sklearn/preprocessing/data.py", line 681, in transform
      estimator=self, dtype=FLOAT_DTYPES)
    File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 433, in check_array
      array = np.array(array, dtype=dtype, order=order, copy=copy)
    ValueError: could not convert string to float: Close

Here is my code in its entirety:

    import numpy
    import pandas
    from keras.models import Sequential
    from keras.layers import Dense
    from keras.wrappers.scikit_learn import KerasRegressor
    from sklearn.model_selection import cross_val_score
    from sklearn.model_selection import KFold
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import Pipeline
    # load dataset
    dataframe = pandas.read_csv("PTNprice.csv", delim_whitespace=True, header=None, usecols=[1,2,3,4])
    dataset = dataframe.values
    # split into input (X) and output (Y) variables
    X = dataset[:,0:4]
    Y = dataset[:,1]
    # define the model
    def larger_model():
        # create model
        model = Sequential()
        model.add(Dense(100, input_dim=4, kernel_initializer='normal', activation='relu'))
        model.add(Dense(50, kernel_initializer='normal', activation='relu'))
        model.add(Dense(1, kernel_initializer='normal'))
        # Compile model
        model.compile(loss='mean_squared_error', optimizer='adam')
        return model
    # fix random seed for reproducibility
    seed = 7
    numpy.random.seed(seed)
    # evaluate model with standardized dataset
    numpy.random.seed(seed)
    estimators = []
    estimators.append(('standardize', StandardScaler()))
    estimators.append(('mlp', KerasRegressor(build_fn=larger_model, epochs=50, batch_size=5, verbose=0)))
    pipeline = Pipeline(estimators)
    kfold = KFold(n_splits=10, random_state=seed)
    results = cross_val_score(pipeline, X, Y, cv=kfold)
    print("Standardized: %.2f (%.2f) MSE" % (results.mean(), results.std()))

RE: ValueError: could not convert string to float: Close?? - Larz60+ - Oct-27-2017

What is the name of your program? The error was raised in validation.py, line 433, but what caused it will be the last line in the traceback that mentions your program's name.

RE: ValueError: could not convert string to float: Close?? - BlackHeart - Oct-27-2017

(Oct-27-2017, 04:45 PM)Larz60+ Wrote: what is the name of your program?

Well, I've been making a new file and renaming it each time I make a change to the code, so that if something doesn't work I can always go back to where I was. I keep changing the name: ann1a, ann1b, ann1c, ann1-keras1a, etc. I have it all stored in the same project folder.

I actually think I may have gotten it to work last night! I changed header=None to header=1, and it seemed to realize that 'Close' was a part of the column headers.
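For anyone who lands here with the same error, here is a minimal sketch of what the header argument changes. The sample text below is a made-up stand-in for PTNprice.csv (the real file is not shown in this thread), and it assumes the column names sit on the very first line, so it uses header=0. If a file has an extra line above the names, header=1 would be the matching choice. The sketch also uses sep=r"\s+" in place of delim_whitespace=True, since recent pandas releases deprecate the latter.

```python
import io
import pandas as pd

# Hypothetical stand-in for PTNprice.csv; the real file layout is an assumption.
SAMPLE = """Date Open High Low Close
2017-10-23 0.78 0.80 0.77 0.79
2017-10-24 0.79 0.82 0.78 0.81
2017-10-25 0.81 0.83 0.80 0.82
"""

def load(header):
    # Read only the four price columns, as in the original script.
    return pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+", header=header,
                       usecols=[1, 2, 3, 4])

raw = load(header=None)   # the line of names is treated as the first data row
parsed = load(header=0)   # the line of names becomes the column labels

print(raw.dtypes)      # non-numeric columns, because each one holds a name string
print(parsed.dtypes)   # numeric columns

# Converting the header=None frame to floats fails the same way the scaler did.
conversion_failed = False
try:
    raw.values.astype(float)
except ValueError as exc:
    conversion_failed = True
    print("ValueError:", exc)
```

Checking dataframe.dtypes or dataframe.head() right after read_csv is a quick way to spot this before the data reaches StandardScaler.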
Before:

    # load dataset
    dataframe = pandas.read_csv("PTNprice.csv", delim_whitespace=True, header=None, usecols=[1,2,3,4])
    dataset = dataframe.values
    # split into input (X) and output (Y) variables
    X = dataset[:,0:4]
    Y = dataset[:,1]

After:

    # load dataset
    dataframe = pandas.read_csv("PTNprice.csv", delim_whitespace=True, header=1, usecols=[1,2,3,4])
    dataset = dataframe.values
    # split into input (X) and output (Y) variables
    X = dataset[:,0:4]
    Y = dataset[:,1]

I'm starting to reach my limits with this code, though, since I've only been coding in Python for a few days now. I don't think the output is coming out correctly, and I don't understand it well enough to fix it. I think it's returning a 0.00% accuracy, and I don't understand its predictions. I'm trying to get it to crunch 4 columns (Open, High, Low, Close) and then predict the next Close number.

Output:

    Larger: 0.00 (0.00) MSE
    [ 0.78021598 0.79241288 0.81000006 ..., 3.64232779 3.59621549 3.79605269]

Here's my code in its entirety:

    import numpy
    import pandas
    from keras.models import Sequential
    from keras.layers import Dense
    from keras.wrappers.scikit_learn import KerasRegressor
    from sklearn.model_selection import cross_val_score
    from sklearn.model_selection import KFold
    from sklearn.preprocessing import StandardScaler
    from sklearn.pipeline import Pipeline
    # load dataset
    dataframe = pandas.read_csv("PTNprice.csv", delim_whitespace=True, header=1, usecols=[1,2,3,4])
    dataset = dataframe.values
    # split into input (X) and output (Y) variables
    X = dataset[:,0:4]
    Y = dataset[:,1]
    # define the model
    def larger_model():
        # create model
        model = Sequential()
        model.add(Dense(100, input_dim=4, kernel_initializer='normal', activation='relu'))
        model.add(Dense(50, kernel_initializer='normal', activation='relu'))
        model.add(Dense(1, kernel_initializer='normal'))
        # Compile model
        model.compile(loss='mean_squared_error', optimizer='sgd')
        return model
    # fix random seed for reproducibility
    seed = 7
    numpy.random.seed(seed)
    numpy.random.seed(seed)
    estimators = []
    estimators.append(('standardize', StandardScaler()))
    estimators.append(('mlp', KerasRegressor(build_fn=larger_model, epochs=100, batch_size=5, verbose=0)))
    pipeline = Pipeline(estimators)
    kfold = KFold(n_splits=10, random_state=seed)
    results = cross_val_score(pipeline, X, Y, cv=kfold)
    print("Larger: %.2f (%.2f) MSE" % (results.mean(), results.std()))
    pipeline.fit(X, Y)
    prediction = pipeline.predict(X)
    print prediction

Quick edit: Maybe the prediction is supposed to be for Y instead of X, right, because Y is the output layer?

Current:

    pipeline.fit(X, Y)
    prediction = pipeline.predict(X)
    print prediction

Alternative:

    pipeline.fit(X, Y)
    prediction = pipeline.predict(Y)
    print prediction

RE: ValueError: could not convert string to float: Close?? - Larz60+ - Oct-27-2017

A better way to keep backups is to keep the same program name.
It will make life a lot easier in the long run.
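On the "predict the next Close" question, two things in the posted script may be worth a look. Y = dataset[:,1] picks a column that is also inside X = dataset[:,0:4], so the model is asked to reproduce one of its own inputs from the same row. Also, predict() expects rows of input features, so it should be given X-shaped data, not Y. Below is a rough sketch of one way to line up "today's four prices" with "tomorrow's Close". The price rows are invented, the column order Open, High, Low, Close is an assumption, and a plain NumPy least-squares fit stands in for the Keras pipeline purely to keep the example small.

```python
import numpy as np

# Invented Open, High, Low, Close rows; the real PTNprice.csv is not shown in the thread.
prices = np.array([
    [0.78, 0.80, 0.77, 0.79],
    [0.79, 0.82, 0.78, 0.81],
    [0.81, 0.83, 0.80, 0.82],
    [0.82, 0.85, 0.81, 0.84],
    [0.84, 0.86, 0.83, 0.85],
])

CLOSE = 3  # assumed position of the Close column

# Features: every day except the last. Target: the Close of the following day.
X = prices[:-1, :]
y = prices[1:, CLOSE]

# Stand-in model (not the Keras network): least squares with an intercept column.
A = np.hstack([X, np.ones((X.shape[0], 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Predict from feature rows: here, the most recent day's four prices.
latest = np.append(prices[-1], 1.0)
next_close = float(latest @ coef)

print(X.shape, y.shape)
print(next_close)
```

The same X and y arrays could be passed to pipeline.fit(X, y), with pipeline.predict() then called on the latest feature row reshaped to (1, 4).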