Running Standard Scaler in Python 3
Python Forum (https://python-forum.io), General Coding Help, Thread: /thread-38116.html
Running Standard Scaler in Python 3 - Led_Zeppelin - Sep-05-2022

I am trying to get the following code to work:

    df.head()
    df2 = df.drop(["Unnamed: 0", "timestamp"], axis=1, inplace=True)
    df3=pd.DataFrame(df2)
    type(df2)
    df3.head()

At this time, it fails when I try to put df3 through a standard scaler. I will show the code:

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    Input In [91], in <cell line: 2>()
          1 scaler=StandardScaler()
    ----> 2 df4=scaler.fit_transform(df3)

    File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\sklearn\base.py:867, in TransformerMixin.fit_transform(self, X, y, **fit_params)
        863 # non-optimized default implementation; override when a better
        864 # method is possible for a given clustering algorithm
        865 if y is None:
        866     # fit method of arity 1 (unsupervised transformation)
    --> 867     return self.fit(X, **fit_params).transform(X)
        868 else:
        869     # fit method of arity 2 (supervised transformation)
        870     return self.fit(X, y, **fit_params).transform(X)

    File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\sklearn\preprocessing\_data.py:809, in StandardScaler.fit(self, X, y, sample_weight)
        807 # Reset internal state before fitting
        808 self._reset()
    --> 809 return self.partial_fit(X, y, sample_weight)

    File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\sklearn\preprocessing\_data.py:844, in StandardScaler.partial_fit(self, X, y, sample_weight)
        812 """Online computation of mean and std on X for later scaling.
        813
        814 All of X is processed as a single batch. This is intended for cases
       (...)
        841     Fitted scaler.
        842 """
        843 first_call = not hasattr(self, "n_samples_seen_")
    --> 844 X = self._validate_data(
        845     X,
        846     accept_sparse=("csr", "csc"),
        847     dtype=FLOAT_DTYPES,
        848     force_all_finite="allow-nan",
        849     reset=first_call,
        850 )
        851 n_features = X.shape[1]
        853 if sample_weight is not None:

    File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\sklearn\base.py:577, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, **check_params)
        575     raise ValueError("Validation should be done on X, y or both.")
        576 elif not no_val_X and no_val_y:
    --> 577     X = check_array(X, input_name="X", **check_params)
        578     out = X
        579 elif no_val_X and not no_val_y:

    File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\sklearn\utils\validation.py:768, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator, input_name)
        764 pandas_requires_conversion = any(
        765     _pandas_dtype_needs_early_conversion(i) for i in dtypes_orig
        766 )
        767 if all(isinstance(dtype_iter, np.dtype) for dtype_iter in dtypes_orig):
    --> 768     dtype_orig = np.result_type(*dtypes_orig)
        770 if dtype_numeric:
        771     if dtype_orig is not None and dtype_orig.kind == "O":
        772         # if input is object, convert to float.

    File <__array_function__ internals>:180, in result_type(*args, **kwargs)

    ValueError: at least one array or dtype is required

    df.index

I am not sure what this error is talking about. I just want to get df3 into a form that StandardScaler can use. I think it can only accept a DataFrame, so I convert it to a DataFrame in the last step before sending it to StandardScaler. Then I get the error.

What am I doing wrong here? It seems okay.

Any help appreciated.

Respectfully,
LZ

RE: Running Standard Scaler in Python 3 - deanhystad - Sep-05-2022

It is telling you that the scaler was handed a DataFrame with no columns, which is what this produces:

    df3=pd.DataFrame(None)

That is what you are doing because df2 is None.
df2 is None because this line assigns the return value of an in-place drop:

    df2 = df.drop(["Unnamed: 0", "timestamp"], axis=1, inplace=True)

From the pandas documentation https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html

Quote: inplace: bool, default False

With inplace=True, drop modifies df itself and returns None. So either you can do this:

    df2 = df.drop(["Unnamed: 0", "timestamp"], axis=1)

Or you can do this:

    df.drop(["Unnamed: 0", "timestamp"], axis=1, inplace=True)

Assigning the result and using "inplace=True" should never be combined.
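For anyone who wants to reproduce this, here is a minimal sketch of the pitfall and the fix. The data is a made-up stand-in: only the two dropped column names come from the thread, while `sensor_a`, `sensor_b`, and their values are assumptions for illustration. Wrapping the scaled array back into a DataFrame is optional and just one way to keep the column names.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the poster's data. Only "Unnamed: 0" and
# "timestamp" come from the thread; the sensor columns are invented.
df = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, 3],
    "timestamp": pd.date_range("2022-09-05", periods=4, freq="D"),
    "sensor_a": [1.0, 2.0, 3.0, 4.0],
    "sensor_b": [10.0, 20.0, 30.0, 40.0],
})

# The pitfall: with inplace=True, drop() changes the DataFrame it is
# called on and returns None. (A copy is used so df stays intact.)
returned = df.copy().drop(["Unnamed: 0", "timestamp"], axis=1, inplace=True)
print(returned)  # None

# pd.DataFrame(None) is allowed, but the result has no rows or columns,
# so the scaler has nothing to work with.
empty = pd.DataFrame(returned)
print(empty.shape)  # (0, 0)

# The fix: leave out inplace and keep the returned DataFrame.
df2 = df.drop(["Unnamed: 0", "timestamp"], axis=1)

# fit_transform returns a NumPy array by default, so rebuild a DataFrame
# if the column names and index should be kept.
scaler = StandardScaler()
scaled = scaler.fit_transform(df2)
df4 = pd.DataFrame(scaled, columns=df2.columns, index=df2.index)
print(df4.head())
```

Note that df2 is already a DataFrame, so the extra pd.DataFrame(df2) step from the original post is not needed before scaling.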