Python Forum
Running Standard Scaler in Python 3
#1
I am trying to get the following code to work:

df.head()

df2 = df.drop(["Unnamed: 0", "timestamp"], axis=1, inplace=True)

df3=pd.DataFrame(df2)

type(df2)

df3.head()
At this point it fails when I try to put df3 through a StandardScaler. Here is the error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [91], in <cell line: 2>()
      1 scaler=StandardScaler()
----> 2 df4=scaler.fit_transform(df3)

File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\sklearn\base.py:867, in TransformerMixin.fit_transform(self, X, y, **fit_params)
    863 # non-optimized default implementation; override when a better
    864 # method is possible for a given clustering algorithm
    865 if y is None:
    866     # fit method of arity 1 (unsupervised transformation)
--> 867     return self.fit(X, **fit_params).transform(X)
    868 else:
    869     # fit method of arity 2 (supervised transformation)
    870     return self.fit(X, y, **fit_params).transform(X)

File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\sklearn\preprocessing\_data.py:809, in StandardScaler.fit(self, X, y, sample_weight)
    807 # Reset internal state before fitting
    808 self._reset()
--> 809 return self.partial_fit(X, y, sample_weight)

File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\sklearn\preprocessing\_data.py:844, in StandardScaler.partial_fit(self, X, y, sample_weight)
    812 """Online computation of mean and std on X for later scaling.
    813 
    814 All of X is processed as a single batch. This is intended for cases
   (...)
    841     Fitted scaler.
    842 """
    843 first_call = not hasattr(self, "n_samples_seen_")
--> 844 X = self._validate_data(
    845     X,
    846     accept_sparse=("csr", "csc"),
    847     dtype=FLOAT_DTYPES,
    848     force_all_finite="allow-nan",
    849     reset=first_call,
    850 )
    851 n_features = X.shape[1]
    853 if sample_weight is not None:

File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\sklearn\base.py:577, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, **check_params)
    575     raise ValueError("Validation should be done on X, y or both.")
    576 elif not no_val_X and no_val_y:
--> 577     X = check_array(X, input_name="X", **check_params)
    578     out = X
    579 elif no_val_X and not no_val_y:

File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\sklearn\utils\validation.py:768, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator, input_name)
    764     pandas_requires_conversion = any(
    765         _pandas_dtype_needs_early_conversion(i) for i in dtypes_orig
    766     )
    767     if all(isinstance(dtype_iter, np.dtype) for dtype_iter in dtypes_orig):
--> 768         dtype_orig = np.result_type(*dtypes_orig)
    770 if dtype_numeric:
    771     if dtype_orig is not None and dtype_orig.kind == "O":
    772         # if input is object, convert to float.

File <__array_function__ internals>:180, in result_type(*args, **kwargs)

ValueError: at least one array or dtype is required

df.index
I am not sure what this error is telling me.

I just want to get df3 into a form that StandardScaler can use. I think it only accepts a DataFrame, so I convert df2 to a DataFrame in the last step before passing it to StandardScaler. Then I get the error. What am I doing wrong here? It seems okay to me.

Any help appreciated.

Respectfully,

LZ
#2
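It is telling you that the scaler was handed a DataFrame with no columns, so there are no dtypes to work with. That is what this produces:
df3=pd.DataFrame(None)
Which is what you are doing because df2 is None. df2 is None because this call returns None: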
df2 = df.drop(["Unnamed: 0", "timestamp"], axis=1, inplace=True)
From the pandas documentation
https://pandas.pydata.org/pandas-docs/st....drop.html
Quote: inplace : bool, default False
If False, return a copy. Otherwise, do operation inplace and return None.
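To see that behavior directly, here is a minimal sketch. The column names "Unnamed: 0" and "timestamp" come from the question; the "sensor_00" column and all values are made up for illustration.

```python
import pandas as pd

# Toy frame: column names from the question, values and "sensor_00" are made up.
df = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2],
    "timestamp": ["t0", "t1", "t2"],
    "sensor_00": [1.0, 2.0, 3.0],
})

# With inplace=True the columns are dropped from df itself and the call returns None.
result = df.drop(["Unnamed: 0", "timestamp"], axis=1, inplace=True)

print(result)            # None
print(list(df.columns))  # ['sensor_00']

# Wrapping None in a DataFrame gives an empty frame with no rows and no columns.
empty = pd.DataFrame(result)
print(empty.shape)       # (0, 0)
```

The empty frame is what ends up in fit_transform in the question, which is why the traceback complains that at least one array or dtype is required.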
So either you can do this:
df2 = df.drop(["Unnamed: 0", "timestamp"], axis=1)
Or you can do this:
df.drop(["Unnamed: 0", "timestamp"], axis=1, inplace=True)
Assigning the result and passing inplace=True should never be used together, because with inplace=True the call returns None.
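Putting the first option together with the scaler, a rough end-to-end sketch could look like the following. The sensor columns and their values are invented for illustration, and the names features and df4 are just placeholders; only the two dropped column names come from the question.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy frame: "sensor_00"/"sensor_01" and all values are made up.
df = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2],
    "timestamp": ["t0", "t1", "t2"],
    "sensor_00": [1.0, 2.0, 3.0],
    "sensor_01": [10.0, 20.0, 30.0],
})

# No inplace=True here, so drop() returns a new DataFrame that can be assigned.
features = df.drop(["Unnamed: 0", "timestamp"], axis=1)

scaler = StandardScaler()
scaled = scaler.fit_transform(features)  # returns a NumPy array by default

# Optional: put the scaled values back into a DataFrame with the original labels.
df4 = pd.DataFrame(scaled, columns=features.columns, index=features.index)
print(df4)
```

Note that fit_transform accepts any 2D array-like input, so the extra pd.DataFrame(df2) step from the question is not needed; the result of drop() can be passed in directly.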