cannot get code to work

Led_Zeppelin · (This post was last modified: Jun-30-2022, 01:24 PM by Led_Zeppelin.)

I have spent a day trying to get the following code to work.

        
df2 = pd.DataFrame(df, columns = ['sensor_00', 'sensor_01', 'sensor_02', 'sensor_03', 'sensor_04', 'sensor_05', 'sensor_06', \
'sensor_07', 'sensor_08', 'sensor_09', 'sensor_10', 'sensor_11', 'sensor_12', 'sensor_13', 'sensor_14', 'sensor_15', \
'sensor_16', 'sensor_17', 'sensor_18', 'sensor_19', 'sensor_20', 'sensor_21', 'sensor_22', 'sensor_23', 'sensor_24', \
'sensor_25', 'sensor_26', 'sensor_27', 'sensor_28', 'sensor_29', 'sensor_30', 'sensor_31', 'sensor_32', 'sensor_33', \
'sensor_34', 'sensor_35', 'sensor_36', 'sensor_37', 'sensor_38', 'sensor_39', 'sensor_40', 'sensor_41', 'sensor_42', \                                  
'sensor_43', 'sensor_44', 'sensor_45', 'sensor_46', 'sensor_47', 'sensor_48', 'sensor_49', 'sensor_50', 'sensor_51'])

It fails with a message that there is an error in line 5. The error reads:

Error:
Input In [13]
    'sensor_34', 'sensor_35', 'sensor_36', 'sensor_37', 'sensor_38', 'sensor_39', 'sensor_40', 'sensor_41', 'sensor_42', \
                                                                                                                                                            
^
SyntaxError: unexpected character after line continuation character

Now, I cannot see anything after the line continuation character, but the error says there is.

What is going on here?

I just want the dataframe to look like this.

        
              Unnamed: 0  sensor_00   sensor_01   sensor_02   sensor_03   sensor_04   sensor_05   sensor_06   sensor_07   sensor_08   ... sensor_42   sensor_43   sensor_44   sensor_45   sensor_46   sensor_47   sensor_48   sensor_49   sensor_50   sensor_51
0   0   2.465394    47.09201    53.2118 46.310760   634.3750    76.45975    13.41146    16.13136    15.56713    ... 31.770832   41.92708    39.641200   65.68287    50.92593    38.194440   157.9861    67.70834    243.0556    201.3889
1   1   2.465394    47.09201    53.2118 46.310760   634.3750    76.45975    13.41146    16.13136    15.56713    ... 31.770832   41.92708    39.641200   65.68287    50.92593    38.194440   157.9861    67.70834    243.0556    201.3889
2   2   2.444734    47.35243    53.2118 46.397570   638.8889    73.54598    13.32465    16.03733    15.61777    ... 31.770830   41.66666    39.351852   65.39352    51.21528    38.194443   155.9606    67.12963    241.3194    203.7037
3   3   2.460474    47.09201    53.1684 46.397568   628.1250    76.98898    13.31742    16.24711    15.69734    ... 31.510420   40.88541    39.062500   64.81481    51.21528    38.194440   155.9606    66.84028    240.4514    203.1250
4   4   2.445718    47.13541    53.2118 46.397568   636.4583    76.58897    13.35359    16.21094    15.69734    ... 31.510420   41.40625    38.773150   65.10416    51.79398    38.773150   158.2755    66.55093    242.1875    201.3889

with the headings in place and of course the numerical columns normalized and scaled.
But I keep getting this error.

What is wrong?

Any help appreciated.

Respectfully,

LZ

**buran** · (This post was last modified: Jun-30-2022, 01:47 PM by buran.)

Try deleting everything between the \ and the first character on the next line; maybe there is some ghost non-printable character. That is, join the two lines, then add the newline again.

Even better, just create the column names dynamically:

        
columns = [f'sensor_{idx:02d}' for idx in range(52)]
df2 = pd.DataFrame(df, columns=columns)

By the way, why create a second DataFrame from what looks like a DataFrame (df) itself? Just to set the column names?

cubangt · Jun-30-2022, 01:53 PM

Looks like there are a lot of spaces after that line:

        
              'sensor_34', 'sensor_35', 'sensor_36', 'sensor_37', 'sensor_38', 'sensor_39', 'sensor_40', 'sensor_41', 'sensor_42', \                                  
'sensor_43', 'sensor_44', 'sensor_45', 'sensor_46', 'sensor_47', 'sensor_48', 'sensor_49', 'sensor_50', 'sensor_51'])

After the 'sensor_42', \ there are a lot of spaces.
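
A space still counts as a character for the Python tokenizer: the backslash has to be the very last character on its line. Here is a minimal sketch of that rule, using the built-in `compile()` on three short strings (the names `with_trailing_space`, `clean`, `no_backslash` and `compiles` are made up for this illustration). It also shows that inside brackets no backslash is needed at all.

```python
# The backslash must be immediately followed by the newline.
with_trailing_space = "x = [1, \\ \n2]"   # backslash, then a space, then newline
clean = "x = [1, \\\n2]"                  # backslash directly before the newline
no_backslash = "x = [1,\n2]"              # brackets already allow line breaks


def compiles(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError as exc:
        print("SyntaxError:", exc.msg)
        return False


print(compiles(with_trailing_space))
print(compiles(clean))
print(compiles(no_backslash))
```

Since the column list is already inside `[...]`, dropping the backslashes altogether would sidestep the problem.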

Led_Zeppelin · Jun-30-2022, 02:38 PM

There are a lot of spaces after the line continuation character. But that is just it, they are spaces and not characters.

I created a new dataframe for the sole purpose of preserving those deleted columns and their content and their position in the dataframe.

Somehow, and I am not sure how, I plan to copy them from the complete dataframe to the slimmed down dataframe (with the newly added headers) and place them in the exact position that they were originally in the first dataframe.

That way I get the dataframe as it originally was, but with scaled and normalized numeric columns.

I know of no other way, but if there is one, then please let me know.

Respectfully,

LZ

***snippsat*** · Jun-30-2022, 02:51 PM

If I just copy the code you posted and run it, there is no SyntaxError.

        
import pandas as pd

df2 = pd.DataFrame(df, columns = ['sensor_00', 'sensor_01', 'sensor_02', 'sensor_03', 'sensor_04', 'sensor_05', 'sensor_06', \
'sensor_07', 'sensor_08', 'sensor_09', 'sensor_10', 'sensor_11', 'sensor_12', 'sensor_13', 'sensor_14', 'sensor_15', \
'sensor_16', 'sensor_17', 'sensor_18', 'sensor_19', 'sensor_20', 'sensor_21', 'sensor_22', 'sensor_23', 'sensor_24', \
'sensor_25', 'sensor_26', 'sensor_27', 'sensor_28', 'sensor_29', 'sensor_30', 'sensor_31', 'sensor_32', 'sensor_33', \
'sensor_34', 'sensor_35', 'sensor_36', 'sensor_37', 'sensor_38', 'sensor_39', 'sensor_40', 'sensor_41', 'sensor_42', \
'sensor_43', 'sensor_44', 'sensor_45', 'sensor_46', 'sensor_47', 'sensor_48', 'sensor_49', 'sensor_50', 'sensor_51'])

Error:
Traceback (most recent call last):
  File "<module2>", line 3, in <module>
NameError: name 'df' is not defined

To fix the NameError, define df first:

        
import pandas as pd

df = [[0 for i in range(52)] for j in range(52)]
df2 = pd.DataFrame(df, columns = ['sensor_00', 'sensor_01', 'sensor_02', 'sensor_03', 'sensor_04', 'sensor_05', 'sensor_06', \
'sensor_07', 'sensor_08', 'sensor_09', 'sensor_10', 'sensor_11', 'sensor_12', 'sensor_13', 'sensor_14', 'sensor_15', \
'sensor_16', 'sensor_17', 'sensor_18', 'sensor_19', 'sensor_20', 'sensor_21', 'sensor_22', 'sensor_23', 'sensor_24', \
'sensor_25', 'sensor_26', 'sensor_27', 'sensor_28', 'sensor_29', 'sensor_30', 'sensor_31', 'sensor_32', 'sensor_33', \
'sensor_34', 'sensor_35', 'sensor_36', 'sensor_37', 'sensor_38', 'sensor_39', 'sensor_40', 'sensor_41', 'sensor_42', \
'sensor_43', 'sensor_44', 'sensor_45', 'sensor_46', 'sensor_47', 'sensor_48', 'sensor_49', 'sensor_50', 'sensor_51'])

        
>>> df2
    sensor_00  sensor_01  sensor_02  ...  sensor_49  sensor_50  sensor_51
0           0          0          0  ...          0          0          0
1           0          0          0  ...          0          0          0
2           0          0          0  ...          0          0          0
3           0          0          0  ...          0          0          0
4           0          0          0  ...          0          0          0
5           0          0          0  ...          0          0          0
6           0          0          0  ...          0          0          0
7           0          0          0  ...          0          0          0
8           0          0          0  ...          0          0          0
... etc.

Led_Zeppelin · (This post was last modified: Jun-30-2022, 02:53 PM by Led_Zeppelin.)

I just ran the two lines of code you suggested, and I got:

Error:
ValueError                                Traceback (most recent call last)
Input In [15], in <cell line: 2>()
      1 columns = [f'sensor_(idx:02d)' for idx in range(52)]
----> 2 df2 = pd.DataFrame(df, columns=columns)

File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\pandas\core\frame.py:694, in DataFrame.__init__(self, data, index, columns, dtype, copy)
    684         mgr = dict_to_mgr(
    685             # error: Item "ndarray" of "Union[ndarray, Series, Index]" has no
    686             # attribute "name"
   (...)
    691             typ=manager,
    692         )
    693     else:
--> 694         mgr = ndarray_to_mgr(
    695             data,
    696             index,
    697             columns,
    698             dtype=dtype,
    699             copy=copy,
    700             typ=manager,
    701         )
    703 # For data is list-like, or Iterable (will consume into list)
    704 elif is_list_like(data):

File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\pandas\core\internals\construction.py:351, in ndarray_to_mgr(values, index, columns, dtype, copy, typ)
    346 # _prep_ndarray ensures that values.ndim == 2 at this point
    347 index, columns = _get_axes(
    348     values.shape[0], values.shape[1], index=index, columns=columns
    349 )
--> 351 _check_values_indices_shape_match(values, index, columns)
    353 if typ == "array":
    355     if issubclass(values.dtype.type, str):

File ~\miniconda3\envs\pump-failure-pred\lib\site-packages\pandas\core\internals\construction.py:422, in _check_values_indices_shape_match(values, index, columns)
    420 passed = values.shape
    421 implied = (len(index), len(columns))
--> 422 raise ValueError(f"Shape of passed values is {passed}, indices imply {implied}")

ValueError: Shape of passed values is (220320, 53), indices imply (220320, 52)

My only guess is that the vector starts at 0 and not 1. Thus, it has 53 and not 52. How to fix?

My guess is to change 52 to 51 in the first line.

Respectfully,

LZ

**buran** · (This post was last modified: Jun-30-2022, 06:10 PM by buran.)

Your columns list runs from 00 to 51, which means range(52). Note that the shape of the values is (220320, 53), so I guess you actually need range(53).

Also, the code that raised the error has a list of 52 sensors, but the "expected" result also shows the column Unnamed: 0.
So, again, you have 53 columns.
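
One way to avoid guessing the number is to compare the count of names with the width of the data before building the frame. A minimal sketch, assuming the data is a 2-D array-like whose first column is the saved index (`Unnamed: 0`); the small `values` list is only a stand-in for the real 220320 x 53 data.

```python
import pandas as pd

# Stand-in for the real data: 3 rows, 53 columns (index column + 52 sensors).
values = [[row] + [0.0] * 52 for row in range(3)]

# Keep the original name of the first column instead of calling it a sensor.
columns = ["Unnamed: 0"] + [f"sensor_{idx:02d}" for idx in range(52)]

# Check that the widths match before handing them to pandas.
n_cols = len(values[0])
assert len(columns) == n_cols, f"{len(columns)} names for {n_cols} columns"

df2 = pd.DataFrame(values, columns=columns)
print(df2.shape)
print(list(df2.columns[:3]))
```

If the first column really is the saved index, then using range(53) does build the frame, but every name is shifted by one: the index column gets labelled sensor_00 and the last sensor becomes sensor_52.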

**deanhystad** · (This post was last modified: Jun-30-2022, 03:45 PM by deanhystad.)

According to the documentation, you should be able to make a new DataFrame from an existing DataFrame, but it does not work for me. Here is a scaled-down version:

        
import pandas as pd

data = [[i for i in range(1, 11)] for _ in range(5)]
df = pd.DataFrame(data, columns=[f"orig {i}" for i in range(10)])
print(df)
 
df2 = pd.DataFrame(df, columns=[f"copy {i}" for i in range(10)])
print(df2)

Output:
   orig 0  orig 1  orig 2  orig 3  orig 4  orig 5  orig 6  orig 7  orig 8  orig 9
0       1       2       3       4       5       6       7       8       9      10
1       1       2       3       4       5       6       7       8       9      10
2       1       2       3       4       5       6       7       8       9      10
3       1       2       3       4       5       6       7       8       9      10
4       1       2       3       4       5       6       7       8       9      10
   copy 0  copy 1  copy 2  copy 3  copy 4  copy 5  copy 6  copy 7  copy 8  copy 9
0     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
1     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
2     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
3     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
4     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN     NaN
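
What seems to be happening here: when `data` is already a DataFrame, the `columns` argument selects columns by label rather than renaming them, so labels that do not exist in `df` come back as all-NaN columns. A minimal sketch of the difference, reusing the scaled-down frame from this post; assigning to `.columns` on a copy is just one of several ways to rename.

```python
import pandas as pd

data = [[i for i in range(1, 11)] for _ in range(5)]
df = pd.DataFrame(data, columns=[f"orig {i}" for i in range(10)])

# Passing a DataFrame plus `columns` selects by label:
# "orig 0" exists and keeps its values, "copy 1" does not and becomes NaN.
selected = pd.DataFrame(df, columns=["orig 0", "copy 1"])
print(selected)

# To rename instead, copy the frame and give its columns new labels.
new_names = [f"copy {i}" for i in range(10)]
renamed = df.copy()
renamed.columns = new_names
print(renamed)
```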

Led_Zeppelin · (This post was last modified: Jun-30-2022, 06:00 PM by Led_Zeppelin.)

It worked for 53 columns. I am not sure why these extra columns were added. I am not sure they are even needed. The magic number for me is 53 (through trial and error) as in:

        
columns = [f'sensor_{idx:02d}' for idx in range(53)]
df2 = pd.DataFrame(df, columns=columns)

So, should I get rid of all columns that come before sensor_01? I cannot see in any situation where I will need them.

I have not tried range(54). I will now, but as I said 53 worked.

My main concern now, as I explained previously, is putting the non-numeric columns back onto the slimmed-down dataframe in the right order and with all of the values they had when I loaded the CSV file at the beginning of the program. That is why I made a copy of the dataframe in a previous line.

So how to do that?

Respectfully,

LZ

**buran** · (This post was last modified: Jun-30-2022, 06:15 PM by buran.)

(Jun-30-2022, 06:00 PM)Led_Zeppelin Wrote: I have not tried range(54). I will now, but as I said 53 worked.

Sorry, it was a typo, as I was replying on the phone. I fixed it. The point is: 52 columns means range(52), 53 columns means range(53). As I mentioned, I guess you have 53 columns.

You are not giving much information, so all we can say is: drop the columns you don't need. However, there might be other options, e.g. if you read df from a file, you can skip the columns you don't want to include.
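
Both ideas can be sketched roughly as follows. This assumes the `Unnamed: 0` column is just the row index that was written out when the CSV was saved, and that the non-numeric columns are something like a timestamp and a status label. The column names `timestamp` and `machine_status`, the inline CSV text, and the min-max scaling are all made up for illustration and may not match the real file or the scaling actually intended.

```python
import io
import pandas as pd

# Tiny stand-in for the real CSV file (hypothetical columns and values).
csv_text = """,timestamp,sensor_00,sensor_01,machine_status
0,2018-04-01 00:00:00,2.0,47.0,NORMAL
1,2018-04-01 00:01:00,4.0,49.0,NORMAL
2,2018-04-01 00:02:00,3.0,48.0,BROKEN
"""

# index_col=0 uses the first column as the index, so no "Unnamed: 0" column.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Scale only the sensor columns on a copy; the other columns keep their
# values and their position, so nothing has to be re-inserted later.
sensor_cols = [c for c in df.columns if c.startswith("sensor_")]
scaled = df.copy()
col_min = df[sensor_cols].min()
col_max = df[sensor_cols].max()
scaled[sensor_cols] = (df[sensor_cols] - col_min) / (col_max - col_min)

print(scaled)
```

Scaling a subset of columns on a copy avoids having to split the frame and stitch the non-numeric columns back in afterwards.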
