Posts: 4
Threads: 1
Joined: Jun 2022
Jun-19-2022, 10:45 AM
(This post was last modified: Jun-19-2022, 02:19 PM by sahar.)
I have an Excel file containing a single column (the number of rows is not fixed). Using Python 3, I want to:
- Import my Excel file/data into Python,
- Read/select the data column (the first column),
- Reshape this column into multiple columns with 10 rows in each column, and finally
- Write the output to a new Excel file.
I am a MATLAB user and know how to do this in MATLAB using its reshape command:
Quote:newVar = reshape(myColumn, 10, [])
I am looking for someone to help me achieve this in Python 3.
Posts: 1,358
Threads: 2
Joined: May 2019
We will help you, but you need to show effort and let us know where you get stuck. There is more than one way to do this, so my suggestions are simply how I would approach the problem. As a start, use Pandas to import the file into a DataFrame. Then use the iloc indexer to select the range you want in the desired column, and copy that data to a new column.
See Pandas documentation
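A minimal sketch of the selection step, assuming the sheet has already been loaded. The small in-memory DataFrame below is only a stand-in for the real file, and the variable names are illustrative:

```python
import numpy as np
import pandas as pd

# Stand-in for the loaded sheet: one column holding the values 1..30.
df = pd.DataFrame({0: np.arange(1, 31)})

# iloc is a position-based indexer, so it takes square brackets, not parentheses.
first_column = df.iloc[:, 0]      # all rows, first column -> pandas Series
values = first_column.to_numpy()  # 1-D NumPy array, shape (30,)

print(values.shape)
```

Converting the Series directly gives a flat array, which is usually the easiest starting point for a later reshape.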
Posts: 4
Threads: 1
Joined: Jun 2022
Jun-19-2022, 02:48 PM
(This post was last modified: Jun-19-2022, 03:01 PM by sahar.)
Ok, I have tried but could not get the desired result.
import pandas as pd
import numpy as np
df = pd.read_excel('sample.xlsx')
first_column = pd.DataFrame(df.iloc[:,0])
arr = np.array(first_column)
newArr = arr.reshape(arr, (10, -1))
The code shows the error:
Quote:newArr = arr.reshape(10, -1)
Quote:TypeError: only integer scalar arrays can be converted to a scalar index
Posts: 6,798
Threads: 20
Joined: Feb 2020
Jun-19-2022, 07:20 PM
(This post was last modified: Jun-19-2022, 07:22 PM by deanhystad.)
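reshape exists both as a function, numpy.reshape, and as a method of ndarray. The function takes the array as its first argument; the method does not, so passing arr to arr.reshape() causes the error. The call is "new_array = numpy.reshape(arr, (10, -1))" or, equivalently, "new_array = arr.reshape(10, -1)".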
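For reference, NumPy offers reshape in both forms, and the difference is only in where the array goes. A small sketch with a made-up 30-element array (the earlier TypeError most likely came from passing the array itself where the method expects the shape):

```python
import numpy as np

arr = np.arange(30)  # made-up data standing in for the column

# Function form: the array is the first argument, the shape is the second.
via_function = np.reshape(arr, (10, -1))

# Method form: the array is the object, so only the shape is passed.
via_method = arr.reshape(10, -1)
via_method_tuple = arr.reshape((10, -1))  # the shape may also be a tuple

same = (np.array_equal(via_function, via_method)
        and np.array_equal(via_method, via_method_tuple))
print(via_function.shape, same)
```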
Posts: 4
Threads: 1
Joined: Jun 2022
Jun-20-2022, 05:45 AM
(This post was last modified: Jun-20-2022, 08:24 AM by sahar.)
(Jun-19-2022, 07:20 PM)deanhystad Wrote: The call is "new_array = numpy.reshape(arr, (10, -1))".
Hi, thanks for your explanation. I have tried this, but I still do not get a result.
import pandas as pd
import numpy as np
df = pd.read_excel('sample.xlsx')
myCol = pd.DataFrame(df.iloc[:,0])
arr = np.array(myCol)
newArr = np.reshape(arr, (10, -1), order='F')
np.savetxt("newMatrix.csv", newArr, delimiter=",")
Posts: 4
Threads: 1
Joined: Jun 2022
Jun-20-2022, 08:25 AM
(This post was last modified: Jun-20-2022, 08:25 AM by sahar.)
I have found something strange. If I add one more row to my column, the same code performs as intended.
Image
Please note that I now have 31 rows. It seems to me that this has something to do with indexing.
Posts: 1,358
Threads: 2
Joined: May 2019
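By default, the pandas read_excel() function uses the first row as the header. If your file starts with data, add the parameter header=None. With the default, a 30-row sheet yields only 29 data values, which cannot be reshaped into 10 rows, while a 31-row sheet yields 30.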
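The effect of the header default can be sketched without an Excel file. As an assumption for illustration, the example below uses pd.read_csv on an in-memory string as a stand-in, since its default also treats the first row as the header. The file names in the trailing comments are hypothetical:

```python
import io
import pandas as pd

# 30 data rows and no header row, standing in for the Excel sheet.
raw = "\n".join(str(i) for i in range(1, 31))

with_default = pd.read_csv(io.StringIO(raw))           # first row is consumed as the header
with_none = pd.read_csv(io.StringIO(raw), header=None) # every row is kept as data

print(len(with_default))  # 29
print(len(with_none))     # 30

# With all 30 values present, the column-wise reshape succeeds.
new_arr = with_none.iloc[:, 0].to_numpy().reshape(10, -1, order="F")
print(new_arr.shape)      # (10, 3)

# For the real file the analogous calls would be (names are placeholders):
# df = pd.read_excel("sample.xlsx", header=None)
# pd.DataFrame(new_arr).to_excel("output.xlsx", index=False, header=False)
```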
Posts: 6,798
Threads: 20
Joined: Feb 2020
It always helps to read the documentation, especially when trying something new.
https://pandas.pydata.org/pandas-docs/st...excel.html
Quote:header : int, list of int, default 0
Row (0-indexed) to use for the column labels of the parsed DataFrame. If a list of integers is passed those row positions will be combined into a MultiIndex. Use None if there is no header.
https://numpy.org/doc/stable/reference/g...shape.html
Quote:numpy.reshape(a, newshape, order='C')
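The order argument decides how the values are laid out. A short sketch with made-up data compares the default order='C' with order='F', which fills column by column as MATLAB's reshape does. The helper reshape_to_columns is hypothetical, and padding with NaN when the length is not a multiple of 10 is an assumption, since the thread does not say how leftover rows should be handled:

```python
import numpy as np

def reshape_to_columns(values, rows=10):
    """Hypothetical helper: lay a 1-D sequence out column by column."""
    flat = np.asarray(values, dtype=float).ravel()
    # Pad with NaN so the length becomes a multiple of `rows` (an assumption).
    remainder = (-len(flat)) % rows
    padded = np.concatenate([flat, np.full(remainder, np.nan)])
    # order='F' fills the first column top to bottom, then the next one.
    return np.reshape(padded, (rows, -1), order="F")

values = np.arange(1, 31)  # stand-in for a 30-row column

by_rows = np.reshape(values, (10, -1), order="C")  # fills row by row
by_cols = reshape_to_columns(values)               # fills column by column

print(by_rows[0])     # [1 2 3]
print(by_cols[:, 0])  # the values 1 through 10 as floats

# 25 values do not divide evenly, so the last column ends with five NaN entries.
short = reshape_to_columns(np.arange(1, 26))
print(short.shape)    # (10, 3)
```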