Python Forum
Transform 3 Columns into Single Column
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Transform 3 Columns into Single Column
#1
I Have a file comprising 3 columns of data and I would like to transform these columns into a single column, one following the other e.g. below. Could someone kindly tell me how to write the code to do this. It would be much appreciated and get me out of a bind.

3560181.703 -3384947.321 -5.02245
3560191.703 -3384947.321 -4.88665
3560201.703 -3384947.321 -4.76042

Reformat to:

3560181.703
3560191.703
3560201.703
-3384947.321
-3384947.321
-3384947.321
-5.02245
-4.88665
-4.76042
Reply
#2
What've you got? Excel? Pandas df, csv?

You could just read the each row, column by column, into a list, then write that list to a single column.

Or dispense with the list and read each value straight into column 1 in a new file.
Reply
#3
It is a csv far to large for excel to handle, 1 million+ rows.

I just don't have python skills to do this and I can think of no other way.
Reply
#4
Big data needs Pandas, I believe. I never had anything more than 200 rows.

I believe Excel is limited to 4000+ columns, but rows are unlimited.

Anyway, this will give you 3 csv files, 1 for each column. You can then just concatenate them if you wiish

import pandas as pd

csv_file = '/home/pedro/myPython/pandas/testme.csv'
savepath = '/home/pedro/myPython/pandas/'
df = pd.read_csv(csv_file)

# get each column 
columns = ['col1', 'col2', 'col3'] # put your column headers

for colnr in columns:
    mydf = df[colnr]
    mydf.to_csv(savepath + colnr + '.csv', index=False)
Reply
#5
Have a look to "flatten" or "ravel" using Numpy, usinf a single line and going very fast even with million of rows
Reply
#6
Many thanks Pedroski55

I'm getting the following error on this code. Would you have any idea of the problem.

Error:
C:\Python3\python.exe C:/Users/David/PycharmProjects/pythonProject2/venv/Subspend_Z.py Traceback (most recent call last): File "C:\Python3\lib\site-packages\pandas\core\indexes\base.py", line 2889, in get_loc return self._engine.get_loc(casted_key) File "pandas\_libs\index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc File "pandas\_libs\index.pyx", line 97, in pandas._libs.index.IndexEngine.get_loc File "pandas\_libs\hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item File "pandas\_libs\hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item KeyError: 'col1' The above exception was the direct cause of the following exception: Traceback (most recent call last): File "C:/Users/David/PycharmProjects/pythonProject2/venv/Subspend_Z.py", line 11, in <module> mydf = df[colnr] File "C:\Python3\lib\site-packages\pandas\core\frame.py", line 2899, in __getitem__ indexer = self.columns.get_loc(key) File "C:\Python3\lib\site-packages\pandas\core\indexes\base.py", line 2891, in get_loc raise KeyError(key) from err KeyError: 'col1' Process finished with exit code 1
Reply
#7
I don't know what your column header names are.

Either call the columns col1, col2, col3 etc., or put your column header names in here, for any columns you want to single out.
# get each column 
columns = ['col1', 'col2', 'col3'] # put your column headers
That is to say, look in row 1 of your csv and see what the names of the columns you want to get are.
If there are no column headers, I believe Pandas will set something like col1, col2 ... but I don't really use Pandas for my little amounts of data.
Best set the names you want in row 1.
Reply
#8
Thank you so much Pedroski55, it worked a treat.

The only issue is that the transformation to individual CVS files left off the decimal places. Is there something that I can include in the script that will bring in the values say to 2 decimal places.

Kind regards
Dave
Reply
#9
I am really not familiar with Pandas, never having large amounts of data to process, but this seems to work
pd.options.display.float_format = '{:,.2f}'.format (change the 2 for more decimal places I believe)

import pandas as pd

csv_file = '/home/pedro/myPython/pandas/testme.csv'
savepath = '/home/pedro/myPython/pandas/'
pd.options.display.float_format = '{:,.2f}'.format
df = pd.read_csv(csv_file)

# get each column 
columns = ['col1', 'col2', 'col3'] # put your column headers

for colnr in columns:
    mydf = df[colnr]
    mydf.to_csv(savepath + colnr + '.csv', index=False)
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Converting column of values into muliple columns of counts highland44 0 251 Feb-01-2024, 12:48 AM
Last Post: highland44
  [SOLVED] Looking for documentation on Reportlab's canvas.transform() function NeilUK 1 604 Aug-23-2023, 01:21 PM
Last Post: NeilUK
  Reshaping a single column in to multiple column using Python sahar 7 2,039 Jun-20-2022, 12:35 PM
Last Post: deanhystad
  df column aggregate and group by multiple columns SriRajesh 0 1,033 May-06-2022, 02:26 PM
Last Post: SriRajesh
  Transform Dic to dataframe d9d9d 4 1,374 Apr-14-2022, 09:35 AM
Last Post: perfringo
  Split single column to multiple columns SriRajesh 1 1,319 Jan-07-2022, 06:43 PM
Last Post: jefsummers
  How to transform from wide to long format in python shantanu97 1 1,642 Nov-21-2021, 11:53 AM
Last Post: buran
  How to remove a column or two columns in a correlation heatmap? lulu43366 3 5,195 Sep-30-2021, 03:47 PM
Last Post: lulu43366
  Index error - columns vs non-column Vinny 3 4,910 Aug-09-2021, 04:46 PM
Last Post: snippsat
  Need Help splitting a column into many columns cokicoki 0 1,395 Jun-18-2021, 12:30 PM
Last Post: cokicoki

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020