Python Forum

Full Version: [pandas] How to re-arrange DataFrame columns
Hi,
I have the pandas DataFrame below:

         Trial1       Trial1      Trial1        Trial2       Trial2  
Name     Sub_item1    Sub_item2  Sub_item3      Sub_item4    Sub_item5
          2019-06-01  2016-06-01  2019-06-01    2019-06-01   2019-06-01
 VBA        1               0        0            1             1
 VLK        0               0        1            1             1
 VBN        1               1        1            1             1
Now I want to rearrange it as shown below (desired output):


Name    Date Trial  Sub_item   value 
 VBA     2019-06-01  Sub_item1  1
 VBA     2019-06-01  Sub_item2  0
 VBA     2019-06-01  Sub_item3  0
 VBA     2019-06-01  Sub_item4  1
 VBA     2019-06-01  Sub_item5  1
 VLK     2019-06-01  Sub_item1  0
 VLK     2019-06-01  Sub_item2  0
 VLK     2019-06-01  Sub_item3  1
 VLK     2019-06-01  Sub_item4  1
 VLK     2019-06-01  Sub_item5  1
 VBN     2019-06-01  Sub_item1  1
 VBN     2019-06-01  Sub_item2  1
 VBN     2019-06-01  Sub_item3  1
 VBN     2019-06-01  Sub_item4  1
 VBN     2019-06-01  Sub_item5  1

I am very new to Python. Can anybody kindly help me with how to do this?
What have you tried?
I hope the following example helps you:

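import numpy as np
import pandas as pd

# Build a small frame whose columns form a three-level MultiIndex
micolumns = pd.MultiIndex.from_tuples([('X', 'foo', '10'), ('X', 'bar', '10'),
                                       ('Y', 'foo', '10'), ('Y', 'bar', '10')],
                                      names=['l0', 'l1', 'l2'])
# pd.np was removed in pandas 2.0, so use numpy directly
arr = pd.DataFrame(np.arange(12).reshape(3, 4), columns=micolumns)

arr.T.reset_index()   # this is almost what you want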
It gives an error. I am using Python 3.6:
TypeError: '>' not supported between instances of 'str' and 'int'
The message means that the > operator was applied to a str and an int somewhere in your code. You probably need to convert
the column dtype first to be able to use comparison operators. Please show your code.
I tested the code you provided:

import numpy as np
import pandas as pd
micolumns = pd.MultiIndex.from_tuples([('X', 'foo', '10'), ('X', 'bar', '10'),
                                       ('Y', 'foo', '10'), ('Y', 'bar', '10')],
                                      names=['l0', 'l1', 'l2'])
# pd.np was removed in pandas 2.0, so use numpy directly
arr = pd.DataFrame(np.arange(12).reshape(3, 4), columns=micolumns)

arr.T.reset_index()
The code above runs without any errors on my computer (pandas version: 0.23.4). The reset_index method turns the levels of a pandas MultiIndex into ordinary columns, which is almost what you want.
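To connect this to the data in the first post, here is a minimal sketch. It assumes the three header rows are loaded as a three-level column MultiIndex and that Name is the row index. The level names Trial, Sub_item and Date, and the variable names wide and long_df, are labels I picked for illustration. The dates are copied from the table in the question as posted. After reset_index, melt is one way to fold the per-name columns into rows:

```python
import pandas as pd

# Assumed layout: the three header rows form a 3-level column MultiIndex.
# The level names below are labels chosen for this sketch.
columns = pd.MultiIndex.from_tuples(
    [('Trial1', 'Sub_item1', '2019-06-01'),
     ('Trial1', 'Sub_item2', '2016-06-01'),
     ('Trial1', 'Sub_item3', '2019-06-01'),
     ('Trial2', 'Sub_item4', '2019-06-01'),
     ('Trial2', 'Sub_item5', '2019-06-01')],
    names=['Trial', 'Sub_item', 'Date'])

df = pd.DataFrame(
    [[1, 0, 0, 1, 1],
     [0, 0, 1, 1, 1],
     [1, 1, 1, 1, 1]],
    index=pd.Index(['VBA', 'VLK', 'VBN'], name='Name'),
    columns=columns)

# Transpose so the header levels become the row index,
# then turn those index levels into ordinary columns.
wide = df.T.reset_index()

# Fold the VBA / VLK / VBN columns into a 'Name' column and a 'value' column.
long_df = wide.melt(id_vars=['Trial', 'Sub_item', 'Date'],
                    var_name='Name', value_name='value')

# Reorder the columns to match the desired output.
long_df = long_df[['Name', 'Date', 'Trial', 'Sub_item', 'value']]
print(long_df)
```

This should give one row per Name and Sub_item pair (15 rows here). If your real file has a different header layout, the MultiIndex construction will need adjusting.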
Sorry, I am new too, but as a question/suggestion: could you use groupby?
Something like:
df.groupby(by=['Name','Trial1Sub_item1','Trial1Sub_item2','Trial1Sub_item3','Trial2Sub_item4','Trial2Sub_item5'])
** I have no idea whether this would work, and the other suggestions are probably better.
Another idea (again, it probably won't work) is .transpose
Once the DataFrame has been processed by the reset_index method (as shown above), you
get a column of sub-items and a column of trial values. Suppose
these columns are named subs and trials, respectively. In that case you can use the groupby method as follows:

# df is the original DataFrame (with MultiIndex columns)
# We assume subs and trials are columns of new_df;
# you may need to rename the columns manually.
new_df = df.T.reset_index()
new_df.groupby(['subs', 'trials'])
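Note that groupby on its own only returns a lazy GroupBy object, so an aggregation such as sum is needed to get a result. Here is a small sketch on the toy frame from earlier in the thread. The level names trials, subs and date, and the choice of sum, are my assumptions for illustration only:

```python
import numpy as np
import pandas as pd

# Toy frame from earlier in the thread; the level names are chosen for this sketch.
micolumns = pd.MultiIndex.from_tuples(
    [('X', 'foo', '10'), ('X', 'bar', '10'),
     ('Y', 'foo', '10'), ('Y', 'bar', '10')],
    names=['trials', 'subs', 'date'])
df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=micolumns)

# The levels become the columns 'trials', 'subs' and 'date';
# the original row labels 0, 1, 2 become the remaining columns.
new_df = df.T.reset_index()

# Aggregate the value columns within each (subs, trials) group.
totals = new_df.groupby(['subs', 'trials'])[[0, 1, 2]].sum()
print(totals)
```

In this toy frame every (subs, trials) pair occurs only once, so the sums simply repeat the original values. With real data, groups containing several rows would be combined.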