Python Forum

Hi Forum,

I would like totransform the table from 2 columns to 3 columns structure as shown in the attached image.
How to convert the 2 columns table data into 3 columns table data using python pandas?

Thanks in advance.

what have you tried?
Please show code.

Vanilla Python:

def transpose(iterable):
    return zip(*iterable)

my_data = [(1,2,3),(4,5,6),(7,8,9)]
print(list(transpose(my_data)))

Output:
[(1, 4, 7), (2, 5, 8), (3, 6, 9)]

Numpy: https://numpy.org/doc/stable/reference/g...spose.html
Pandas: https://pandas.pydata.org/pandas-docs/st...spose.html

The operation that the OP wants to perform isn't transposing, though. If they are using NumPy, then perhaps tile might help; I don't know what's in pandas.

(Aug-02-2021, 10:21 AM)sankarachari Wrote: [ -> ]Hi Forum,

I would like totransform the table from 2 columns to 3 columns structure as shown in the attached image.
How to convert the 2 columns table data into 3 columns table data using python pandas?

Thanks in advance.

I tried like below lines of code
import pandas as pd
df = pd.read_csv('../data1.csv')
df.groupby(["Col1"]).unstack()

You should look pandas pivot (if no duplicates) or pivot_table (if duplicates)

I am none too proficient with Python, so I like things clear and simple, so I can see exactly what is going on.

I think, the best way to get rid of multiple key values is a dictionary .

If you are using pandas, you must be opening a .txt or ,csv file I suppose. I've read, pandas is good for large amounts of data.

This will do what you want, just using the csv module, and I can see exactly what is happening.

You may need to adjust if, for example, key c has perhaps 5 columns, then you will need to find the longest data list and set the fieldnames according to that.

If you know in advance the maximum number of columns in the source file, no need to get complicated.

 
import csv
# Step 1
# read the data from a csv file
# csv.reader is annoying, it's gone if you have to repeat, so read to a data list first
# when everything is ok, you can forget data
# can't pass on csv_reader. It will be closed at the end of with ... as
# so you would need to combine Step 1, Step 2 and Step 3

with open(path + 'transpose_me.txt') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    data = []
    for row in csv_reader:
        data.append(row)
# returns [['a', '1'], ['a', '2'], ['b', '3'], ['b', '4'], ['c', '5'], ['c', '6']]

#Step 2
# make a dictionary with value as an empty list
mydict = {}
for d in data:
    print(d[0])
    key = d[0]
    mydict[key] = []

#Step 3
# now put the data in mydict
for bit in data:
    key = bit[0]    
    # possibly more than 1 data value in bit
    for v in range(1, len(bit)):
        print(key, bit[v])
        value = bit[v] 
        mydict[key].append(value)

# Step 4
# simple write
# now we have the values in an easy to handle dictionary

with open(path + 'transposed.txt', mode='w') as f:
    f_writer = csv.writer(f, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    fieldnames = ['key', 'data1', 'data2']
    f_writer.writerow(fieldnames)
    for key in mydict.keys(): 
        list1 = [key]
        values = mydict[key]
        list1.extend(values)
        f_writer.writerow(list1)

The output looks like:

Quote:key,data1,data2
a,1,2
b,3,4
c,5,6

With dictionary approach one can make it simpler: read file line by line, split row to key and value and add key to dictionary, set default value type of key as list and append value to said list. No need to import anything. As this is no-effort from OP side this will not exactly match his needs, but it can be easily remedied (it assumes that pivoting_data.csv contains comma separated rows like a,1)

with open('pivoting_data.csv', 'r') as f:
    data = dict()
    for line in f:
        key, value = line.strip().split(',')
        data.setdefault(key, []).append(value)

for key, value in data.items():
    print(key, *value)

Output:a 1 2
b 3 4
c 5 6

It is summer holiday.

Normally, I would be in Spain, in a beautiful mountain village, reading the paper in the square, having a coffee, later a beer, later a
walk to the top of the mountain.

But I'm stuck in China because of the virus!

So I took a look at pandas. Here is one way to solve your problem in pandas.

import pandas as pd

# open the file
csv_file = '/home/pedro/myPython/pandas/transpose_me.csv'
df = pd.read_csv(csv_file)

"""
>>> df
  col1  col2
0    a     1
1    a     2
2    b     3
3    b     4
4    c     5
5    c     6
>>>

"""
# get column 1 as a list
mylist = df['col1'].tolist()

# get rid of the duplicates
myset = set(mylist)
"""
>>> myset
{'a', 'c', 'b'}
"""
# change back to list
mylist = list(myset)
mylist.sort()

# make an empty dictionary
mydict = {mylist[i]:[] for i in range(len(mylist))}

# if there were more values in the row, need to change this
# then we would need another loop to go across the row
# len(df.columns) will tell you how many columns
# but if all the rows are different, well some values will just be NaN
for i in range(df.index.size):
    key = df.iat[i,0]
    if key in mylist:
        value = df.iat[i,1]
        mydict[key].append(value)

# make a new DataFrame
df2 = pd.DataFrame(mydict)
"""
>>> df2
   a  b  c
0  1  3  5
1  2  4  6
"""
# done
df3 = df2.T
"""
>>> df3
   0  1
a  1  2
b  3  4
c  5  6
>>>
"""

sankarachari

Larz60+

DeaD_EyE

ndc85430

sankarachari

perfringo

Pedroski55

perfringo

Pedroski55