Posts: 4
Threads: 2
Joined: Apr 2021
Hi Forum,
I would like totransform the table from 2 columns to 3 columns structure as shown in the attached image.
How to convert the 2 columns table data into 3 columns table data using python pandas?
Thanks in advance.
Attached Files
Thumbnail(s)
Posts: 12,031
Threads: 485
Joined: Sep 2016
what have you tried?
Please show code.
Posts: 2,125
Threads: 11
Joined: May 2017
Vanilla Python:
def transpose(iterable):
return zip(*iterable)
my_data = [(1,2,3),(4,5,6),(7,8,9)]
print(list(transpose(my_data))) Output: [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
Numpy: https://numpy.org/doc/stable/reference/g...spose.html
Pandas: https://pandas.pydata.org/pandas-docs/st...spose.html
Posts: 1,838
Threads: 2
Joined: Apr 2017
The operation that the OP wants to perform isn't transposing, though. If they are using NumPy, then perhaps tile might help; I don't know what's in pandas.
Posts: 4
Threads: 2
Joined: Apr 2021
(Aug-02-2021, 10:21 AM)sankarachari Wrote: Hi Forum,
I would like totransform the table from 2 columns to 3 columns structure as shown in the attached image.
How to convert the 2 columns table data into 3 columns table data using python pandas?
Thanks in advance.
I tried like below lines of code
import pandas as pd
df = pd.read_csv('../data1.csv')
df.groupby(["Col1"]).unstack()
Posts: 1,950
Threads: 8
Joined: Jun 2018
You should look pandas pivot (if no duplicates) or pivot_table (if duplicates)
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy
Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
Posts: 1,093
Threads: 143
Joined: Jul 2017
I am none too proficient with Python, so I like things clear and simple, so I can see exactly what is going on.
I think, the best way to get rid of multiple key values is a dictionary .
If you are using pandas, you must be opening a .txt or ,csv file I suppose. I've read, pandas is good for large amounts of data.
This will do what you want, just using the csv module, and I can see exactly what is happening.
You may need to adjust if, for example, key c has perhaps 5 columns, then you will need to find the longest data list and set the fieldnames according to that.
If you know in advance the maximum number of columns in the source file, no need to get complicated.
import csv
# Step 1
# read the data from a csv file
# csv.reader is annoying, it's gone if you have to repeat, so read to a data list first
# when everything is ok, you can forget data
# can't pass on csv_reader. It will be closed at the end of with ... as
# so you would need to combine Step 1, Step 2 and Step 3
with open(path + 'transpose_me.txt') as csv_file:
csv_reader = csv.reader(csv_file, delimiter=',')
data = []
for row in csv_reader:
data.append(row)
# returns [['a', '1'], ['a', '2'], ['b', '3'], ['b', '4'], ['c', '5'], ['c', '6']]
#Step 2
# make a dictionary with value as an empty list
mydict = {}
for d in data:
print(d[0])
key = d[0]
mydict[key] = []
#Step 3
# now put the data in mydict
for bit in data:
key = bit[0]
# possibly more than 1 data value in bit
for v in range(1, len(bit)):
print(key, bit[v])
value = bit[v]
mydict[key].append(value)
# Step 4
# simple write
# now we have the values in an easy to handle dictionary
with open(path + 'transposed.txt', mode='w') as f:
f_writer = csv.writer(f, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
fieldnames = ['key', 'data1', 'data2']
f_writer.writerow(fieldnames)
for key in mydict.keys():
list1 = [key]
values = mydict[key]
list1.extend(values)
f_writer.writerow(list1) The output looks like:
Quote:key,data1,data2
a,1,2
b,3,4
c,5,6
sankarachari likes this post
Posts: 1,950
Threads: 8
Joined: Jun 2018
Aug-05-2021, 05:38 PM
(This post was last modified: Aug-05-2021, 05:38 PM by perfringo.)
With dictionary approach one can make it simpler: read file line by line, split row to key and value and add key to dictionary, set default value type of key as list and append value to said list. No need to import anything. As this is no-effort from OP side this will not exactly match his needs, but it can be easily remedied (it assumes that pivoting_data.csv contains comma separated rows like a,1 )
with open('pivoting_data.csv', 'r') as f:
data = dict()
for line in f:
key, value = line.strip().split(',')
data.setdefault(key, []).append(value)
for key, value in data.items():
print(key, *value) Output: a 1 2
b 3 4
c 5 6
I'm not 'in'-sane. Indeed, I am so far 'out' of sane that you appear a tiny blip on the distant coast of sanity. Bucky Katt, Get Fuzzy
Da Bishop: There's a dead bishop on the landing. I don't know who keeps bringing them in here. ....but society is to blame.
Posts: 1,093
Threads: 143
Joined: Jul 2017
It is summer holiday.
Normally, I would be in Spain, in a beautiful mountain village, reading the paper in the square, having a coffee, later a beer, later a
walk to the top of the mountain.
But I'm stuck in China because of the virus!
So I took a look at pandas. Here is one way to solve your problem in pandas.
import pandas as pd
# open the file
csv_file = '/home/pedro/myPython/pandas/transpose_me.csv'
df = pd.read_csv(csv_file)
"""
>>> df
col1 col2
0 a 1
1 a 2
2 b 3
3 b 4
4 c 5
5 c 6
>>>
"""
# get column 1 as a list
mylist = df['col1'].tolist()
# get rid of the duplicates
myset = set(mylist)
"""
>>> myset
{'a', 'c', 'b'}
"""
# change back to list
mylist = list(myset)
mylist.sort()
# make an empty dictionary
mydict = {mylist[i]:[] for i in range(len(mylist))}
# if there were more values in the row, need to change this
# then we would need another loop to go across the row
# len(df.columns) will tell you how many columns
# but if all the rows are different, well some values will just be NaN
for i in range(df.index.size):
key = df.iat[i,0]
if key in mylist:
value = df.iat[i,1]
mydict[key].append(value)
# make a new DataFrame
df2 = pd.DataFrame(mydict)
"""
>>> df2
a b c
0 1 3 5
1 2 4 6
"""
# done
df3 = df2.T
"""
>>> df3
0 1
a 1 2
b 3 4
c 5 6
>>>
"""
|