Python Forum
Structuring and pivoting corrupted dataframe in pandas
#1
I have a dataframe that I read from an Excel file. The first four columns and their values look fine, but from the fifth column onward the data looks corrupted.
That is, the "dateID" values such as "2021-09-06" became column names, and the "sourceOfData" column name became a value.
It looks like this:

   

But I want my data to look like this:

   

The only things that came to mind were pivot or melt. I started with something like this:

 df2 = df.melt(var_name='dateID', value_name='productPrice') 
 df3 = df2.iloc[1:] 
in order to organize dates and prices, but I'm stuck.

Hope I explained my needs. Thanks in advance.

For those who want to reproduce my question, here is code that builds both dataframes: what I have and what I need.

import pandas as pd

# The first row holds sub-header labels ('sourceOfData', 'productPrice'), not data.
whatIHave = {
    'countryName': ['', 'United States', 'Canada'],
    'provinceName': ['', 'New York', 'Ontario'],
    'productID': ['', '35', '55'],
    'productName': ['', 'Sugar', 'Corn'],
    'dateID': ['sourceOfData', 'CommissionAgent1', 'CommissionAgent1'],
    '2021-09-06': ['productPrice', '2.6$', '2.6$'],
    '2021-09-07': ['productPrice', '5.5$', '5.5$'],
    '2021-09-08': ['productPrice', '3.4$', '3.4$'],
}

df_whatIHave = pd.DataFrame(whatIHave, columns = ['countryName', 'provinceName', 'productID', 'productName', 'dateID', '2021-09-06', '2021-09-07', '2021-09-08'])

print(df_whatIHave)
# Target layout: one row per product and date.
whatINeed = {
    'countryName': ['United States', 'United States', 'United States', 'Canada', 'Canada', 'Canada'],
    'provinceName': ['New York', 'New York', 'New York', 'Ontario', 'Ontario', 'Ontario'],
    'productID': ['35', '35', '35', '55', '55', '55'],
    'productName': ['Sugar', 'Sugar', 'Sugar', 'Corn', 'Corn', 'Corn'],
    'sourceOfData': ['CommissionAgent1', 'CommissionAgent1', 'CommissionAgent1', 'CommissionAgent1', 'CommissionAgent1', 'CommissionAgent1'],
    'dateID': ['2021-09-06', '2021-09-07', '2021-09-08', '2021-09-06', '2021-09-07', '2021-09-08'],
    'productPrice': ['2.6$', '5.5$', '3.4$', '2.6$', '5.5$', '3.4$'],
}

df_whatINeed = pd.DataFrame(whatINeed, columns = ['countryName', 'provinceName', 'productID', 'productName', 'sourceOfData', 'dateID', 'productPrice'])

print(df_whatINeed)
#2
You have the same key name twice in the dictionary, so there is a collision and only one of them survives 🚑
It looks like a simple name change gives the wanted output; see the example NoteBook.
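To illustrate the collision, here is a minimal sketch. It assumes the originally posted dictionary repeated the countryName key, as the follow-up post describes; the two-entry dictionary below is a cut-down, hypothetical version of it.

```python
# A dict literal with a repeated key keeps only the last value for that key.
# Hypothetical, cut-down version of the dictionary from the question.
data = {
    'countryName': ['', 'United States', 'Canada'],
    'countryName': ['', 'New York', 'Ontario'],  # same key again: replaces the line above
}

print(list(data))           # only one key is left
print(data['countryName'])  # holds the province values, not the countries
```

Python raises no error here, which is why the mistake is easy to miss until a column turns out to be absent from the dataframe.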
#3
(Sep-18-2021, 01:11 PM)snippsat Wrote: You have the same key name twice in the dictionary, so there is a collision and only one of them survives 🚑
It looks like a simple name change gives the wanted output; see the example NoteBook.

Oh, that was a typo. While reproducing the dataframes for people to try, I unintentionally wrote countryName twice. The second one should be provinceName, as the screenshots of whatIHave and whatINeed show.

But the problem is not related to that. I need to repeat the countryName, provinceName, productID, and productName values down the rows, and then add the corresponding dateID and productPrice values next to them, as in the second screenshot.

How can I achieve this?
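One possible approach, sketched under a few assumptions: the column currently labelled dateID really holds sourceOfData, the first row contains only sub-header labels and can be dropped, and pandas is version 1.1 or newer (for the ignore_index argument of melt). The names id_cols and tidy are made up for this sketch.

```python
import pandas as pd

# Frame as posted in the question; the first row holds sub-header labels.
what_i_have = {
    'countryName': ['', 'United States', 'Canada'],
    'provinceName': ['', 'New York', 'Ontario'],
    'productID': ['', '35', '55'],
    'productName': ['', 'Sugar', 'Corn'],
    'dateID': ['sourceOfData', 'CommissionAgent1', 'CommissionAgent1'],
    '2021-09-06': ['productPrice', '2.6$', '2.6$'],
    '2021-09-07': ['productPrice', '5.5$', '5.5$'],
    '2021-09-08': ['productPrice', '3.4$', '3.4$'],
}
df = pd.DataFrame(what_i_have)

# Columns to repeat on every output row (hypothetical helper name).
id_cols = ['countryName', 'provinceName', 'productID', 'productName', 'sourceOfData']

tidy = (
    df.rename(columns={'dateID': 'sourceOfData'})  # assumption: this column holds the source
      .iloc[1:]                                    # drop the sub-header row
      .melt(id_vars=id_cols, var_name='dateID',
            value_name='productPrice', ignore_index=False)
      .sort_index(kind='mergesort')                # stable sort: group rows by original row
      .reset_index(drop=True)
)
print(tidy)
```

Passing id_vars is what makes melt repeat the identifying columns for each date; without it, every column is unpivoted. Keeping the original index and sorting on it with a stable sort keeps each product's dates together in their original column order.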