![]() |
Structuring and pivoting corrupted dataframe in pandas - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: General Coding Help (https://python-forum.io/forum-8.html) +--- Thread: Structuring and pivoting corrupted dataframe in pandas (/thread-34947.html) |
Structuring and pivoting corrupted dataframe in pandas - gunner1905 - Sep-18-2021 I have a dataframe which I read from an excel file. The thing is first 4 columns and its values look good. But after 5th column data seems kind of corrupted. That is, the "dateID" values like "2021-09-06" became columns, "sourceOfData" column became ""values". And it looks like that: [attachment=1269] But i want my data to look like that: [attachment=1270] The thing only came to my mind is pivot or melt. I started doing something like this: df2 = df.melt(var_name='dateID', value_name='productPrice') df3 = df2.iloc[1:]in order to organize dates and prices, but I'm stuck. Hope I explained my needs. Thanks in advance. For those who want to reproduce my question and obtain dataframes, here is the code that consists of what i have and what i need. import pandas as pd whatIHave = {'countryName': ['','United States','Canada'], 'provinceName': ['','New York','Ontario'], 'productID': ['','35','55'], 'productName': ['', 'Sugar', 'Corn'], 'dateID': ['sourceOfData', 'CommissionAgent1', 'CommissionAgent1'], '2021-09-06': ['productPrice','2.6$','2.6$'], '2021-09-07': ['productPrice','5.5$','5.5$'], '2021-09-08': ['productPrice','3.4$','3.4$'] } df_whatIHave = pd.DataFrame(whatIHave, columns = ['countryName', 'provinceName', 'productID', 'productName', 'dateID', '2021-09-06', '2021-09-07', '2021-09-08']) print(df_whatIHave) whatINeed = {'countryName': ['United States','United States','United States', 'Canada', 'Canada', 'Canada'], 'provinceName': ['New York','New York','New York', 'Ontario', 'Ontario', 'Ontario'], 'productID': ['35','35','35', '55', '55', '55'], 'productName': ['Sugar', 'Sugar', 'Sugar', 'Corn', 'Corn', 'Corn'], 'sourceOfData': ['CommissionAgent1', 'CommissionAgent1', 'CommissionAgent1', 'CommissionAgent1', 'CommissionAgent1', 'CommissionAgent1'], 'dateID': ['2021-09-06', '2021-09-07', '2021-09-08', '2021-09-06', '2021-09-07', '2021-09-08'], 'productPrice': ['2.6$','5.5$','3.4$','2.6$','5.5$','3.4$'] } df_whatINeed = pd.DataFrame(whatINeed, columns = ['countryName', 'provinceName', 'productID', 'productName', 'sourceOfData', 'dateID', 'productPrice']) print(df_whatINeed) RE: Structuring and pivoting corrupted dataframe in pandas - snippsat - Sep-18-2021 You have same key name in dictionary then there will be a collisions where only one survive 🚑 Look like a simple name change will give output wanted example NoteBook RE: Structuring and pivoting corrupted dataframe in pandas - gunner1905 - Sep-18-2021 (Sep-18-2021, 01:11 PM)snippsat Wrote: You have same key name in dictionary then there will be a collisions where only one survive 🚑 Oh it was misspelling thing. While i was reproducing the dataframes for people to try it I unintentionally wrote countryName twice. So the right one would be provinceName as the screenshots of whatIHave and whatINeed describes. But the problem is not related to that. I need to replicate the countryName, provinceName, productID, productName column values through below. Then I need to add varied dateID and productPrice values next to them like the second screenshot. How can I achieve this? |