![]() |
How to filter out Column data From Multiple rows data? - Printable Version +- Python Forum (https://python-forum.io) +-- Forum: Python Coding (https://python-forum.io/forum-7.html) +--- Forum: General Coding Help (https://python-forum.io/forum-8.html) +--- Thread: How to filter out Column data From Multiple rows data? (/thread-32266.html) Pages:
1
2
|
How to filter out Column data From Multiple rows data? - firaki12345 - Jan-31-2021 Good Evening Hi everyone, so i got the following JSON file from Walmart regarding their product items and price. so i loaded up jupyter notebook, imported pandas and then loaded it into a Data frame with custom columns as shown in the pics below. ![]() ![]() now this is what i want to do: make new columns named as min price and max price and load the data into it how can i do that ? Here is the code in jupyter notebook for reference. i also want the offer price as some items don't have minprice and maxprice :) EDIT: Here is the code and Json file Json File: https://pastebin.com/sLGCFCDC Python Code: import json import pandas as pd with open("walmart.json") as f: data = json.load(f) walmart = data["items"] wdf = pd.DataFrame(walmart,columns=["productId","primaryOffer"]) print(wdf.loc[0,"primaryOffer"]) pd.set_option('display.max_colwidth', None) print(wdf) RE: How to filter out Column data From Multiple rows data? - eddywinch82 - Jan-31-2021 Hi firaki12345, Try this : I hope it works for you :- wdf.join(wdf['primaryOffer'].str.split(':', 1, expand=True).rename(columns={0:'minPrice', 1:'maxPrice', 2:'offerPrice}))I hope this works for you, let me know what output you get with that code, and will try to modify it if you don't get the result you want. Edit :- I tried it and it didn't work for me, so I will work on a solution for you, sorry about this. Best Regards Eddie Winch RE: How to filter out Column data From Multiple rows data? - firaki12345 - Feb-01-2021 (Jan-31-2021, 10:00 PM)eddywinch82 Wrote: Hi firaki12345, hi, thanks for the quick reply, actually i found last night a better solution we all can use and it was hiding in plain sight. json_normalize RE: How to filter out Column data From Multiple rows data? - eddywinch82 - Feb-01-2021 Hi firaki12345, Could you share the Python Code and output, of the solution, to what you wanted to achieve ? I am very interested in knowing, if that is okay with you ? Regards Eddie Winch ![]() RE: How to filter out Column data From Multiple rows data? - eddywinch82 - Feb-01-2021 Hi firaki12345, I have achieved what you wanted doing, using str.extract a regular expression, and a couple of other changes. Here is the finished Code, let me know if you would like any other changes doing :- import json import pandas as pd import re with open("walmart json file.txt") as f: data = json.load(f) walmart = data["items"] wdf = pd.DataFrame(walmart,columns=["productId","primaryOffer"]) print(wdf.loc[0,"primaryOffer"]) pd.set_option('display.max_colwidth', None) wdf['primaryOffer'] = wdf['primaryOffer'].astype(str) wdf['primaryOffer'] = wdf.primaryOffer.str.replace("'", '') wdf['primaryOffer'] = wdf.primaryOffer.str.replace(",", '') wdf['minPrice'] = wdf.primaryOffer.str.extract('.*minPrice:\s?(\d+[.]?\d*)', expand = False) wdf['maxPrice'] = wdf.primaryOffer.str.extract('.*maxPrice:\s?(\d+[.]?\d*)', expand = False) wdf['offerPrice'] = wdf.primaryOffer.str.extract('.*offerPrice:\s?(\d+[.]?\d*)', expand = False) wdf=wdf.fillna('--') wdfThe walmart json file.txt line of text, will need changing to what it is for you. Best Regards Eddie Winch ![]() RE: How to filter out Column data From Multiple rows data? - eddywinch82 - Feb-02-2021 Hi firaki12345, Here is a more cleaned up output version, of the previous Code :- I hope you like it )) import json import pandas as pd import re with open("walmart json file.txt") as f: data = json.load(f) walmart = data["items"] wdf = pd.DataFrame(walmart,columns=["productId","primaryOffer"]) print(wdf.loc[0,"primaryOffer"]) pd.set_option('display.max_colwidth', None) wdf['primaryOffer'] = wdf['primaryOffer'].astype(str) wdf['primaryOffer'] = wdf.primaryOffer.str.replace("'", '') wdf['primaryOffer'] = wdf.primaryOffer.str.replace(",", '') wdf['primaryOffer'] = wdf.primaryOffer.str.replace("{", '') wdf['offerId'] = wdf.primaryOffer.str.extract('.*offerId:\s?(\w+)', expand = False) wdf['minPrice(USD)'] = wdf.primaryOffer.str.extract('.*minPrice:\s?(\d+[.]?\d*)', expand = False) wdf['maxPrice(USD)'] = wdf.primaryOffer.str.extract('.*maxPrice:\s?(\d+[.]?\d*)', expand = False) wdf['offerPrice(USD)'] = wdf.primaryOffer.str.extract('.*offerPrice:\s?(\d+[.]?\d*)', expand = False) wdf=wdf.fillna('--') wdf.drop('primaryOffer', axis=1, inplace=True) wdfRegards Eddie Winch RE: How to filter out Column data From Multiple rows data? - firaki12345 - Feb-05-2021 (Feb-02-2021, 10:02 PM)eddywinch82 Wrote: Hi firaki12345, hi very very sorry for the late reply, but as i said i found an easier solution by using json_normalize as it turns all jumbled up data into columns. so it becomes easy to use in pandas :) RE: How to filter out Column data From Multiple rows data? - eddywinch82 - Feb-05-2021 No problem firaki12345, I enjoyed coming up with the solutions, to what you wanted doing, my pleasure. Regards Eddie Winch RE: How to filter out Column data From Multiple rows data? - buran - Feb-05-2021 cross-posted https://stackoverflow.com/questions/65977198/how-to-filter-out-column-data-from-multiple-rows-data this is a great example why cross-posting is bad and at least you should let us know about it. RE: How to filter out Column data From Multiple rows data? - firaki12345 - Feb-06-2021 (Feb-05-2021, 03:47 PM)buran Wrote: cross-posted https://stackoverflow.com/questions/65977198/how-to-filter-out-column-data-from-multiple-rows-data hi, i didn't cross post. i posted my question in web scraping thats it :) |