Posts: 7
Threads: 4
Joined: Jan 2021
Jan-31-2021, 02:11 PM
(This post was last modified: Jan-31-2021, 02:11 PM by firaki12345.)
Good Evening
Hi everyone, so i got the following JSON file from Walmart regarding their product items and price.
so i loaded up jupyter notebook, imported pandas and then loaded it into a Data frame with custom columns as shown in the pics below.
now this is what i want to do:
make new columns named as min price and max price and load the data into it
how can i do that ?
Here is the code in jupyter notebook for reference.
i also want the offer price as some items don't have minprice and maxprice :)
EDIT: Here is the code and Json file
Json File: https://pastebin.com/sLGCFCDC
Python Code:
import json
import pandas as pd
with open("walmart.json") as f:
data = json.load(f)
walmart = data["items"]
wdf = pd.DataFrame(walmart,columns=["productId","primaryOffer"])
print(wdf.loc[0,"primaryOffer"])
pd.set_option('display.max_colwidth', None)
print(wdf)
Posts: 218
Threads: 27
Joined: May 2018
Jan-31-2021, 10:00 PM
(This post was last modified: Jan-31-2021, 10:00 PM by eddywinch82.)
Hi firaki12345,
Try this :
I hope it works for you :-
wdf.join(wdf['primaryOffer'].str.split(':', 1, expand=True).rename(columns={0:'minPrice', 1:'maxPrice', 2:'offerPrice})) I hope this works for you, let me know what output you get with that code, and will try to modify it if you don't get the result you want.
Edit :- I tried it and it didn't work for me, so I will work on a solution for you, sorry about this.
Best Regards
Eddie Winch
firaki12345 likes this post
Posts: 7
Threads: 4
Joined: Jan 2021
(Jan-31-2021, 10:00 PM)eddywinch82 Wrote: Hi firaki12345,
Try this :
I hope it works for you :-
wdf.join(wdf['primaryOffer'].str.split(':', 1, expand=True).rename(columns={0:'minPrice', 1:'maxPrice', 2:'offerPrice})) I hope this works for you, let me know what output you get with that code, and will try to modify it if you don't get the result you want.
Edit :- I tried it and it didn't work for me, so I will work on a solution for you, sorry about this.
Best Regards
Eddie Winch
hi, thanks for the quick reply, actually i found last night a better solution we all can use and it was hiding in plain sight.
json_normalize
Posts: 218
Threads: 27
Joined: May 2018
Feb-01-2021, 02:38 PM
(This post was last modified: Feb-01-2021, 02:38 PM by eddywinch82.)
Hi firaki12345,
Could you share the Python Code and output, of the solution, to what you wanted to achieve ?
I am very interested in knowing, if that is okay with you ?
Regards
Eddie Winch
Posts: 218
Threads: 27
Joined: May 2018
Feb-01-2021, 10:39 PM
(This post was last modified: Feb-01-2021, 10:39 PM by eddywinch82.)
Hi firaki12345,
I have achieved what you wanted doing, using str.extract a regular expression, and a couple of other changes.
Here is the finished Code, let me know if you would like any other changes doing :-
import json
import pandas as pd
import re
with open("walmart json file.txt") as f:
data = json.load(f)
walmart = data["items"]
wdf = pd.DataFrame(walmart,columns=["productId","primaryOffer"])
print(wdf.loc[0,"primaryOffer"])
pd.set_option('display.max_colwidth', None)
wdf['primaryOffer'] = wdf['primaryOffer'].astype(str)
wdf['primaryOffer'] = wdf.primaryOffer.str.replace("'", '')
wdf['primaryOffer'] = wdf.primaryOffer.str.replace(",", '')
wdf['minPrice'] = wdf.primaryOffer.str.extract('.*minPrice:\s?(\d+[.]?\d*)', expand = False)
wdf['maxPrice'] = wdf.primaryOffer.str.extract('.*maxPrice:\s?(\d+[.]?\d*)', expand = False)
wdf['offerPrice'] = wdf.primaryOffer.str.extract('.*offerPrice:\s?(\d+[.]?\d*)', expand = False)
wdf=wdf.fillna('--')
wdf The walmart json file.txt line of text, will need changing to what it is for you.
Best Regards
Eddie Winch
Posts: 218
Threads: 27
Joined: May 2018
Feb-02-2021, 10:02 PM
(This post was last modified: Feb-02-2021, 10:02 PM by eddywinch82.)
Hi firaki12345,
Here is a more cleaned up output version, of the previous Code :-
I hope you like it ))
import json
import pandas as pd
import re
with open("walmart json file.txt") as f:
data = json.load(f)
walmart = data["items"]
wdf = pd.DataFrame(walmart,columns=["productId","primaryOffer"])
print(wdf.loc[0,"primaryOffer"])
pd.set_option('display.max_colwidth', None)
wdf['primaryOffer'] = wdf['primaryOffer'].astype(str)
wdf['primaryOffer'] = wdf.primaryOffer.str.replace("'", '')
wdf['primaryOffer'] = wdf.primaryOffer.str.replace(",", '')
wdf['primaryOffer'] = wdf.primaryOffer.str.replace("{", '')
wdf['offerId'] = wdf.primaryOffer.str.extract('.*offerId:\s?(\w+)', expand = False)
wdf['minPrice(USD)'] = wdf.primaryOffer.str.extract('.*minPrice:\s?(\d+[.]?\d*)', expand = False)
wdf['maxPrice(USD)'] = wdf.primaryOffer.str.extract('.*maxPrice:\s?(\d+[.]?\d*)', expand = False)
wdf['offerPrice(USD)'] = wdf.primaryOffer.str.extract('.*offerPrice:\s?(\d+[.]?\d*)', expand = False)
wdf=wdf.fillna('--')
wdf.drop('primaryOffer', axis=1, inplace=True)
wdf Regards
Eddie Winch
firaki12345 likes this post
Posts: 7
Threads: 4
Joined: Jan 2021
(Feb-02-2021, 10:02 PM)eddywinch82 Wrote: Hi firaki12345,
Here is a more cleaned up output version, of the previous Code :-
I hope you like it ))
import json
import pandas as pd
import re
with open("walmart json file.txt") as f:
data = json.load(f)
walmart = data["items"]
wdf = pd.DataFrame(walmart,columns=["productId","primaryOffer"])
print(wdf.loc[0,"primaryOffer"])
pd.set_option('display.max_colwidth', None)
wdf['primaryOffer'] = wdf['primaryOffer'].astype(str)
wdf['primaryOffer'] = wdf.primaryOffer.str.replace("'", '')
wdf['primaryOffer'] = wdf.primaryOffer.str.replace(",", '')
wdf['primaryOffer'] = wdf.primaryOffer.str.replace("{", '')
wdf['offerId'] = wdf.primaryOffer.str.extract('.*offerId:\s?(\w+)', expand = False)
wdf['minPrice(USD)'] = wdf.primaryOffer.str.extract('.*minPrice:\s?(\d+[.]?\d*)', expand = False)
wdf['maxPrice(USD)'] = wdf.primaryOffer.str.extract('.*maxPrice:\s?(\d+[.]?\d*)', expand = False)
wdf['offerPrice(USD)'] = wdf.primaryOffer.str.extract('.*offerPrice:\s?(\d+[.]?\d*)', expand = False)
wdf=wdf.fillna('--')
wdf.drop('primaryOffer', axis=1, inplace=True)
wdf Regards
Eddie Winch
hi very very sorry for the late reply, but as i said i found an easier solution by using json_normalize
as it turns all jumbled up data into columns. so it becomes easy to use in pandas :)
Posts: 218
Threads: 27
Joined: May 2018
No problem firaki12345,
I enjoyed coming up with the solutions, to what you wanted doing,
my pleasure.
Regards
Eddie Winch
Posts: 8,159
Threads: 160
Joined: Sep 2016
Feb-05-2021, 03:47 PM
(This post was last modified: Feb-05-2021, 03:48 PM by buran.)
cross-posted https://stackoverflow.com/questions/6597...-rows-data
this is a great example why cross-posting is bad and at least you should let us know about it.
Posts: 7
Threads: 4
Joined: Jan 2021
(Feb-05-2021, 03:47 PM)buran Wrote: cross-posted https://stackoverflow.com/questions/6597...-rows-data
this is a great example why cross-posting is bad and at least you should let us know about it.
hi, i didn't cross post. i posted my question in web scraping thats it :)
|