Python Forum
How to filter out Column data From Multiple rows data?
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
How to filter out Column data From Multiple rows data?
#1
Good Evening


Hi everyone, so i got the following JSON file from Walmart regarding their product items and price.


so i loaded up jupyter notebook, imported pandas and then loaded it into a Data frame with custom columns as shown in the pics below.


[Image: walmart1.png]

[Image: walmart2.png]


now this is what i want to do:

make new columns named as min price and max price and load the data into it

how can i do that ?


Here is the code in jupyter notebook for reference.

i also want the offer price as some items don't have minprice and maxprice :)


EDIT: Here is the code and Json file

Json File: https://pastebin.com/sLGCFCDC


Python Code:

import json
import pandas as pd
     
     
with open("walmart.json") as f:
 data = json.load(f)
     
walmart = data["items"]
     
     
wdf = pd.DataFrame(walmart,columns=["productId","primaryOffer"])
     
     
print(wdf.loc[0,"primaryOffer"])
     
     
pd.set_option('display.max_colwidth', None)
     
print(wdf)
Reply
#2
Hi firaki12345,

Try this :

I hope it works for you :-

wdf.join(wdf['primaryOffer'].str.split(':', 1, expand=True).rename(columns={0:'minPrice', 1:'maxPrice', 2:'offerPrice}))
I hope this works for you, let me know what output you get with that code, and will try to modify it if you don't get the result you want.

Edit :- I tried it and it didn't work for me, so I will work on a solution for you, sorry about this.

Best Regards

Eddie Winch
firaki12345 likes this post
Reply
#3
(Jan-31-2021, 10:00 PM)eddywinch82 Wrote: Hi firaki12345,

Try this :

I hope it works for you :-

wdf.join(wdf['primaryOffer'].str.split(':', 1, expand=True).rename(columns={0:'minPrice', 1:'maxPrice', 2:'offerPrice}))
I hope this works for you, let me know what output you get with that code, and will try to modify it if you don't get the result you want.

Edit :- I tried it and it didn't work for me, so I will work on a solution for you, sorry about this.

Best Regards

Eddie Winch

hi, thanks for the quick reply, actually i found last night a better solution we all can use and it was hiding in plain sight.

json_normalize
Reply
#4
Hi firaki12345,

Could you share the Python Code and output, of the solution, to what you wanted to achieve ?

I am very interested in knowing, if that is okay with you ?

Regards

Eddie Winch Smile
Reply
#5
Hi firaki12345,

I have achieved what you wanted doing, using str.extract a regular expression, and a couple of other changes.

Here is the finished Code, let me know if you would like any other changes doing :-

import json
import pandas as pd
import re     
      
with open("walmart json file.txt") as f:
 data = json.load(f)
      
walmart = data["items"]
      
      
wdf = pd.DataFrame(walmart,columns=["productId","primaryOffer"])
      
      
print(wdf.loc[0,"primaryOffer"])
      
      
pd.set_option('display.max_colwidth', None)
      

wdf['primaryOffer'] = wdf['primaryOffer'].astype(str)

wdf['primaryOffer'] = wdf.primaryOffer.str.replace("'", '')

wdf['primaryOffer'] = wdf.primaryOffer.str.replace(",", '')

wdf['minPrice'] = wdf.primaryOffer.str.extract('.*minPrice:\s?(\d+[.]?\d*)', expand = False)
wdf['maxPrice'] = wdf.primaryOffer.str.extract('.*maxPrice:\s?(\d+[.]?\d*)', expand = False)
wdf['offerPrice'] = wdf.primaryOffer.str.extract('.*offerPrice:\s?(\d+[.]?\d*)', expand = False)

wdf=wdf.fillna('--')

wdf
The walmart json file.txt line of text, will need changing to what it is for you.

Best Regards

Eddie Winch Smile
Reply
#6
Hi firaki12345,

Here is a more cleaned up output version, of the previous Code :-

I hope you like it ))

import json
import pandas as pd
import re     
      
with open("walmart json file.txt") as f:
 data = json.load(f)
      
walmart = data["items"]
      
      
wdf = pd.DataFrame(walmart,columns=["productId","primaryOffer"])
      
      
print(wdf.loc[0,"primaryOffer"])
      
      
pd.set_option('display.max_colwidth', None)
      

wdf['primaryOffer'] = wdf['primaryOffer'].astype(str)

wdf['primaryOffer'] = wdf.primaryOffer.str.replace("'", '')

wdf['primaryOffer'] = wdf.primaryOffer.str.replace(",", '')

wdf['primaryOffer'] = wdf.primaryOffer.str.replace("{", '') 


wdf['offerId'] = wdf.primaryOffer.str.extract('.*offerId:\s?(\w+)', expand = False)
wdf['minPrice(USD)'] = wdf.primaryOffer.str.extract('.*minPrice:\s?(\d+[.]?\d*)', expand = False)
wdf['maxPrice(USD)'] = wdf.primaryOffer.str.extract('.*maxPrice:\s?(\d+[.]?\d*)', expand = False)
wdf['offerPrice(USD)'] = wdf.primaryOffer.str.extract('.*offerPrice:\s?(\d+[.]?\d*)', expand = False)

wdf=wdf.fillna('--')

wdf.drop('primaryOffer', axis=1, inplace=True)

wdf
Regards

Eddie Winch
firaki12345 likes this post
Reply
#7
(Feb-02-2021, 10:02 PM)eddywinch82 Wrote: Hi firaki12345,

Here is a more cleaned up output version, of the previous Code :-

I hope you like it ))

import json
import pandas as pd
import re     
      
with open("walmart json file.txt") as f:
 data = json.load(f)
      
walmart = data["items"]
      
      
wdf = pd.DataFrame(walmart,columns=["productId","primaryOffer"])
      
      
print(wdf.loc[0,"primaryOffer"])
      
      
pd.set_option('display.max_colwidth', None)
      

wdf['primaryOffer'] = wdf['primaryOffer'].astype(str)

wdf['primaryOffer'] = wdf.primaryOffer.str.replace("'", '')

wdf['primaryOffer'] = wdf.primaryOffer.str.replace(",", '')

wdf['primaryOffer'] = wdf.primaryOffer.str.replace("{", '') 


wdf['offerId'] = wdf.primaryOffer.str.extract('.*offerId:\s?(\w+)', expand = False)
wdf['minPrice(USD)'] = wdf.primaryOffer.str.extract('.*minPrice:\s?(\d+[.]?\d*)', expand = False)
wdf['maxPrice(USD)'] = wdf.primaryOffer.str.extract('.*maxPrice:\s?(\d+[.]?\d*)', expand = False)
wdf['offerPrice(USD)'] = wdf.primaryOffer.str.extract('.*offerPrice:\s?(\d+[.]?\d*)', expand = False)

wdf=wdf.fillna('--')

wdf.drop('primaryOffer', axis=1, inplace=True)

wdf
Regards

Eddie Winch


hi very very sorry for the late reply, but as i said i found an easier solution by using json_normalize
as it turns all jumbled up data into columns. so it becomes easy to use in pandas :)
Reply
#8
No problem firaki12345,

I enjoyed coming up with the solutions, to what you wanted doing,
my pleasure.

Regards

Eddie Winch
Reply
#9
cross-posted https://stackoverflow.com/questions/6597...-rows-data

this is a great example why cross-posting is bad and at least you should let us know about it.
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#10
(Feb-05-2021, 03:47 PM)buran Wrote: cross-posted https://stackoverflow.com/questions/6597...-rows-data

this is a great example why cross-posting is bad and at least you should let us know about it.

hi, i didn't cross post. i posted my question in web scraping thats it :)
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  Help with to check an Input list data with a data read from an external source sacharyya 3 389 Mar-09-2024, 12:33 PM
Last Post: Pedroski55
  Filter data into new dataframe as main dataframe is being populated cubangt 8 988 Oct-23-2023, 12:43 AM
Last Post: cubangt
  Returning Column and Row Data From Spreadsheet knight2000 0 431 Oct-22-2023, 07:07 AM
Last Post: knight2000
  how do you style data frame that has empty rows. gsaray101 0 522 Sep-08-2023, 05:20 PM
Last Post: gsaray101
  Database that can compress a column, or all data, automatically? Calab 3 1,155 May-22-2023, 03:25 AM
Last Post: Calab
  Code for pullng all data in a column EmBeck87 5 1,099 Apr-03-2023, 03:43 PM
Last Post: deanhystad
  (Python) Pulling data from UA Google Analytics with more than 100k rows into csv. Stockers 0 1,204 Dec-19-2022, 11:11 PM
Last Post: Stockers
  How to properly format rows and columns in excel data from parsed .txt blocks jh67 7 1,866 Dec-12-2022, 08:22 PM
Last Post: jh67
  Write sql data or CSV Data into parquet file mg24 2 2,408 Sep-26-2022, 08:21 AM
Last Post: ibreeden
  How to assign a value to pandas dataframe column rows based on a condition klllmmm 0 822 Sep-08-2022, 06:32 AM
Last Post: klllmmm

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020