Sep-06-2017, 09:08 PM
(This post was last modified: Sep-06-2017, 09:08 PM by Prince_Bhatia.)
Hi,
I am trying to scrape the website "https://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=-1&IsNodeId=1&Description=GTX&bop=And&Page=1&PageSize=36&order=BESTMATCH".
I want to scrape each product's name, its price, and its image link.
I have had some success, but there is one problem: the name, price, and image values end up spread across every cell, so the formatting is very poor.
Can someone help me amend the code so that the name goes in the name column, the price in the price column, and the image in the image column?
from urllib.request import urlopen
from bs4 import BeautifulSoup

# First attempt (single page), kept commented out:
#page_url = "https://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=-1&IsNodeId=1&Description=GTX&bop=And&Page=1&PageSize=36&order=BESTMATCH"
#html = urlopen(page_url)
#bs0bj = BeautifulSoup(html, "html.parser")
#page_details = bs0bj.find_all("div", {"class":"item-container"})

f = open("Scrapedetails.csv", "w")
Headers = "Item_Name, Price, Image\n"
f.write(Headers)

#for i in page_details:
#    Item_Name = i.find("a", {"class":"item-title"})
#    Price = i.find("li", {"class":"price-current"})
#    Image = i.find("img")
#    Name_item = Item_Name.get_text()
#    Prin = Price.get_text()
#    imgf = Image["src"]  # to get the key src
#    f.write("{}".format(Name_item)+ ",{}".format(Prin)+ ",{}".format(imgf))
#f.close()

# Current attempt: loop over result pages 1 to 14
for page in range(1,15):
    page_url = "https://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=-1&IsNodeId=1&Description=GTX&bop=And&Page={}&PageSize=36&order=BESTMATCH".format(page)
    html = urlopen(page_url)
    bs0bj = BeautifulSoup(html, "html.parser")
    page_details = bs0bj.find_all("div", {"class":"item-container"})
    for i in page_details:
        Item_Name = i.find("a", {"class":"item-title"})
        Price = i.find("li", {"class":"price-current"})
        Image = i.find("img")
        Name_item = Item_Name.get_text()
        Prin = Price.get_text()
        imgf = Image["src"]  # to get the key src
        # Fields are joined with plain commas, with no quoting
        f.write("{}".format(Name_item)+ ",{}".format(Prin)+ ",{}".format(imgf)+ "\n")
f.close()

I am attaching the Excel file too. Also, what are the newer ways to save data to CSV? Can someone help me with that, with code too?
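
A likely cause of the shifted columns (this is an assumption, since the attached file is not shown here) is that the scraped names and prices can themselves contain commas and newlines, so joining fields with a plain "," splits one value across several cells. Python's standard csv module quotes such fields automatically. Below is a minimal sketch of that idea; the helper name rows_to_csv and the sample row are hypothetical and only stand in for the scraped values.

```python
import csv
import io


def rows_to_csv(rows, header=("Item_Name", "Price", "Image")):
    """Return CSV text; csv.writer quotes fields that contain commas or newlines."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf)
    writer.writerow(header)
    for name, price, image in rows:
        # strip() drops the surrounding whitespace/newlines get_text() may return
        writer.writerow([name.strip(), price.strip(), image.strip()])
    return buf.getvalue()


# Hypothetical sample row (not real scraped data): the name contains a comma
sample = [
    ("EVGA GeForce GTX 1080, 8GB GDDR5X", "\n$549.99\n", "//images.example.com/gtx1080.jpg"),
]

csv_text = rows_to_csv(sample)

# Reading the text back shows each row still has exactly three fields
parsed = list(csv.reader(io.StringIO(csv_text)))
print(parsed)
```

To write straight to a file instead of a string, the same csv.writer can wrap a file opened with open("Scrapedetails.csv", "w", newline="", encoding="utf-8"); passing newline="" is what the csv documentation recommends so that no extra blank lines appear on Windows.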