Python Forum
Error while scraping item price from online store
#1
Hi,

I am trying to scrape the price of an item from an online store. The code works with ebay.com, amazon.com and many other sites, but fails on some. I am using lxml with an XPath expression generated by SelectorGadget. The code is below.

import requests
from lxml import html

pagecontent = requests.get("https://www.myntra.com/watches/fossil/fossil-women-rose-gold-toned-dial-watch-es3352i/759168/buy")
tree = html.fromstring(pagecontent.content)
data = tree.xpath('//*[contains(concat( " ", @class, " " ), concat( " ", "pdp-price", " " ))]')
print(data[0].text);
Here is the error. It shows that data is an empty list, i.e. the XPath expression matched nothing. I would like to know how I can resolve this issue.
Error:
Traceback (most recent call last):
  File "scrape-test.py", line 7, in <module>
    print(data[0].text);
IndexError: list index out of range
Version information: Python 3.4.3

I appreciate the cooperation of forum members.
#2
Looks like the class you're looking for doesn't exist in the page source, but is generated by JavaScript.
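Also, your XPath expression itself is not the problem: padding @class with spaces and searching for ' pdp-price ' is the usual way to match a whole class name, so it would match if the element were present in the source.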

The data does exist inside the JavaScript variable window.__myx though, so you'll probably be able to work with that.
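
A rough sketch of pulling that variable out of the raw page source without a browser, using only the standard library. The SAMPLE_HTML string and the key names inside it (pdpData, price, discounted) are made up for illustration; the real payload is much larger and its structure has to be inspected first. The helper name extract_js_object is also hypothetical, and it assumes the value assigned to window.__myx is strict JSON, which may not hold for every page.

```python
import json
import re

# Hypothetical stand-in for the page source. In real use this would be
# requests.get(url).text; the key names below are invented for the example.
SAMPLE_HTML = """
<html><body>
<div id="mountRoot"></div>
<script>window.__myx = {"pdpData": {"name": "Example watch", "price": {"discounted": 6646}}};</script>
</body></html>
"""


def extract_js_object(page_source, variable="window.__myx"):
    """Return the JSON value assigned to `variable`, or None if not found."""
    # Locate "window.__myx =" and skip any whitespace after the equals sign.
    match = re.search(re.escape(variable) + r"\s*=\s*", page_source)
    if match is None:
        return None
    try:
        # raw_decode parses exactly one JSON value starting at the given
        # index and ignores what follows (the ";" and the closing tag).
        obj, _end = json.JSONDecoder().raw_decode(page_source, match.end())
    except ValueError:
        # The assigned value was not strict JSON.
        return None
    return obj


data = extract_js_object(SAMPLE_HTML)
print(data["pdpData"]["price"]["discounted"])
```

If the helper returns None, either the variable is not in the source or the assigned value is a JavaScript literal that is not valid JSON, and a real browser (or a JavaScript-aware parser) is the safer route.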
#3
As @stranac mentioned, the data is generated by JavaScript.
lxml alone cannot read it, so the simplest way is to use Selenium.
Selenium can use different drivers; here I use PhantomJS so that no browser window is opened.
I then pass browser.page_source, which contains the page after JavaScript has rendered it, to lxml.
Then you can use XPath to, e.g., extract the price.
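from selenium import webdriver
from lxml import html

# PhantomJS is headless, so no browser window opens.
# (Selenium 4 removed PhantomJS support; this needs Selenium 3.)
browser = webdriver.PhantomJS()
url = 'https://www.myntra.com/watches/fossil/fossil-women-rose-gold-toned-dial-watch-es3352i/759168/buy'
browser.get(url)
# page_source holds the DOM after JavaScript has run
tree = html.fromstring(browser.page_source)
browser.quit()  # shut down the PhantomJS process
# Absolute path to the price element; it breaks if the page layout changes
data = tree.xpath('//*[@id="mountRoot"]/div/div/main/div[2]/div[2]/div[1]/p[2]/strong')
print(data[0].text)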
Output:
Rs. 6646