Error while scraping item price from online store

mgtheboss · (This post was last modified: Jan-12-2018, 05:14 PM by mgtheboss.)

Hi,

I am trying to scrape the price of an item from an online store. The code works with ebay.com, amazon.com and many other sites, but fails on some. I am using lxml with an XPath obtained from SelectorGadget. The code is below.

import requests
from lxml import html

pagecontent=requests.get("https://www.myntra.com/watches/fossil/fossil-women-rose-gold-toned-dial-watch-es3352i/759168/buy")
tree = html.fromstring(pagecontent.content)
data=tree.xpath('//*[contains(concat( " ", @class, " " ), concat( " ", "pdp-price", " " ))]')
print(data[0].text);

Here is the error. The traceback shows that data is an empty list, so the XPath matched nothing. I would like to know how I can resolve this issue.

Error:
Traceback (most recent call last):
  File "scrape-test.py", line 7, in <module>
    print(data[0].text);
IndexError: list index out of range

Version information: python 3.4.3

I appreciate the cooperation of forum members.

***stranac*** · Jan-12-2018, 05:59 PM

Looks like the class you're looking for doesn't exist in the page source, but is generated by javascript.
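Also, your XPath expression pads @class with spaces and looks for ' pdp-price ', which is the usual way to match a single class name, so the expression itself is fine; there is simply no such element in the raw HTML.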

The data does exist inside the javascript variable window.__myx though, so you'll probably be able to work with that.
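A rough sketch of how that could look, using only the standard library. The sample page and the key names (pdpData, price, discounted) are made up for illustration; I have not checked what window.__myx really contains, so inspect the parsed object first and adjust the keys.

```python
import json
import re

# Hypothetical stand-in for the downloaded page source. The keys inside
# window.__myx are assumptions, not the site's real layout.
SAMPLE_PAGE = """
<html><body>
<div id="mountRoot"></div>
<script>window.__myx = {"pdpData": {"name": "Example watch", "price": {"discounted": 6646}}};</script>
</body></html>
"""


def extract_myx(page_source):
    """Return the object assigned to window.__myx, or None if it is absent."""
    match = re.search(r'window\.__myx\s*=\s*', page_source)
    if match is None:
        return None
    # raw_decode parses exactly one JSON value starting at the given index
    # and ignores whatever follows it (the trailing ';' and '</script>').
    data, _end = json.JSONDecoder().raw_decode(page_source, match.end())
    return data


data = extract_myx(SAMPLE_PAGE)
print(data["pdpData"]["price"]["discounted"])  # prints 6646 for the sample page
```

With the real page you would pass requests.get(url).text to extract_myx instead of SAMPLE_PAGE. This only works if the assigned value is valid JSON; if the site emits a JavaScript object literal that is not JSON, json will raise an error.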

***snippsat*** · (This post was last modified: Jan-12-2018, 06:43 PM by snippsat.)

As @stranac mentioned, the data is generated by JavaScript.
lxml alone cannot read it, so the simplest way is to use Selenium.
Selenium can use different drivers; here I use PhantomJS so that no browser window is opened.
I pass browser.page_source, which contains the HTML after the JavaScript has run, to lxml.
Then XPath can be used to pull out, e.g., the price.

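from selenium import webdriver
from lxml import html

# PhantomJS is headless, so no browser window opens
browser = webdriver.PhantomJS()
url = 'https://www.myntra.com/watches/fossil/fossil-women-rose-gold-toned-dial-watch-es3352i/759168/buy'
browser.get(url)
# page_source holds the HTML after the JavaScript has run
tree = html.fromstring(browser.page_source)
browser.quit()
# Absolute path to the price element; it breaks if the page layout changes
data = tree.xpath('//*[@id="mountRoot"]/div/div/main/div[2]/div[2]/div[1]/p[2]/strong')
print(data[0].text)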

Output:
Rs. 6646
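
If you need the price as a number rather than the string above, something like this should do. parse_price is just a helper name I made up, and it assumes whole-rupee prices, optionally with thousands separators; it does not handle decimals.

```python
import re


def parse_price(text):
    """Turn a price string such as 'Rs. 6646' into an int, or None if it has no digits."""
    # Start at the first digit so the '.' in 'Rs.' is skipped.
    match = re.search(r'\d[\d,]*', text)
    if match is None:
        return None
    # Drop thousands separators before converting.
    return int(match.group().replace(',', ''))


print(parse_price('Rs. 6646'))  # prints 6646
```

In the script above you would call parse_price(data[0].text) after checking that data is not empty.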
