xpath returns an empty list
#1
Hi, here are my two code snippets:

First:

page = requests.get('http://www.google.fr/')
tree = html.fromstring(page.content)
test1 = tree.xpath('/html/head/title/text()')
test2 = tree.xpath('//*[@id="fbar"]/div/div/text()')  # XPath copied from 'France' at the bottom-left corner of the page
print(test1)
print(test2)
Output:
['Google']
[]
Second:

parser = etree.HTMLParser()
html = etree.parse('http://www.google.fr/', parser)
result = html.xpath('/html/head/title/text()')
result2 = html.xpath('//*[@id="fbar"]/div/div/text()')  # XPath copied from 'France' at the bottom-left corner of the page
print(result)
print(result2)
Output:
['Google']
[]
Using Google Chrome, I right-click on 'France' and copy its XPath.
Why does it return an empty list when the title's XPath works fine?
As you can see, I tried two different ways to do it, but both behave the same.
Any solution? Thanks!
#2
Which libraries do I have to import to get your program to run?
#3
Sorry, I imported these:

from lxml import html
from lxml import etree
import urllib2
from urllib2 import urlopen
import requests
import lxml.html
#4
Sorry, I cannot get "google.fr" because I always get redirected to "google.de".
Reply
#5
The page is probably rendered with JavaScript, so you need something like Selenium/PhantomJS, or to reverse-engineer what happens.
That said, Google's start page is not easy to scrape, as it also has language detection and other advanced stuff,
so if this is a scraping exercise, find another page to use.
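You can first check whether the element is in the HTML you actually download. A quick sketch, reusing the same libraries as the first post:

import requests
from lxml import html

page = requests.get('http://www.google.fr/')
tree = html.fromstring(page.content)

# If this prints [], no element with id 'fbar' exists in the served HTML,
# so no XPath starting from it can ever match.
print(tree.xpath('//*[@id="fbar"]'))

# Second opinion: substring search on the raw bytes.
print(b'fbar' in page.content)

If that prints [] and False, the footer is built by JavaScript after the page loads, which is why DevTools shows it but lxml never sees it.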
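If you really need that element, here is a minimal sketch of the Selenium route (it assumes a Chrome driver is installed, and uses the Selenium 3 find_elements_by_xpath API):

from selenium import webdriver

# A real browser executes the JavaScript, so the DOM matches what DevTools shows.
driver = webdriver.Chrome()
driver.get('http://www.google.fr/')

# Same XPath as in the first post; it may still need adjusting for your locale.
elements = driver.find_elements_by_xpath('//*[@id="fbar"]/div/div')
print([e.text for e in elements])

driver.quit()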
#6
Ok, I will try.

