Python Forum

Hello,

I'm not very good at XPath and am a bit lost on the syntax to find an element based on the value of its first attribute and then grab the value of its second attribute in an HTML file:

<meta name="description" content="Blah"/>
<meta name="keywords" content="blah"/>
<meta name="classification" content="other"/>

description = root.find('./head/meta[@description]')
print(description.text)

Thank you.

--
Edit: Getting closer

description=root.xpath("//meta[@name='description' and @content]")
#BAD print(description.text) #'list' object has no attribute 'text'
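
For anyone hitting the same error, here is a minimal sketch of what is going on, assuming lxml and a small well-formed sample document (the `<head>` wrapper and the variable names are only for illustration). `xpath()` returns a list of elements, and the value wanted here sits in the `content` attribute rather than in the element's text:

```python
from lxml import etree

data = '''\
<html>
  <head>
    <meta name="description" content="Blah"/>
    <meta name="keywords" content="blah"/>
  </head>
</html>'''

root = etree.fromstring(data)

# xpath() returns a list of matching elements, even for a single match.
matches = root.xpath("//meta[@name='description' and @content]")

# The value lives in the "content" attribute, not in the element's text,
# so take the first match and read the attribute with .get().
description = matches[0].get("content") if matches else None
print(description)

# find() returns a single element (or None) and accepts the same predicate.
meta = root.find("./head/meta[@name='description']")
print(meta.get("content"))
```

The original `[@description]` predicate tests for an attribute named `description`, which is why it matched nothing; `[@name='description']` tests the value of the `name` attribute instead.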

You didn't say what method you are using to find the description.
With Selenium, use:

from selenium.webdriver.common.by import By
description = browser.find_element(by=By.XPATH, value="//meta[@name='description']").get_attribute("content")  # .text is empty for <meta>

Note: you may have to replace browser with 'driver' or whatever name you opened Selenium with.

I would not use ElementTree for parsing HTML; look at Web-Scraping part-1.

from bs4 import BeautifulSoup

data = '''\
<html>
  <meta name="description" content="Blah"/>
  <meta name="keywords" content="blah"/>
  <meta name="classification" content="other"/>
</html>'''

soup = BeautifulSoup(data, 'lxml')
tag = soup.find('meta', {'name': 'keywords'})

>>> tag
<meta content="blah" name="keywords"/>
>>> tag.attrs
{'content': 'blah', 'name': 'keywords'}
>>> tag.attrs.get('content')
'blah'
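
As a variation on the same idea, a CSS attribute selector can pick the tag by its `name` value directly. This is only a sketch: it assumes a BeautifulSoup version that provides `select_one()`, and it uses the built-in `html.parser` so nothing beyond bs4 is needed:

```python
from bs4 import BeautifulSoup

data = '''\
<html>
  <meta name="description" content="Blah"/>
  <meta name="keywords" content="blah"/>
  <meta name="classification" content="other"/>
</html>'''

# html.parser ships with Python; 'lxml' works here as well if installed.
soup = BeautifulSoup(data, 'html.parser')

# CSS attribute selector: the <meta> tag whose name attribute is "description".
tag = soup.select_one('meta[name="description"]')

# Tags support dict-style access to attributes; guard against no match.
content = tag['content'] if tag is not None else None
print(content)
```

`select_one()` returns None when nothing matches, so the guard avoids a TypeError on pages that lack the tag.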

If you want to use XPath, I would use lxml.

from lxml import etree

data = '''\
<html>
  <meta name="description" content="Blah"/>
  <meta name="keywords" content="blah"/>
  <meta name="classification" content="other"/>
</html>'''

tree = etree.fromstring(data)
tag = tree.xpath("//meta[@name='classification']/@content")
print(tag[0])

Output:
other
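
One caveat: `etree.fromstring()` expects well-formed XML, and real pages often leave `<meta>` tags unclosed. A possible sketch for that case, assuming `lxml.html` is acceptable and using made-up sample markup, collects every name/content pair into a dict:

```python
from lxml import html

# Deliberately sloppy markup: unclosed <meta> tags, as often seen in the wild.
data = '''\
<html>
  <head>
    <meta name="description" content="Blah">
    <meta name="keywords" content="blah">
    <meta name="classification" content="other">
  </head>
  <body><p>Hello</p></body>
</html>'''

# The HTML parser tolerates unclosed tags that the XML parser would reject.
tree = html.fromstring(data)

# Build a name -> content mapping for every <meta> carrying both attributes.
meta = {
    tag.get("name"): tag.get("content")
    for tag in tree.xpath("//meta[@name and @content]")
}
print(meta["description"])
```

The XPath expressions stay the same; only the parser changes.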

Thanks much. I forgot to say I'm actually using lxml, and the code above solved it.

Winfried

Larz60+

snippsat

Winfried