[Python 3] - Extract specific data from a web page using lxml module


Takeshio

Aug-24-2018, 02:13 AM

(Aug-23-2018, 07:01 PM)snippsat Wrote: Remove text() from the XPath; you can use .text from lxml instead.
You can also read the CSS class through .attrib.

from lxml import etree

# Simulate a web page
html = '''\
<html>
  <head>
    <title>foo</title>
  </head>
  <body>
    <tr>
      <td><span class="number blue">xx</span></td>
      <td>001</td>
      <td>002</td>
    </tr>
  </body>
</html>'''

tree = etree.fromstring(html)
span_tag = tree.xpath("//span[@class='number blue']")
print(span_tag[0].text)
print(span_tag[0].attrib.get('class'))

Output:
xx
number blue

Thanks for your reply. However, I want to get the two values (i.e. 001 and 002) inside the <td> tags. They sit in the same row as the span with class "number blue".

Any idea how to get these values neatly?
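
One possible approach, sketched here under the assumption that the wanted cells are always later siblings of the <td> that holds the span: select that <td> first, then walk the XPath following-sibling axis. The variable names cells and values are illustrative only, and the markup is the simulated page from the quoted post with the stray ">" after </tr> removed.

```python
from lxml import etree

# Simulated page from the quoted post (well-formed, so the XML parser accepts it)
html = '''\
<html>
  <head>
    <title>foo</title>
  </head>
  <body>
    <tr>
      <td><span class="number blue">xx</span></td>
      <td>001</td>
      <td>002</td>
    </tr>
  </body>
</html>'''

tree = etree.fromstring(html)

# Find the <td> that contains the span, then take every <td> after it in the same row.
# Assumption: the class attribute is exactly "number blue", as in the sample markup.
cells = tree.xpath("//td[span[@class='number blue']]/following-sibling::td")

values = [td.text for td in cells]
print(values)  # ['001', '002']
```

On a real page the class attribute may carry extra names or a different order, so the exact-match predicate may need loosening, for example with contains(@class, 'number').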
