Python Forum
[Python 3] - Extract specific data from a web page using lxml module
Hi guys,

I am trying to write Python 3 code (using the lxml module) to extract specific data from a web page.

A sample of the page's HTML is shown below.
______________________________________________________________
<tr>
<td><span class="number blue">xx</span></td>
<td>001</td>
<td>002</td>
</tr>
______________________________________________________________

My code:
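from lxml import html
import requests

# Fetch the page and parse the response body into an element tree
page = requests.get("http://some_website.aspx")
tree = html.fromstring(page.content)

# Text of every <span class="number blue"> on the page
var_1 = tree.xpath('//span[@class="number blue"]/text()')
print(var_1)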
______________________________________________________________

I am able to extract the first value (i.e. xx) and store it in "var_1". However, I also need to extract and store the values in the other <td> tags of the same row (i.e. 001 and 002), which follow the cell containing the "number blue" span.

I would appreciate any advice on this problem. Thank you.
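
For illustration, here is a minimal sketch of one possible approach, not a definitive answer. It assumes the real page has the same row structure as the sample above. The name extract_rows, the SAMPLE string, and the enclosing <table> tag are hypothetical and were added only so the snippet runs on its own without fetching a page. The idea is to select each <tr> that contains the span, then use the XPath following-sibling axis to collect the remaining <td> cells in that row.

```python
from lxml import html

# Sample markup from the post, wrapped in a <table> (an assumption) so it parses
# as a table row. On a real page this would be page.content from requests.get().
SAMPLE = """
<table>
<tr>
<td><span class="number blue">xx</span></td>
<td>001</td>
<td>002</td>
</tr>
</table>
"""


def extract_rows(markup):
    """Return one (label, values) pair per row containing a "number blue" span."""
    tree = html.fromstring(markup)
    rows = []
    # Pick every <tr> that has the span inside one of its direct <td> children.
    for tr in tree.xpath('//tr[td/span[@class="number blue"]]'):
        # Text of the span itself (the "xx" value in the sample).
        label = tr.xpath('td/span[@class="number blue"]/text()')[0]
        # Text of the <td> cells that come after the span's cell in this row.
        values = tr.xpath(
            'td[span[@class="number blue"]]/following-sibling::td/text()'
        )
        rows.append((label, values))
    return rows


print(extract_rows(SAMPLE))
```

For the sample row this should print [('xx', ['001', '002'])]. Note that @class="number blue" matches the attribute value exactly, as in the original code, so a span whose class attribute lists the names in a different order or adds others would not match.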
Messages In This Thread
[Python 3] - Extract specific data from a web page using lxml module - by Takeshio - Aug-23-2018, 05:20 PM
