Python Forum
Extract data from a table
Hi everyone, I am a novice with Python and am learning BeautifulSoup. I understand the basics (at least I think so) and have done some successful scraping (it's fun).
However, when trying to get the table from this site '!/instrument/PCIB.OSE/orderdepth' there's no way I can get the tags/properties right in the soup.find_all command. For several days I have been wrestling with this page with no luck. Any ideas what tag properties I should look for?

from lxml import html
import requests
url = '!/instrument/PCIB.OSE/orderdepth'

response = requests.get(url)
tree = html.fromstring(response.content)
#lxml_soup = tree.xpath('/html/head/title/text()')[0]
lxml_soup = tree.xpath('//*[@id="mainContainer"]/div[2]/ui-view/div/div/div/div/orderdepth/table/tbody/tr[1]/td[1]')[0]
I just tried the XPath in lxml, to no avail...
The table loads asynchronously after the page itself has loaded, so the data is not in the HTML that requests receives. I also can't see any request in the browser's network tab where the page fetches this data. Other tools can solve this problem, though.
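To check this diagnosis yourself, you can count the cells you are after in the HTML you actually received. The sketch below uses only the standard library and two made-up HTML strings (both are assumptions, not copies of the real page): one looks like the empty shell a plain HTTP fetch might return, the other like the DOM after the JavaScript has filled in the table. The class name "number" is the one used later in this thread.

```python
from html.parser import HTMLParser


class NumberCellCounter(HTMLParser):
    """Counts <td> start tags whose class attribute includes "number"."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None
        classes = (dict(attrs).get("class") or "").split()
        if tag == "td" and "number" in classes:
            self.count += 1


def count_number_cells(html_text):
    """Return how many <td class="number"> cells the given HTML contains."""
    parser = NumberCellCounter()
    parser.feed(html_text)
    parser.close()
    return parser.count


# Hypothetical markup: the shell a plain HTTP fetch might return
static_html = (
    '<div id="mainContainer"><ui-view><orderdepth></orderdepth></ui-view></div>'
)

# Hypothetical markup: the same page after the script has rendered the table
rendered_html = (
    '<div id="mainContainer"><ui-view><orderdepth><table><tbody>'
    '<tr><td class="number">10.50</td><td class="number">200</td></tr>'
    '</tbody></table></orderdepth></ui-view></div>'
)

print(count_number_cells(static_html))    # 0
print(count_number_cells(rendered_html))  # 2
```

If the count is zero for the text you downloaded but the cells are visible in the browser's inspector, the table is being built client-side and no XPath or find_all call on the downloaded HTML will find it.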
You can use Selenium to load your content. Check out our web scraping tutorials on how to use this.
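from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from bs4 import BeautifulSoup

url = '!/instrument/PCIB.OSE/orderdepth'  # URL as given (truncated) in the first post

options = Options()
driver = webdriver.Firefox(options=options)
driver.get(url)

# Wait up to 20 seconds until at least one element with class "number" is present
WebDriverWait(driver, 20).until(lambda d: d.find_elements(By.CLASS_NAME, "number"))
soup = BeautifulSoup(driver.page_source, 'html.parser')  # parse the rendered DOM
driver.quit()
data = [price.text for price in soup.find_all('td', {'class': 'number'})]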
The problem was the time needed to load this dynamic page completely.
That's where WebDriverWait came in handy.
In my case, chromedriver for 85 was useless. Both Firefox and Unix did the job.
Maybe this solution helps someone.
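For anyone wondering what the wait is doing: the idea is to call a condition repeatedly until it returns something truthy or a timeout expires. Below is a minimal standard-library sketch of that polling idea. It is not Selenium's implementation; wait_until and FakePage are hypothetical names made up for the illustration, and FakePage only imitates a page whose cells show up on the third lookup.

```python
import time


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() until it returns a truthy value or timeout seconds pass.

    Returns the truthy value, or raises TimeoutError if it never appears.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


class FakePage:
    """Hypothetical stand-in for a page whose table appears after a delay."""

    def __init__(self, ready_after):
        self.calls = 0
        self.ready_after = ready_after

    def find_number_cells(self):
        # Returns an empty list (falsy) until the page is "ready"
        self.calls += 1
        if self.calls >= self.ready_after:
            return ["10.50", "200"]
        return []


page = FakePage(ready_after=3)
cells = wait_until(page.find_number_cells, timeout=2.0, poll_interval=0.01)
print(cells)       # ['10.50', '200']
print(page.calls)  # 3
```

An empty list is falsy, which is why a lambda that returns the list of matching elements works as a wait condition: the wait keeps polling until the list is non-empty.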
