Python Forum
How to get a table element
#1
Hi

How can I get the table with the results?

import requests
from bs4 import BeautifulSoup

page = requests.get('https://chess24.com/en/watch/live-tournaments/world-rapid-championship-2019/4/1/5')

if page.status_code == requests.codes.ok:
    bs = BeautifulSoup(page.text, 'lxml')
    tabela = bs.find('table', {'class':'items'})
    print(tabela)
Reply
#2
You cannot get anything this way from this site; this is a standard problem with pages that render their content with JavaScript.
Look at Web-scraping part-2.

As we have an okay player in my country, here is a Notebook that does most of this task, in this example getting the standings table.
I also bring in Pandas, which makes getting the table easier.
When using a Notebook in JupyterLab, the table view gets a lot nicer.
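As a minimal sketch of the Pandas idea (on made-up inline HTML rather than the chess24 page, since the live page needs JavaScript rendering first; `pd.read_html` needs lxml or a similar parser installed):

```python
from io import StringIO

import pandas as pd

# pd.read_html() finds every <table> in an HTML document and returns a
# list of DataFrames -- no manual row/cell parsing needed.
# The HTML below is a made-up stand-in for the real standings table.
html = """
<table class="items">
  <tr><th>Rank</th><th>Name</th><th>Score</th></tr>
  <tr><td>1</td><td>Carlsen, Magnus</td><td>8/10</td></tr>
  <tr><td>2</td><td>Wang, Hao</td><td>7½/10</td></tr>
</table>
"""
tables = pd.read_html(StringIO(html))  # one DataFrame per <table>
standings = tables[0]
print(standings)
```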
Reply
#3
Hi, I found a solution, but how do I get each row as a table?
from selenium import webdriver

url = 'https://chess24.com/en/watch/live-tournaments/world-rapid-championship-2019/4/1/5'

driver = webdriver.Firefox()
driver.get(url)

parent_element = driver.find_element_by_css_selector('#tabTournamentGamesworld-rapid-championship-2019 > div.tournamentStandings.tournamentDataContainer > div > div.gridView.tournamentTable.nativeScroll > div > div > table')

# find all <tr> row children of the parent element
child = parent_element.find_elements_by_css_selector('tr')
lin = []
for i in child:
    lin.append(i.text)
    #print(i.text)

print(lin)
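To split each row into a list of cells instead of one text blob, parse the cells per row. A sketch on made-up stand-in HTML; the same loop works on the string `driver.page_source` returns for the real page:

```python
from bs4 import BeautifulSoup

# Stand-in HTML for the standings table; replace with driver.page_source
# for the live page.
html = """
<table class="items">
  <tr><td>1</td><td>Carlsen, Magnus</td><td>8/10</td></tr>
  <tr><td>2</td><td>Wang, Hao</td><td>7½/10</td></tr>
</table>
"""
soup = BeautifulSoup(html, 'html.parser')
table = soup.find('table', {'class': 'items'})
rows = [[td.get_text() for td in tr.find_all('td')]
        for tr in table.find_all('tr')]
print(rows)  # one inner list of cell strings per table row
```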
Reply
#4
Running my code outside of a Notebook.
from selenium import webdriver
from bs4 import BeautifulSoup
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
import pandas as pd
import time

#--| Setup
options = Options()
#options.add_argument("--headless")
browser = webdriver.Chrome(executable_path=r'chromedriver.exe', options=options)

#--| Parse or automation
browser.get('https://chess24.com/en/watch/live-tournaments/world-rapid-championship-2019/4/1/5')
time.sleep(4)  # give the JavaScript-rendered table time to load
soup = BeautifulSoup(browser.page_source, 'lxml')
title = soup.select('h2.title')
print(title[0].text)
print('-'*50)

# Get table
df = pd.read_html(browser.page_source, header=None)
standings = df[2]
standings.columns = ["Rank", "Name", "Score", "Rating"]
print(standings.head(10))
Output:
FIDE World Rapid Championship
--------------------------------------------------
   Rank                      Name  Score  Rating
0     1           Carlsen, Magnus   8/10  2886.0
1     2                 Wang, Hao  7½/10  2748.0
2     3       Duda, Jan-Krzysztof  7½/10  2751.0
3     4   Vachier-Lagrave, Maxime  7½/10  2873.0
4     5    Mamedyarov, Shakhriyar   7/10  2752.0
5     6            Le, Quang Liem   7/10  2740.0
6     7       Nepomniachtchi, Ian   7/10  2745.0
7     8  Dominguez Perez, Leinier   7/10  2755.0
8     9           Guseinov, Gadir   7/10  2691.0
9    10          Nakamura, Hikaru   7/10  2819.0
zinho Wrote:Hi, I found a solution, but how do I get each row as a table?
It's a lot more work to extract a table with your own scraping; I have done it many times in the past.
Now I mostly use Pandas for getting tables; as you can see, it makes the task a lot easier.
You get a correctly formatted table back both in a Notebook and, as shown above, from the command line.
# We have an okay player in my country
print(standings.loc[[0]])
Output:
   Rank             Name Score  Rating
0     1  Carlsen, Magnus  8/10  2886.0
Reply
#5
Hi snippsat

Perfect, works like a charm.

Thank you!!
Reply
#6
Now that the Rapid Championship is finished, I can show the table in a Notebook.
[Image: mycva4.png]
The code to add a Unicode emoji is this:
# We have an okay player in my country
print('-'*50)
champ = standings.loc[[0]].Name
champ = champ.to_string()
champ = ' '.join(champ.split()[-2:])
print(f'{champ.upper():\N{sports medal}^29}')
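What that format spec does: the text between the colon and the closing brace is `<fill><align><width>`; the `\N{sports medal}` escape supplies the fill character, `^` centers the text, and 29 is the total field width. A standalone sketch with a plain fill character first:

```python
# <fill><align><width>: '*' is the fill, '^' centers, 6 is the width.
print(f'{"ok":*^6}')  # **ok**

# The \N{...} escape is resolved when the literal is parsed, so an
# emoji can serve as the fill character too.
banner = f'{"CARLSEN, MAGNUS":\N{sports medal}^29}'
print(banner)
```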
Reply

