Python Forum

Full Version: Help Scraping links and table from link
(Oct-11-2023, 07:08 AM)cartonics Wrote: my first idea was to take only the tags with the classes "widget-results__team-name match-name" and "btn btn-primary btn-xs"

is there something to achieve that?
>>> tag = tr[2].find(class_="widget-results__team-name match-name")
>>> tag
<span class="widget-results__team-name match-name" data-original-title="Verona" data-placement="bottom" data-toggle="tooltip">Verona</span>
>>> tag.text
'Verona'

>>> tag.attrs
{'class': ['widget-results__team-name', 'match-name'],
 'data-original-title': 'Verona',
 'data-placement': 'bottom',
 'data-toggle': 'tooltip'}
>>> tag.get('data-original-title')
'Verona'
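The same lookup can be repeated for every row. Below is a minimal sketch, assuming each match row holds two team-name spans and one link with the btn classes. The HTML string, the second team name, and the href are made-up stand-ins, not copied from the real page. Note that passing a string with a space to class_ matches only when the class attribute is exactly that string, in that order.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the tag shown above; the real page may differ.
html = """
<table>
  <tr>
    <td><span class="widget-results__team-name match-name">Verona</span></td>
    <td><a class="btn btn-primary btn-xs" href="/stats/1">STATS</a></td>
    <td><span class="widget-results__team-name match-name">Napoli</span></td>
  </tr>
  <tr><td>row without a match</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

matches = []
for row in soup.find_all("tr"):
    # Exact class-string match, same style as the find() call above.
    teams = row.find_all("span", class_="widget-results__team-name match-name")
    button = row.find("a", class_="btn btn-primary btn-xs")
    if len(teams) != 2 or button is None:
        continue  # skip rows that are not match rows
    home, away = (t.get_text(strip=True) for t in teams)
    matches.append((home, away, button.get("href")))

print(matches)
```

Rows that lack either the two spans or the button are skipped, so title or separator rows do not end up in the result.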
(Oct-11-2023, 07:08 AM)cartonics Wrote: if there is more than one table in the link, for example here:
https://www.sbostats.com/soccer/league/i...-c-group-c
is it possible to scrape only the second one?
It's only one table, with a title row as separator, so you can go in and take out single values.
from bs4 import BeautifulSoup
import requests

parser = 'lxml'  # preferred; or 'html.parser' (built in) or 'html5lib', if installed
resp = requests.get("https://www.sbostats.com/soccer/league/italy/serie-c-group-c")
soup = BeautifulSoup(resp.content, parser)
Test the code above:
>>> soup.select_one('tr:nth-child(7)').text
'  ACR Messina   STATS        Casertana      2.38   3.00   2.90  '
>>> soup.select_one('tr:nth-child(10)').text
'  Monterosi   STATS        Audace Cerignola      2.63   3.10   2.50  '
So here, use a CSS selector and take out only the wanted tags.
Take a look at Web-Scraping part-1.
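A rough sketch of how the "one table with title rows" layout could be split into sections with CSS selectors, keeping only the second section. Everything about the markup here is an assumption: that title rows use a th cell, that team names sit in span.match-name, and that the STATS link carries the btn classes. The section titles and hrefs are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one table, with <th> title rows acting as separators.
html = """
<table>
  <tr><th colspan="3">First section</th></tr>
  <tr>
    <td><span class="match-name">ACR Messina</span></td>
    <td><a class="btn btn-primary btn-xs" href="/m/1">STATS</a></td>
    <td><span class="match-name">Casertana</span></td>
  </tr>
  <tr><th colspan="3">Second section</th></tr>
  <tr>
    <td><span class="match-name">Monterosi</span></td>
    <td><a class="btn btn-primary btn-xs" href="/m/2">STATS</a></td>
    <td><span class="match-name">Audace Cerignola</span></td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

sections = {}   # title text -> list of (home, away, href)
current = None
for row in soup.select("tr"):
    title = row.select_one("th")
    if title is not None:
        # A title row starts a new section.
        current = title.get_text(strip=True)
        sections[current] = []
        continue
    if current is None:
        continue  # rows before the first title are ignored
    names = [s.get_text(strip=True) for s in row.select("span.match-name")]
    link = row.select_one("a.btn.btn-primary.btn-xs")
    if len(names) == 2 and link is not None:
        sections[current].append((names[0], names[1], link["href"]))

# Dicts keep insertion order, so index 1 is the second section.
second = list(sections.values())[1]
print(second)
```

Unlike class_="a b", the selector a.btn.btn-primary.btn-xs matches regardless of the order of the classes in the attribute.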
Quote: So here, use a CSS selector and take out only the wanted tags.
Take a look at Web-Scraping part-1.

Awesome link with examples. Part 2 also has a lot of things to study!