Help Scraping links and table from link

***snippsat*** · (This post was last modified: Oct-10-2023, 03:44 PM by snippsat.)

(Oct-10-2023, 01:32 PM)cartonics Wrote: A stupid question... why, if the source code of the link contains

serie-a&quote
does scraping turn it into
=serie-a"e

Is it an encoding problem?

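Yes, and the reason is your code 😉
The "e appears because the &quot at the start of &quote gets decoded as the HTML entity for a double quote.
Remove the encoding handling you start with and use lxml as the parser; then the links will work.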
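
To see where the stray quote comes from, here is a minimal sketch using only the standard library. It assumes the original script went through a parser that decodes legacy entities without a trailing semicolon, the way html.unescape does; the sample_href and escaped_href strings are made up for illustration and are not copied from the site.

```python
import html

# Hypothetical href fragment, written the way it appears in the page source
sample_href = "league=serie-a&quote=1.50"

# "&quot" is a legacy entity that is accepted without a trailing semicolon,
# so the start of "&quote" is decoded as a double quote and the "e" is left over
decoded = html.unescape(sample_href)
print(decoded)

# When the ampersand is escaped as &amp; in the HTML, the parameter survives
escaped_href = "league=serie-a&amp;quote=1.50"
print(html.unescape(escaped_href))
```

If the first print shows a double quote where &quot used to be, the damage happens at entity decoding, not at the byte-encoding level.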

from bs4 import BeautifulSoup
import requests

parser = 'lxml'  # lxml must be installed (pip install lxml)
resp = requests.get("https://www.sbostats.com/soccer/league/italy/serie-a")
# Pass the raw bytes and let BeautifulSoup work out the decoding
soup = BeautifulSoup(resp.content, parser)

table = soup.find_all('table', attrs={'class': 'updated_next_results_table'})
table = table[0]
tr = table.find_all('tr')
base_url = 'https://www.sbostats.com'
with open('matches.txt', 'a') as fp:
    for row in tr:
        link = row.find('a')
        if link is None:
            # Rows without a stats link (e.g. header rows) are skipped
            continue
        # First three tokens of the row text, e.g. "Verona - Napoli"
        teams = ' '.join(row.text.replace('STATS', '-').split()[:3])
        fp.write(f"{teams}\n")
        fp.write(f"{base_url}{link['href']}\n\n")

Output:
Verona - Napoli
https://www.sbostats.com/soccer/stats?country=italy&league=serie-a&quote=1.50&direction=away&id=NDAxMTg3OA==

Torino - Inter
https://www.sbostats.com/soccer/stats?country=italy&league=serie-a&quote=1.83&direction=away&id=NDAxMTg3OQ==

Sassuolo - Lazio
https://www.sbostats.com/soccer/stats?country=italy&league=serie-a&quote=2.30&direction=away&id=NDAxMTg4MA==
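
A quick way to check that a scraped link kept its quote parameter is to parse the query string. The sketch below uses only the standard library, swaps plain string concatenation for urljoin, and assumes the href is relative and starts with /soccer, as the output suggests; the href value is assembled from the first output line for illustration.

```python
from urllib.parse import urljoin, urlsplit, parse_qs

base_url = "https://www.sbostats.com"
# Assumed relative href, in the shape shown in the output (id taken from it)
href = "/soccer/stats?country=italy&league=serie-a&quote=1.50&direction=away&id=NDAxMTg3OA=="

# urljoin resolves the relative href against the site root
full_url = urljoin(base_url, href)

# parse_qs maps each query parameter to a list of its values
params = parse_qs(urlsplit(full_url).query)

print(full_url)
print(params["quote"])
```

If the entity problem were still present, there would be no quote key in params, so this makes a cheap sanity check before writing links to matches.txt.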
