Python Forum

Full Version: oddspedia charts
from this website
for example from this event:

import requests
from bs4 import BeautifulSoup

url = ""

# Send a GET request to the web page
response = requests.get(url)

# Check whether the request succeeded
if response.status_code == 200:
    # Parse the page HTML with BeautifulSoup
    soup = BeautifulSoup(response.text, 'html.parser')

    # Find all div elements with the class "event-stats-scoring-minutes__list__item"
    items = soup.find_all('div', class_='event-stats-scoring-minutes__list__item')

    for item in items:
        label = item.find('span', class_='event-stats-scoring-minutes__list__label').get_text(strip=True)
        print(f"{label}: {item.find('div', class_='event-stats-progress-bar-vertical__lines__full')['style']}")
else:
    print(f"Request failed. Status code: {response.status_code}")
Is it possible to scrape the values of the chart?

Inspecting the page in the browser, this is the part that I need:

Quote:<div class="event-stats-scoring-minutes__list"><div class="event-stats-scoring-minutes__list__item"><div class="event-stats-scoring-minutes-progressbar-wrapper"><div class="event-stats-progress-bar-vertical"><div class="event-stats-progress-bar-vertical__lines"><div class="event-stats-progress-bar-vertical__lines__empty" style="height: 82%; background-color: rgb(225, 231, 237);"></div> <div class="event-stats-progress-bar-vertical__lines__full" style="height: 18%; background-color: rgb(0, 177, 255);"></div></div></div> <div class="event-stats-progress-bar-vertical"><div class="event-stats-progress-bar-vertical__lines"><div class="event-stats-progress-bar-vertical__lines__empty" style="height: 96%; background-color: rgb(225, 231, 237);"></div> <div class="event-stats-progress-bar-vertical__lines__full" style="height: 4%; background-color: rgb(26, 49, 80);"></div></div></div></div> <span class="event-stats-scoring-minutes__list__label">
</span></div><div class="event-stats-scoring-minutes__list__item"><div class="event-stats-scoring-minutes-progressbar-wrapper"><div class="event-stats-progress-bar-vertical"><div class="event-stats-progress-bar-vertical__lines"><div class="event-stats-progress-bar-vertical__lines__empty" style="height: 77%; background-color: rgb(225, 231, 237);"></div> <div class="event-stats-progress-bar-vertical__lines__full" style="height: 23%; background-color: rgb(0, 177, 255);"></div></div></div> <div class="event-stats-progress-bar-vertical"><div class="event-stats-progress-bar-vertical__lines"><div class="event-stats-progress-bar-vertical__lines__empty" style="height: 81%; background-color: rgb(225, 231, 237);"></div> <div class="event-stats-progress-bar-vertical__lines__full" style="height: 19%; background-color: rgb(26, 49, 80);"></div></div></div></div> <span class="event-stats-scoring-minutes__list__label">
</span></div><div class="event-stats-scoring-minutes__list__item"><div class="event-stats-scoring-minutes-progressbar-wrapper"><div class="event-stats-progress-bar-vertical"><div class="event-stats-progress-bar-vertical__lines"><div class="event-stats-progress-bar-vertical__lines__empty" style="height: 95%; background-color: rgb(225, 231, 237);"></div> <div class="event-stats-progress-bar-vertical__lines__full" style="height: 5%; background-color: rgb(0, 177, 255);"></div></div></div> <div class="event-stats-progress-bar-vertical"><div class="event-stats-progress-bar-vertical__lines"><div class="event-stats-progress-bar-vertical__lines__empty" style="height: 77%; background-color: rgb(225, 231, 237);"></div> <div class="event-stats-progress-bar-vertical__lines__full" style="height: 23%; background-color: rgb(26, 49, 80);"></div></div></div></div> <span class="event-stats-scoring-minutes__list__label">
</span></div><div class="event-stats-scoring-minutes__list__item"><div class="event-stats-scoring-minutes-progressbar-wrapper"><div class="event-stats-progress-bar-vertical"><div class="event-stats-progress-bar-vertical__lines"><div class="event-stats-progress-bar-vertical__lines__empty" style="height: 86%; background-color: rgb(225, 231, 237);"></div> <div class="event-stats-progress-bar-vertical__lines__full" style="height: 14%; background-color: rgb(0, 177, 255);"></div></div></div> <div class="event-stats-progress-bar-vertical"><div class="event-stats-progress-bar-vertical__lines"><div class="event-stats-progress-bar-vertical__lines__empty" style="height: 88%; background-color: rgb(225, 231, 237);"></div> <div class="event-stats-progress-bar-vertical__lines__full" style="height: 12%; background-color: rgb(26, 49, 80);"></div></div></div></div> <span class="event-stats-scoring-minutes__list__label">
</span></div><div class="event-stats-scoring-minutes__list__item"><div class="event-stats-scoring-minutes-progressbar-wrapper"><div class="event-stats-progress-bar-vertical"><div class="event-stats-progress-bar-vertical__lines"><div class="event-stats-progress-bar-vertical__lines__empty" style="height: 82%; background-color: rgb(225, 231, 237);"></div> <div class="event-stats-progress-bar-vertical__lines__full" style="height: 18%; background-color: rgb(0, 177, 255);"></div></div></div> <div class="event-stats-progress-bar-vertical"><div class="event-stats-progress-bar-vertical__lines"><div class="event-stats-progress-bar-vertical__lines__empty" style="height: 88%; background-color: rgb(225, 231, 237);"></div> <div class="event-stats-progress-bar-vertical__lines__full" style="height: 12%; background-color: rgb(26, 49, 80);"></div></div></div></div> <span class="event-stats-scoring-minutes__list__label">
</span></div><div class="event-stats-scoring-minutes__list__item"><div class="event-stats-scoring-minutes-progressbar-wrapper"><div class="event-stats-progress-bar-vertical"><div class="event-stats-progress-bar-vertical__lines"><div class="event-stats-progress-bar-vertical__lines__empty" style="height: 77%; background-color: rgb(225, 231, 237);"></div> <div class="event-stats-progress-bar-vertical__lines__full" style="height: 23%; background-color: rgb(0, 177, 255);"></div></div></div> <div class="event-stats-progress-bar-vertical"><div class="event-stats-progress-bar-vertical__lines"><div class="event-stats-progress-bar-vertical__lines__empty" style="height: 69%; background-color: rgb(225, 231, 237);"></div> <div class="event-stats-progress-bar-vertical__lines__full" style="height: 31%; background-color: rgb(26, 49, 80);"></div></div></div></div> <span class="event-stats-scoring-minutes__list__label">

Request failed. Status code: 520
Add a User-Agent.
import requests
from bs4 import BeautifulSoup
import re

url = ""
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/ Safari/537.36'}
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')
stat = soup.select_one('div.event-stats-scoring-minutes__list')
tag = stat.text
print(re.findall(r'.(\d-\d{2})', tag))
['0-15', '6-30', '1-45', '6-60', '1-75', '6-90']
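The truncated labels such as '6-30' come from the pattern itself: the leading `.` consumes one character and `\d` then allows only a single digit, so the first digit of a two-digit start minute is swallowed. A minimal sketch of the effect, assuming the label text looks roughly like the hypothetical `sample_text` below (the real `stat.text` may use different spacing):

```python
import re

# Hypothetical label text; the spacing on the real page may differ.
sample_text = " 0-15' 16-30' 31-45' 46-60' 61-75' 76-90'"

# Pattern from the post above: "." eats one character, "\d" allows one digit only.
original = re.findall(r'.(\d-\d{2})', sample_text)

# Allowing one or two leading digits keeps the full range label.
full_labels = re.findall(r'\d{1,2}-\d{2}', sample_text)

print(original)
print(full_labels)
```

If the full labels are wanted, `\d{1,2}-\d{2}` is one possible adjustment; it still returns only the minute ranges, not the bar heights.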
Thanks, but my main goal is to obtain the chart values for those minutes: ['0-15', '0-30', '1-45', '6-60', '1-75', '0-90']

[Image: Immagine.png]

import requests
from bs4 import BeautifulSoup
import re

url = ""
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/ Safari/537.36'}
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')

# Find all div elements with the class "event-stats-scoring-minutes__list__item"
items = soup.find_all('div', class_='event-stats-scoring-minutes__list__item')

for item in items:
    label = item.find('span', class_='event-stats-scoring-minutes__list__label').get_text(strip=True)
    team1_percentage = int(re.search(r'height:\s*(\d+)%', item.find('div', class_='event-stats-progress-bar-vertical__lines__full')['style']).group(1))

    # Example: print the results for team 1
    print(f"{label} - Team 1: {team1_percentage}%")

    # If you also need the values for team 2, you can repeat the process with a different class
    # Example: print the results for team 2
    team2_percentage = int(re.search(r'height:\s*(\d+)%', item.find('div', class_='event-stats-progress-bar-vertical__lines__full')['style']).group(1))
    print(f"{label} - Team 2: {team2_percentage}%")

Maybe I solved it, but I am not sure these are the real values of the chart. There are 2 values for each column in the source.

0-15' - Team 1: 18%
0-15' - Team 2: 18%
16-30' - Team 1: 23%
16-30' - Team 2: 23%
31-45' - Team 1: 5%
31-45' - Team 2: 5%
46-60' - Team 1: 14%
46-60' - Team 2: 14%
61-75' - Team 1: 18%
61-75' - Team 2: 18%
76-90' - Team 1: 23%
76-90' - Team 2: 23%
(Dec-05-2023, 04:44 PM)nicoali Wrote: [ -> ]Maybe I solved it, but I am not sure these are the real values of the chart. There are 2 values for each column in the source.
Both bars have the same class name, so use a CSS selector to get the correct percentage for the second one.
import requests
from bs4 import BeautifulSoup
import re

url = ""
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/ Safari/537.36'}
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')

# Find all div elements with the class "event-stats-scoring-minutes__list__item"
items = soup.find_all('div', class_='event-stats-scoring-minutes__list__item')

for item in items:
    label = item.find('span', class_='event-stats-scoring-minutes__list__label').get_text(strip=True)
    # find() returns the first matching bar, which belongs to team 1
    team1_percentage = int(re.search(r'height:\s*(\d+)%', item.find('div', class_='event-stats-progress-bar-vertical__lines__full')['style']).group(1))

    # Example: print the results for team 1
    print(f"{label} - Team 1: {team1_percentage}%")

    # Team 2 uses the same class, so select the bar inside the second child div
    # Example: print the results for team 2
    team2_percentage = int(re.search(r'height:\s*(\d+)%', item.select_one('div:nth-child(2) > div > div.event-stats-progress-bar-vertical__lines__full')['style']).group(1))
    print(f"{label} - Team 2: {team2_percentage}%")
0-15' - Team 1: 18%
0-15' - Team 2: 4%
16-30' - Team 1: 23%
16-30' - Team 2: 19%
31-45' - Team 1: 5%
31-45' - Team 2: 23%
46-60' - Team 1: 14%
46-60' - Team 2: 12%
61-75' - Team 1: 18%
61-75' - Team 2: 12%
76-90' - Team 1: 23%
76-90' - Team 2: 31%
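To double-check the pairing offline, here is a minimal sketch that runs on a shortened copy of the markup quoted earlier in the thread (first two items only, with the `__lines__empty` styles trimmed). The names `SAMPLE_HTML` and `paired_heights` are made up for this illustration. It uses a plain regex instead of BeautifulSoup so it runs without a network request, and it assumes the team 1 bar always comes before the team 2 bar inside each item, as it does in the quoted source.

```python
import re

# Shortened copy of the quoted markup: two items, two "full" bars each.
SAMPLE_HTML = """
<div class="event-stats-scoring-minutes__list__item">
  <div class="event-stats-progress-bar-vertical__lines__empty" style="height: 82%;"></div>
  <div class="event-stats-progress-bar-vertical__lines__full" style="height: 18%; background-color: rgb(0, 177, 255);"></div>
  <div class="event-stats-progress-bar-vertical__lines__empty" style="height: 96%;"></div>
  <div class="event-stats-progress-bar-vertical__lines__full" style="height: 4%; background-color: rgb(26, 49, 80);"></div>
</div>
<div class="event-stats-scoring-minutes__list__item">
  <div class="event-stats-progress-bar-vertical__lines__empty" style="height: 77%;"></div>
  <div class="event-stats-progress-bar-vertical__lines__full" style="height: 23%; background-color: rgb(0, 177, 255);"></div>
  <div class="event-stats-progress-bar-vertical__lines__empty" style="height: 81%;"></div>
  <div class="event-stats-progress-bar-vertical__lines__full" style="height: 19%; background-color: rgb(26, 49, 80);"></div>
</div>
"""

# Matches only the filled bars and captures their height percentage.
FULL_BAR = re.compile(r'__lines__full"\s+style="height:\s*(\d+)%')


def paired_heights(html):
    """Return one (team1, team2) tuple of bar heights per time slot."""
    heights = [int(h) for h in FULL_BAR.findall(html)]
    # Bars alternate: even positions are team 1, odd positions are team 2.
    return list(zip(heights[0::2], heights[1::2]))


print(paired_heights(SAMPLE_HTML))
```

The two tuples line up with the 0-15' and 16-30' rows of the output above, which suggests the `nth-child(2)` selector is picking the right bar.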
Thanks, I also scraped other data and I can save it to a txt file.

import requests
from bs4 import BeautifulSoup
import re

url = ""
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/ Safari/537.36'
}
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'html.parser')

# Extract the team names
team1_name = soup.find('div', class_='event-stats-section-header__team--home').find('span').text.strip()
team2_name = soup.find('div', class_='event-stats-section-header__team--away').find('span').text.strip()

# Find all div elements with the class "event-stats-item"
stat_items = soup.find_all('div', class_='event-stats-item')

# Initialise dictionaries to hold each team's statistics
team1_stats = {}
team2_stats = {}

# Loop over the elements and extract the statistics
for item in stat_items:
    label = item['label']
    home_value = item.find('div', class_='event-stats-item__home').text.strip()
    away_value = item.find('div', class_='event-stats-item__away').text.strip()

    # Store the statistics in the respective dictionaries
    team1_stats[label] = home_value
    team2_stats[label] = away_value

# Second script
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')

# Find all div elements with the class "event-stats-scoring-minutes__list__item"
items = soup.find_all('div', class_='event-stats-scoring-minutes__list__item')

# Open a text file in write mode
with open('stats.txt', 'w') as file:
    # Write the team names
    file.write(f"Team 1: {team1_name}\n")
    file.write(f"Team 2: {team2_name}\n\n")

    # Write the Team 1 statistics
    file.write("Team 1 Stats:\n")
    file.write(f"Team 1: {team1_name}\n")
    for label, value in team1_stats.items():
        file.write(f"{label}: {value}\n")

    # Write the Team 2 statistics
    file.write("\nTeam 2 Stats:\n")
    file.write(f"Team 2: {team2_name}\n")
    for label, value in team2_stats.items():
        file.write(f"{label}: {value}\n")

    # Write the statistics from the second script
    file.write("\nScoring Minutes Stats:\n")
    for item in items:
        label = item.find('span', class_='event-stats-scoring-minutes__list__label').get_text(strip=True)
        team1_percentage = int(re.search(r'height:\s*(\d+)%', item.find('div', class_='event-stats-progress-bar-vertical__lines__full')['style']).group(1))
        team2_percentage = int(re.search(r'height:\s*(\d+)%', item.select_one('div:nth-child(2) > div > div.event-stats-progress-bar-vertical__lines__full')['style']).group(1))

        file.write(f"{label} - Team 1: {team1_percentage}%\n")
        file.write(f"{label} - Team 2: {team2_percentage}%\n")

print("Data saved to 'stats.txt'")
That works, but I cannot save the data to my MySQL database with:

import requests
from bs4 import BeautifulSoup
import mysql.connector

def scrape_and_insert_data(url, connection):
    # Send the HTTP request and create the HTML parser
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/ Safari/537.36'
    }
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.content, 'html.parser')

    # Find the "Season so far" title
    #season_so_far_title = soup.find('h3', class_='event-match-stats-title', text='Season so far')
    season_so_far_title = soup.find('h3', class_='event-match-stats-title', string='Season so far')
    if season_so_far_title:
        # Find the block containing the statistics
        season_so_far_block = season_so_far_title.find_next('div', class_='event-season-so-far__body')

        # Extract the team data
        team1_name = season_so_far_block.find('div', class_='event-stats-section-header__team--home').find('span').text.strip()
        team2_name = season_so_far_block.find('div', class_='event-stats-section-header__team--away').find('span').text.strip()

        print(f"Team 1: {team1_name}")
        print(f"Team 2: {team2_name}")

        # Extract the statistics
        stat_items = season_so_far_block.find_all('div', class_='event-stats-item')

        # Build a dictionary with the Team 1 data
        team_stats_1 = {'team_name': team1_name}
        for item in stat_items:
            label = item['label']
            home_value = item.find('div', class_='event-stats-item__home').text.strip()
            team_stats_1[label.lower()] = home_value

        # Build a dictionary with the Team 2 data
        team_stats_2 = {'team_name': team2_name}
        for item in stat_items:
            label = item['label']
            away_value = item.find('div', class_='event-stats-item__away').text.strip()
            team_stats_2[label.lower()] = away_value

        # Save to the database
        save_to_database(team_stats_1, connection)
        save_to_database(team_stats_2, connection)
    else:
        print("'Season so far' section not found.")

def save_to_database(data, connection):
    # Create the cursor
    cursor = connection.cursor()

    # Build the insert query
    insert_query = """
    INSERT INTO statistiche (
        squadra_id, posizione, vittorie, pareggi, sconfitte, gol_fatti, gol_subiti, shots, possession, 
        corners, yellow_cards, red_cards, shots_on_target, shots_off_target, free_kicks, offsides, 
        goals_by_foot, goals_by_head, goals_1st_half, goals_2nd_half, shots_blocked, 
        over_0_5, over_1_5, over_2_5, over_3_5, over_4_5, over_1_5_halftime, scored_both_halves_percent, 
        conceded_both_halves_percent, total_percent, last_5_matches_percent, home_btts_percent, 
        away_btts_percent, btts_over_2_5_percent, btts_win_percent, btts_lost_percent, 
        btts_1st_half_percent, won_to_nil_percent, lost_to_nil_percent, clean_sheets_percent, 
        over_9_5_percent, over_10_5_percent, over_11_5_percent, over_12_5_percent, over_13_5_percent, 
        over_2_5_percent, over_3_5_percent, over_4_5_percent, over_5_5_percent, over_6_5_percent
    )
    VALUES (
        (SELECT id FROM squadre WHERE nome = %(team_name)s), 
        %(position)s, %(won)s, %(drawn)s, %(lost)s, %(goals_scored)s, %(goals_conceded)s, 
        %(shots)s, %(possession)s, %(corners)s, %(yellow_cards)s, %(red_cards)s, 
        %(shots_on_target)s, %(shots_off_target)s, %(free_kicks)s, %(offsides)s, 
        %(goals_by_foot)s, %(goals_by_head)s, %(goals_1st_half)s, %(goals_2nd_half)s, %(shots_blocked)s, 
        %(over_0_5)s, %(over_1_5)s, %(over_2_5)s, %(over_3_5)s, %(over_4_5)s, %(over_1_5_halftime)s, 
        %(scored_both_halves_percent)s, %(conceded_both_halves_percent)s, %(total_percent)s, 
        %(last_5_matches_percent)s, %(home_btts_percent)s, %(away_btts_percent)s, %(btts_over_2_5_percent)s, 
        %(btts_win_percent)s, %(btts_lost_percent)s, %(btts_1st_half_percent)s, %(won_to_nil_percent)s, 
        %(lost_to_nil_percent)s, %(clean_sheets_percent)s, %(over_9_5_percent)s, %(over_10_5_percent)s, 
        %(over_11_5_percent)s, %(over_12_5_percent)s, %(over_13_5_percent)s, %(over_2_5_percent)s, 
        %(over_3_5_percent)s, %(over_4_5_percent)s, %(over_5_5_percent)s, %(over_6_5_percent)s
    )
    """

    # Run the insert query
    cursor.execute(insert_query, data)

    # Commit the transaction
    connection.commit()

    # Close the cursor
    cursor.close()

# Replace with your own credentials
connection = mysql.connector.connect(
    # connection arguments (host, user, password, database) were not included in the post
)

# Match URL
url = ""

# Run the scraper and insert into the database
scrape_and_insert_data(url, connection)

# Close the database connection
connection.close()
'Season so far' section not found.
I have a problem trying to scrape data from live matches.

import requests
from bs4 import BeautifulSoup

url = ""
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/ Safari/537.36'}
response = requests.get(url, headers=headers)

if response.status_code == 200:
    soup = BeautifulSoup(response.text, 'html.parser')

    # Find all div elements with the class "match-status--inplay", "match-timer", or "match-score--inplay"
    matches = soup.find_all('div', class_=lambda value: value and ('match-status--inplay' in value or 'match-timer' in value or 'match-score--inplay' in value))

    for match in matches:
        # Find the parent "a" element that contains the wanted data
        match_info = match.find_parent('a')

        # Extract the wanted data from the HTML structure
        match_url = match_info['href']
        teams = match_info.find_all('div', class_='match-team__name')
        team1 = teams[0].text.strip()
        team2 = teams[1].text.strip()

        # Extract the images
        team_images = match_info.find_all('img', class_='logo-team')
        team1_image = team_images[0]['src']
        team2_image = team_images[1]['src']

        # Check whether the timer is present
        match_status = match.find('div', class_='match-status')
        match_timer = match.find('div', class_='match-timer')
        match_score = match.find('div', class_='match-score--inplay')

        status_text = match_status.text.strip() if match_status else "N/A"
        timer_text = match_timer.text.strip() if match_timer else "N/A"
        score_home = match_score.find('div', class_='match-score__team--home').text.strip() if match_score else "N/A"
        score_away = match_score.find('div', class_='match-score__team--away').text.strip() if match_score else "N/A"

        # Print the data
        print(f"URL: {match_url}")
        print(f"Team 1: {team1}")
        print(f"Team 1 Image: {team1_image}")
        print(f"Team 2: {team2}")
        print(f"Team 2 Image: {team2_image}")
        print(f"Status: {status_text}")
        print(f"Timer: {timer_text}")
        print(f"Score Home: {score_home}")
        print(f"Score Away: {score_away}")
else:
    print(f"Request failed. Status code: {response.status_code}")
No errors, but no values either.
(Dec-07-2023, 04:47 PM)nicoali Wrote: [ -> ]I have a problem trying to scrape data from live matches.
The data is generated by JavaScript, so you have to use another tool like Selenium or Playwright. Look at this post.
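One quick way to confirm this before reaching for a browser driver is to check whether the class names appear in the raw response at all. A minimal sketch, where `classes_in_static_html` is a hypothetical helper and `static_html` is a made-up stand-in for `response.text`:

```python
def classes_in_static_html(html, class_names):
    """Map each class name to True if it occurs anywhere in the raw HTML text."""
    return {name: name in html for name in class_names}


# Made-up response body: an empty app shell that JavaScript would fill in later.
static_html = '<div id="app"></div><script src="app.js"></script>'
wanted = ['match-status--inplay', 'match-timer', 'match-score--inplay']

report = classes_in_static_html(static_html, wanted)
print(report)
```

If every value is False for the real `response.text`, the elements are most likely added after page load, which would explain why `find_all` returns nothing and the loop never runs.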
Quote:That works, but I cannot save the data to my MySQL database with:
Simplify it before adding a lot of stuff, and remove the else: you want to see the error inside the long if block.
If you get it to work with 1-2 values going into MySQL, it is easier to see where it stops working when you add more stuff.
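Along the same lines, it can help to compare the named placeholders in the query with the keys of the scraped dictionary before calling `execute`, since a `%(name)s` placeholder can only be filled when that exact key is present. A minimal sketch, assuming the labels on the page contain spaces (so `label.lower()` gives keys like 'goals scored'); the helper `missing_parameters`, the shortened query, and the sample values are all hypothetical:

```python
import re


def missing_parameters(query, data):
    """Return the named placeholders in query that data does not supply."""
    needed = set(re.findall(r'%\((\w+)\)s', query))
    return sorted(needed - set(data))


# Shortened, hypothetical version of the INSERT with only two statistics.
query = """
INSERT INTO statistiche (squadra_id, vittorie, gol_fatti)
VALUES ((SELECT id FROM squadre WHERE nome = %(team_name)s), %(won)s, %(goals_scored)s)
"""

# Made-up scraped values; note the space in 'goals scored'.
data = {'team_name': 'Team A', 'won': '5', 'goals scored': '12'}

missing = missing_parameters(query, data)
print(missing)
```

Any name reported here would need its dictionary key normalised first, for example by replacing spaces with underscores, or the placeholder removed from the query.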
(Dec-07-2023, 06:51 PM)snippsat Wrote: [ -> ]The data is generated by JavaScript, so you have to use another tool like Selenium or Playwright. Look at this post.

With Selenium I was able to get the team names, but not the match URL, the image URLs, or the score.
(Dec-07-2023, 08:01 PM)nicoali Wrote: [ -> ]With Selenium I was able to get the team names, but not the match URL, the image URLs, or the score.
You must show what you have tried, with code.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time

# Setup
options = Options()
ser = Service(r"C:\cmder\bin\chromedriver.exe")
browser = webdriver.Chrome(service=ser, options=options)
# Parse or automation
url = ''
browser.get(url)
# Give the JavaScript time to render (assumed delay; the original value is not shown in the post)
time.sleep(2)
game = browser.find_element(By.CSS_SELECTOR, 'main > div:nth-child(3) > div.ml__wrap > div:nth-child(16) > div')
img = browser.find_element(By.CSS_SELECTOR, 'main > div:nth-child(3) > div.ml__wrap > div:nth-child(16) > div > a > div.match-teams > div:nth-child(1) > img')
print(game.text)
HT USM Khenchela MCE Baydh 1 0
(Dec-08-2023, 03:09 PM)snippsat Wrote: [ -> ]
HT USM Khenchela MCE Baydh 1 0
Yes, I had the same problem with images. The real images are something like this:

For example:

<div class="match-team"><img alt="Juventus" width="20" height="20" class="logo-team match-team__logo image-size-sm lazyLoad isLoaded" src=""> <div class="match-team__name">
    </div> <div class="match-team__widgets"><!----> <!----></div></div>
while the scraped output is always