 Web Scraping Sportsbook Websites
#11
Oh, so then I would not need to send click commands? It would automatically expand all of the JavaScript on the page? That would help a lot, since I can foresee a ton of bugs with sending click commands.

I tried it but got an error; I am not sure I am implementing it correctly:

import requests
import csv
from bs4 import BeautifulSoup
import urllib.request
import random
import re
from selenium import webdriver
import time

chrome_path = r"C:\Users\user\Desktop\chromedriver.exe"

Urls = []
Teams = ''

# Read the sportsbook URLs from the CSV, one row per URL
# (a raw string keeps the backslash in the path from being read as an escape).
with open(r'M:\SportsBooks3.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        Urls.append(row)

FD_web = webdriver.Chrome(chrome_path)
# Each row is a one-element list, so str(row) looks like "['...']";
# the [2:-2] slice strips the brackets and quotes to leave the bare URL.
FD_web.get(str(Urls[2])[2:-2])

# MapURL test -- this is the line that raises the AttributeError below,
# since WebDriver has no attribute named mapurl.
FD_web.get(FD_web.mapurl)
time.sleep(2)
source = FD_web.page_source
soup = BeautifulSoup(source, 'lxml')
print(soup)
# soup should now be all of the expanded HTML
Error:
C:\Users\user\PycharmProjects\untitled\venv\Scripts\python.exe C:/Users/user/PycharmProjects/untitled/Test_MapURL.py
Traceback (most recent call last):
  File "C:/Users/user/PycharmProjects/untitled/Test_MapURL.py", line 25, in <module>
    FD_web.get(FD_web.mapurl)
AttributeError: 'WebDriver' object has no attribute 'mapurl'
Process finished with exit code 1
#12
The error is on this line: FD_web.get(FD_web.mapurl)
Python doesn't know what mapurl is.
It has nothing to do with extracting page_source.
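
A minimal sketch of the fix, continuing from the script in post #11 and assuming the page you want is the URL already read from the CSV (url is just an illustrative name, not something from that script):

# Use a real URL instead of the nonexistent mapurl attribute.
# Each csv.reader row is a list of strings, so take the first field.
url = Urls[2][0]
FD_web.get(url)    # Selenium loads the page and runs its JavaScript
time.sleep(2)      # crude wait for dynamic content to render
soup = BeautifulSoup(FD_web.page_source, 'lxml')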
#13
I am not grasping that line of your code.
I assume browser is a variable that is set to your webdriver, like mine is FD_web?
What is 'self' in this? And what is mapurl?
Sorry for the newbie questions; I am just getting a handle on Python. I tried googling it and came up with nothing.

browser.get(self.mapurl)
#14
It's line 24 of your last post: https://python-forum.io/Thread-Web-Scrap...#pid108462
FD_web.get(FD_web.mapurl)
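
For what it's worth, self only means something inside a class: it refers to the instance a method was called on, and mapurl would be an attribute set on that instance elsewhere in the class. A minimal sketch of the kind of class that snippet likely came from; the class name and the mapurl attribute are assumptions, not code from this thread:

from selenium import webdriver


class Scraper:
    """Hypothetical scraper class; mapurl is an assumed attribute."""

    def __init__(self, mapurl):
        self.mapurl = mapurl               # target URL stored on the instance
        self.browser = webdriver.Chrome()

    def fetch_source(self):
        # Inside a method, self.mapurl reads the attribute set in __init__.
        self.browser.get(self.mapurl)
        return self.browser.page_source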
#15
(Mar-26-2020, 03:44 AM)Larz60+ Wrote: I often use Selenium to expand all of the JavaScript, then switch to BeautifulSoup;
then you can use find or find_all.
example:
        browser.get(self.mapurl)
        time.sleep(2)
        source = browser.page_source
        soup = BeautifulSoup(source, 'lxml')

I realize my error, but I just do not understand this post. I was wondering if you could break down this code for me, in particular the first line.
#16
You should just use this part (I'm a bit confused about how you manipulate your pages):

soup = BeautifulSoup(DK_Content_FT, 'lxml')
# and/or
soup1 = BeautifulSoup(DK_Content_HT, 'lxml')
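
Putting the whole pattern from the quote in post #15 together as a standalone sketch; the URL, the span tag, and the class name below are placeholders for illustration, not values from this thread:

import time

from bs4 import BeautifulSoup
from selenium import webdriver

browser = webdriver.Chrome(r"C:\Users\user\Desktop\chromedriver.exe")

# 1. Load the page; the real browser executes the page's JavaScript.
browser.get('https://example.com/odds')    # placeholder URL

# 2. Give dynamic content a moment to render.
time.sleep(2)

# 3. Pull the fully rendered HTML back out of the browser.
DK_Content_FT = browser.page_source

# 4. Switch to BeautifulSoup and use find()/find_all() on it.
soup = BeautifulSoup(DK_Content_FT, 'lxml')
for cell in soup.find_all('span', class_='odds'):    # placeholder selector
    print(cell.get_text(strip=True))

browser.quit()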
