Python Forum
Web Scraping Sportsbook Websites
#11
Oh, so then I would not need to send click commands? It would automatically expand all of the JavaScript on the page? That would help a lot, since I ran into a ton of bugs sending click commands.

I tried it but got an error, so I am not sure I am implementing it correctly:

import requests
import csv
from bs4 import BeautifulSoup
import urllib.request
import random
import re
from selenium import webdriver
import time

chrome_path = r"C:\Users\user\Desktop\chromedriver.exe"

Urls = []
Teams = ''

with open(r'M:\SportsBooks3.csv') as csvfile:  # raw string so the backslash is not treated as an escape
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        Urls.append(row)

FD_web = webdriver.Chrome(chrome_path)
FD_web.get(str(Urls[2])[2:-2])

# MapURL test -- this line raises the AttributeError shown below,
# because WebDriver has no attribute named mapurl
FD_web.get(FD_web.mapurl)
time.sleep(2)
source = FD_web.page_source
soup = BeautifulSoup(source, 'lxml')
print(soup)
# Soup should be all expanded HTML
Error:
C:\Users\user\PycharmProjects\untitled\venv\Scripts\python.exe C:/Users/user/PycharmProjects/untitled/Test_MapURL.py
Traceback (most recent call last):
  File "C:/Users/user/PycharmProjects/untitled/Test_MapURL.py", line 25, in <module>
    FD_web.get(FD_web.mapurl)
AttributeError: 'WebDriver' object has no attribute 'mapurl'
Process finished with exit code 1
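Incidentally, the `str(Urls[2])[2:-2]` slicing is only needed because `csv.reader` yields each row as a list of strings; indexing the row avoids the string surgery. A quick sketch with an in-memory file standing in for `M:\SportsBooks3.csv` (the URLs here are made up):

```python
import csv
from io import StringIO

# in-memory stand-in for the CSV file, one URL per row (made-up URLs)
fake_csv = StringIO("https://site-a.example/odds\nhttps://site-b.example/odds\n")

# csv.reader yields each row as a list, so row[0] is already a plain string
urls = [row[0] for row in csv.reader(fake_csv) if row]
print(urls[1])  # ready to pass to webdriver.get() -- no [2:-2] slicing needed
```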
#12
The error is on this line: FD_web.get(FD_web.mapurl)
Python doesn't know what mapurl is; a WebDriver object has no such attribute.
It has nothing to do with extracting page_source.
#13
I am not grasping that line of your code.
I assume browser is a variable set to your webdriver, like mine is FD_web?
What is 'self' in this, and what is mapurl?
Sorry for the newbie question; I'm just getting a handle on Python. I tried googling it and came up with nothing.

browser.get(self.mapurl)
#14
It's line 24 of your last post: https://python-forum.io/Thread-Web-Scrap...#pid108462
FD_web.get(FD_web.mapurl)
#15
(Mar-26-2020, 03:44 AM)Larz60+ Wrote: I often use selenium to expand all of the JavaScript, then switch to Beautifulsoup
then you can use find, or find_all.
example:
        browser.get(self.mapurl)
        time.sleep(2)
        source = browser.page_source
        soup = BeautifulSoup(source, 'lxml')

I realize my error, but I still do not understand this post. I was wondering if you could break down this code for me, in particular the first line.
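To unpack the question above: `self.mapurl` only works inside a class. Larz60+'s snippet is lifted from a method, where `self` refers to the current instance and `mapurl` is an attribute (holding the target URL) that was set earlier; outside a class, a plain variable does the same job. An illustrative sketch (the class and attribute names here are made up):

```python
class MapScraper:
    def __init__(self, mapurl):
        # mapurl is just an attribute storing the target URL;
        # outside a class you would use a plain variable instead
        self.mapurl = mapurl

    def describe(self):
        # inside a method, `self` is the current instance,
        # so self.mapurl reads that attribute
        return f"would fetch {self.mapurl}"

s = MapScraper("https://example.com/odds")
print(s.describe())
```

So in the quoted snippet, `browser.get(self.mapurl)` simply passes a stored URL string to the webdriver; the equivalent outside a class is `FD_web.get(some_url_variable)`.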
#16
You should just use this part (I'm a bit confused about how you manipulate your pages):
soup = BeautifulSoup(DK_Content_FT, 'lxml')
# and/or
soup1 = BeautifulSoup(DK_Content_HT, 'lxml')
#17
Ok, I took a couple of days off from looking at this to write some logic that compares the data once I have it downloaded from the sites. Now I'm revisiting this. I realized my error and installed lxml; I assume you just use that instead of html.parser, and I will research the benefits of one over the other.

Is there any method to expand all of the HTML data without sending click commands to the websites? Take this page as an example:
https://nj.unibet.com/sports/#filter/foo...1006230344

If you click on "FULL TIME", "HALF", "ASIAN LINES", etc., the HTML page gets populated with additional data for everything in the list for that game. I thought you were originally saying that lxml would do this, but that does not appear to be the case when I downloaded the page with lxml and wrote it to a text file.

If you inspect the page I linked, the elements I am trying to expand all use this markup:

<li class="KambiBC-bet-offer-category KambiBC-collapsible-container">
<header class="KambiBC-bet-offer-category__header" data-touch-feedback="true">
<h2 class="KambiBC-bet-offer-category__title">Full Time&nbsp;</h2>
<div class="KambiBC-header-meta-wrapper"><div class="KambiBC-bet-offer-category__bet-offer-count">12</div></div>
</header>
</li>
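Once those sections are expanded in the browser, the quoted markup itself is straightforward to pick apart with BeautifulSoup. A sketch against that exact snippet (using the built-in html.parser so nothing beyond bs4 is required):

```python
from bs4 import BeautifulSoup

# the exact <li> block quoted above
snippet = """
<li class="KambiBC-bet-offer-category KambiBC-collapsible-container">
<header class="KambiBC-bet-offer-category__header" data-touch-feedback="true">
<h2 class="KambiBC-bet-offer-category__title">Full Time&nbsp;</h2>
<div class="KambiBC-header-meta-wrapper"><div class="KambiBC-bet-offer-category__bet-offer-count">12</div></div>
</header>
</li>
"""

soup = BeautifulSoup(snippet, "html.parser")  # 'lxml' works here too if installed
title = soup.find("h2", class_="KambiBC-bet-offer-category__title")
count = soup.find("div", class_="KambiBC-bet-offer-category__bet-offer-count")
print(title.get_text(strip=True), count.get_text())  # category name and its market count
```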

So basically I am trying to automatically expand this element:
<li class="KambiBC-bet-offer-category KambiBC-collapsible-container">
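For this kind of collapsible markup there is no parser-side shortcut: lxml (or any parser) only sees the HTML the browser has already rendered, so the collapsed sections have to be opened by clicking, ideally in a loop over every matching element rather than one hand-written click per section. A minimal sketch, assuming the site marks opened sections with a class like `KambiBC-expanded` (both the marker class and the selector are assumptions to verify in DevTools):

```python
import time

# assumed selector: collapsed bet-offer categories not yet marked expanded
COLLAPSED = "li.KambiBC-collapsible-container:not(.KambiBC-expanded) > header"

def expand_all(driver, pause=0.5):
    """Click every collapsed category header so its markets get injected.

    `driver` is any Selenium WebDriver instance (e.g. FD_web).
    Returns the number of headers clicked.
    """
    headers = driver.find_elements("css selector", COLLAPSED)
    for header in headers:
        header.click()     # triggers the JavaScript that loads the extra odds
        time.sleep(pause)  # give the newly injected markets time to render
    return len(headers)
```

After calling `expand_all(FD_web)` you would grab `FD_web.page_source` and hand it to BeautifulSoup as before; the clicks happen in one loop instead of per-section commands.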
#18
I saw this old post and am looking to do the same thing. Were you ever able to figure this out for scraping the live in-play odds spreads?