 Web Scraping Sportsbook Websites
#11
Oh, so then I would not need to send click commands? It would automatically expand all of the JavaScript on the page? That would help a lot, since I can foresee a ton of bugs with sending click commands.

I tried it but got an error; I am not sure I am implementing it correctly:

import requests
import csv
from bs4 import BeautifulSoup
import urllib.request
import random
import re
from selenium import webdriver
import time

chrome_path = r"C:\Users\user\Desktop\chromedriver.exe"

Urls = []
Teams = ''

# Read the sportsbook URLs from the CSV, one row per URL
# (a raw string keeps the backslash in the path from being read as an escape).
with open(r'M:\SportsBooks3.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        Urls.append(row)

FD_web = webdriver.Chrome(chrome_path)
# Each row is a one-element list, so str(row) looks like "['...']";
# the [2:-2] slice strips the brackets and quotes to leave the bare URL.
FD_web.get(str(Urls[2])[2:-2])

# MapURL test -- this is the line that raises the AttributeError below,
# since WebDriver has no attribute named mapurl.
FD_web.get(FD_web.mapurl)
time.sleep(2)
source = FD_web.page_source
soup = BeautifulSoup(source, 'lxml')
print(soup)
# soup should now be all of the expanded HTML
Error:
C:\Users\user\PycharmProjects\untitled\venv\Scripts\python.exe C:/Users/user/PycharmProjects/untitled/Test_MapURL.py
Traceback (most recent call last):
  File "C:/Users/user/PycharmProjects/untitled/Test_MapURL.py", line 25, in <module>
    FD_web.get(FD_web.mapurl)
AttributeError: 'WebDriver' object has no attribute 'mapurl'
Process finished with exit code 1
#12
The error is on this line: FD_web.get(FD_web.mapurl)
Python doesn't know what mapurl is.
It has nothing to do with extracting page_source.
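
A minimal sketch of the fix, continuing from the script in post #11 and assuming the page you want is the URL already read from the CSV (url is just an illustrative name, not something from that script):

# Use a real URL instead of the nonexistent mapurl attribute.
# Each csv.reader row is a list of strings, so take the first field.
url = Urls[2][0]
FD_web.get(url)    # Selenium loads the page and runs its JavaScript
time.sleep(2)      # crude wait for dynamic content to render
soup = BeautifulSoup(FD_web.page_source, 'lxml')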
#13
I am not grasping that line of your code.
I assume browser is a variable that is set to your webdriver, like mine is FD_web?
What is 'self' in this? And what is mapurl?
Sorry for the newbie questions; I am just getting a handle on Python. I tried googling it and came up with nothing.

browser.get(self.mapurl)
#14
It's line 24 of your last post: https://python-forum.io/Thread-Web-Scrap...#pid108462
FD_web.get(FD_web.mapurl)
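
For what it's worth, self only means something inside a class: it refers to the instance a method was called on, and mapurl would be an attribute set on that instance elsewhere in the class. A minimal sketch of the kind of class that snippet likely came from; the class name and the mapurl attribute are assumptions, not code from this thread:

from selenium import webdriver


class Scraper:
    """Hypothetical scraper class; mapurl is an assumed attribute."""

    def __init__(self, mapurl):
        self.mapurl = mapurl               # target URL stored on the instance
        self.browser = webdriver.Chrome()

    def fetch_source(self):
        # Inside a method, self.mapurl reads the attribute set in __init__.
        self.browser.get(self.mapurl)
        return self.browser.page_source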
#15
(Mar-26-2020, 03:44 AM)Larz60+ Wrote: I often use Selenium to expand all of the JavaScript, then switch to BeautifulSoup;
then you can use find or find_all.
example:
        browser.get(self.mapurl)
        time.sleep(2)
        source = browser.page_source
        soup = BeautifulSoup(source, 'lxml')

I realize my error, but I just do not understand this post. I was wondering if you could break down this code for me, in particular the first line.
#16
You should just use this part (I'm a bit confused about how you manipulate your pages):

soup = BeautifulSoup(DK_Content_FT, 'lxml')
# and/or
soup1 = BeautifulSoup(DK_Content_HT, 'lxml')
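
Putting the whole pattern from the quote in post #15 together as a standalone sketch; the URL, the span tag, and the class name below are placeholders for illustration, not values from this thread:

import time

from bs4 import BeautifulSoup
from selenium import webdriver

browser = webdriver.Chrome(r"C:\Users\user\Desktop\chromedriver.exe")

# 1. Load the page; the real browser executes the page's JavaScript.
browser.get('https://example.com/odds')    # placeholder URL

# 2. Give dynamic content a moment to render.
time.sleep(2)

# 3. Pull the fully rendered HTML back out of the browser.
DK_Content_FT = browser.page_source

# 4. Switch to BeautifulSoup and use find()/find_all() on it.
soup = BeautifulSoup(DK_Content_FT, 'lxml')
for cell in soup.find_all('span', class_='odds'):    # placeholder selector
    print(cell.get_text(strip=True))

browser.quit()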
