Python Forum
Python Selenium WebDriver Problem
#1
Hello,

I'm currently trying to extract some tables from a particular website, but I have to use the Selenium WebDriver to do so because I believe the page uses JavaScript.

I thought I had found the solution, but the code only works sporadically.

Sometimes it works with no problem; other times it times out because it can't find the element ID, even though I can see and inspect the element in the browser.

http://www.basketball-reference.com/boxs...80CHO.html

from selenium import webdriver

from selenium.webdriver.common.by import By
import selenium.webdriver.support.ui as ui
import selenium.webdriver.support.expected_conditions as EC
import os

options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors')
options.add_argument('--ignore-ssl-errors')
dir_path = os.path.dirname(os.path.realpath(__file__))
chromedriver = dir_path + "/chromedriver"
os.environ["webdriver.chrome.driver"] = chromedriver
driver = webdriver.Chrome(chrome_options=options, executable_path=chromedriver)

url = 'http://www.basketball-reference.com/boxscores/201703280CHO.html'

driver.get(url)

ui.WebDriverWait(driver, 15).until(EC.visibility_of_element_located((By.ID, "line_score")))

find_table = driver.find_element_by_xpath("//table[@id='line_score']")
Can someone please help me find a concrete way to extract the "line_score" and "four_factors" tables?

I've only been coding in Python for about a week, but I've done a fair bit of research and can't seem to find a solution. From what I've read so far, certain pages constantly reload when values change, and that gets in the way of grabbing elements. Is this what's going on?

Thank you for your time.

http://www.basketball-reference.com/boxs...80CHO.html

It wouldn't let me post the URL in my first post, so here it is.

Moderator Larz60+: Added Python tags. Please do this in the future (see help, BBCODE)
#2
(Apr-01-2017, 03:19 PM)SlpnGnt Wrote: Sometimes it works with no problem; other times it times out because it can't find the element ID, even though I can see and inspect the element in the browser.
I would try putting a time.sleep(1) after you load the page to give it time to actually finish loading. Depending on your internet speed at that moment, the program can run ahead of the page: if your bandwidth is slow (for example, on Friday nights when everyone is online), the script will reach the lookup before the data has loaded and will not find it. If you still have a problem with a delay of 1 second, I would increase it. I have some cases where the delay has to be 3 seconds for the page to load reliably.

import os
import time

from selenium import webdriver
from selenium.webdriver.common.by import By
import selenium.webdriver.support.ui as ui
import selenium.webdriver.support.expected_conditions as EC

# Use the chromedriver binary that sits next to this script.
options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors')
options.add_argument('--ignore-ssl-errors')
dir_path = os.path.dirname(os.path.realpath(__file__))
chromedriver = os.path.join(dir_path, "chromedriver")
os.environ["webdriver.chrome.driver"] = chromedriver
driver = webdriver.Chrome(chrome_options=options, executable_path=chromedriver)

url = 'http://www.basketball-reference.com/boxscores/201703280CHO.html'

driver.get(url)

# Wait up to 15 seconds for the line score table to become visible.
ui.WebDriverWait(driver, 15).until(EC.visibility_of_element_located((By.ID, "line_score")))

# Extra pause so the rest of the page can finish loading; increase if needed.
time.sleep(1)

# Read the wrapper divs around each table and print their text.
line_score_table = driver.find_element(By.XPATH, "//div[@id='all_line_score']")
print(line_score_table.text)

print('\n')

four_factors_table = driver.find_element(By.XPATH, "//div[@id='all_four_factors']")
print(four_factors_table.text)
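
If you would rather not guess at a delay, the usual alternative to a fixed time.sleep() is to poll: keep checking for the element until it shows up or a deadline passes, which is the same idea WebDriverWait is built on. Below is a minimal sketch of that idea only. The names wait_until and FakePage are made up for illustration so the example runs without a browser; with Selenium the condition would be something along the lines of lambda: driver.find_elements(By.ID, "four_factors"), which returns an empty list while the element is missing.

```python
import time


def wait_until(condition, timeout=15.0, poll=0.5):
    """Call condition() until it returns a truthy value or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


class FakePage:
    """Hypothetical stand-in for a page whose table appears after a few checks."""

    def __init__(self, ready_after):
        self.checks = 0
        self.ready_after = ready_after

    def find_table(self):
        self.checks += 1
        # Return an empty list until the "table" has loaded.
        return ["four_factors"] if self.checks >= self.ready_after else []


page = FakePage(ready_after=3)
found = wait_until(page.find_table, timeout=2.0, poll=0.01)
print(found, page.checks)
```

The upside compared with a fixed sleep is that the script moves on as soon as the element is there and only waits the full timeout when something is actually wrong.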
#3
Thanks for the reply, metulburr.

I had actually tried the time.sleep method, but I was placing it right after the driver.get(url).

After seeing your reply I moved it to after the WebDriverWait, and it's working so far, which is surprising and confusing to me.

I figured it was hanging at or before the WebDriverWait, not after it.

Either way, thank you for your help.