Apr-01-2017, 03:19 PM
Hello,
I'm trying to extract some tables from a particular website, and I'm using the Selenium WebDriver to do so because I believe the page uses JavaScript.
I thought I had found a solution, but the code only works sporadically.
Sometimes it works with no problem; other times it times out because it cannot find the element by its id, even though I can see and inspect the element in the browser.
http://www.basketball-reference.com/boxs...80CHO.html
[python]
from selenium import webdriver
from selenium.webdriver.common.by import By
import selenium.webdriver.support.ui as ui
import selenium.webdriver.support.expected_conditions as EC
import os

options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors')
options.add_argument('--ignore-ssl-errors')
dir_path = os.path.dirname(os.path.realpath(__file__))
chromedriver = dir_path + "/chromedriver"
os.environ["webdriver.chrome.driver"] = chromedriver
driver = webdriver.Chrome(chrome_options=options, executable_path=chromedriver)

url = 'http://www.basketball-reference.com/boxscores/201703280CHO.html'
driver.get(url)

ui.WebDriverWait(driver, 15).until(EC.visibility_of_element_located((By.ID, "line_score")))
find_table = driver.find_element_by_xpath("//table[@id='line_score']")
[/python]
Can someone please help me find a concrete way to extract the "line_score" and "four_factors" tables?
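For reference, here is a minimal sketch of one way the rows of a table could be pulled out of an HTML string by its id, using only the standard library. `TableGrabber`, `extract_table`, and `SAMPLE_HTML` are hypothetical names, and the sample markup and its values are made up for illustration. It assumes the table is actually present in the HTML handed to it, which is exactly what is in doubt in this thread.

```python
from html.parser import HTMLParser

# Made-up markup for illustration only; not taken from the real page.
SAMPLE_HTML = (
    '<table id="other"><tr><td>x</td></tr></table>'
    '<table id="line_score">'
    '<tr><th>Team</th><th>1</th></tr>'
    '<tr><td><a href="#">AAA</a></td><td>25</td></tr>'
    '</table>'
)


class TableGrabber(HTMLParser):
    """Collect the cell text of the <table> with a given id, row by row."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.depth = 0      # > 0 while the parser is inside the target table
        self.rows = []      # list of rows, each a list of cell strings
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self.depth:
                self.depth += 1  # nested table inside the target
            elif dict(attrs).get("id") == self.table_id:
                self.depth = 1   # entering the target table
        elif self.depth and tag == "tr":
            self.rows.append([])
        elif self.depth and tag in ("td", "th"):
            self._cell = []

    def handle_endtag(self, tag):
        if not self.depth:
            return
        if tag == "table":
            self.depth -= 1
        elif tag in ("td", "th") and self._cell is not None:
            if self.rows:
                self.rows[-1].append("".join(self._cell).strip())
            self._cell = None

    def handle_data(self, data):
        # Text only counts while a cell of the target table is open.
        if self._cell is not None:
            self._cell.append(data)


def extract_table(html, table_id):
    """Return the rows of the table with this id, or [] if it is absent."""
    parser = TableGrabber(table_id)
    parser.feed(html)
    parser.close()
    return parser.rows


if __name__ == "__main__":
    print(extract_table(SAMPLE_HTML, "line_score"))
```

With Selenium, the string could come from `driver.page_source`, for example `extract_table(driver.page_source, "line_score")`. An empty list back would suggest the table was not in the page source at the moment it was read.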
I've only been coding in Python for about a week, but I've done a fair bit of research and can't find a solution. From what I've read so far, some pages keep reloading parts of their content whenever values change, and that can get in the way of grabbing an element. Is that what's going on here?
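To make the reloading idea concrete, here is a minimal sketch of the poll-and-retry pattern that explicit waits are built on: keep calling a lookup, ignore the errors you expect while the page is still changing, and give up after a deadline. `wait_until` and its parameters are hypothetical names for illustration, not Selenium's API, and whether reloading is really the cause here is an open question.

```python
import time


def wait_until(probe, timeout=15.0, poll=0.5, ignored=(Exception,)):
    """Call probe() until it returns a truthy value or the timeout elapses.

    Exceptions listed in `ignored` are treated as "not ready yet" and retried.
    """
    deadline = time.monotonic() + timeout
    last_error = None
    while True:
        try:
            result = probe()
            if result:
                return result
        except ignored as exc:
            last_error = exc  # remember the most recent failure for the report
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "condition not met within %s seconds" % timeout
            ) from last_error
        time.sleep(poll)
```

With Selenium the probe could be something like `lambda: driver.find_element(By.ID, "line_score")`. `WebDriverWait` also accepts an `ignored_exceptions` argument, so `StaleElementReferenceException` can be retried in the same way without a custom helper.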
Thank you for your time.
It wouldn't let me post the URL in my first post, so here it is:
http://www.basketball-reference.com/boxs...80CHO.html
Moderator Larz60+: Added Python tags. Please do this in the future (see help, BBCODE)