Aug-05-2019, 02:08 PM
(Jul-30-2019, 09:07 PM)metulburr Wrote: This is a better method to scroll to the bottom. It automatically identifies the end of the page instead of scrolling an arbitrary amount (and is much faster too).

    import time

    def scroll_to_bottom(driver):
        SCROLL_PAUSE_TIME = 0.5

        # Get scroll height
        last_height = driver.execute_script("return document.body.scrollHeight")

        while True:
            # Scroll down to bottom
            driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

            # Wait to load page
            time.sleep(SCROLL_PAUSE_TIME)

            # Calculate new scroll height and compare with last scroll height
            new_height = driver.execute_script("return document.body.scrollHeight")
            if new_height == last_height:
                break
            last_height = new_height

When you scroll down, the page should load the HTML, but it can depend on the website. After you scroll to the bottom, you can then obtain driver.page_source.
Hello, apologies for the delayed response. I got caught up with my day job for the last week and only found time yesterday and today to update you on this.
Before I begin, let me thank you for giving me a general guideline on how scrolling works in Selenium with Python. The above code didn't quite scroll for me, but I figured it out with my amended script below. I had to add elm.send_keys because, for some reason, the page wasn't loading the data if I went straight to the end of the page. So after every scroll to the end I needed to scroll up, then scroll down again, and finally loop it as you showed.

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    import time

    driver = webdriver.Firefox()
    driver.get('https://www.espncricinfo.com/series/8039/commentary/65234/australia-vs-pakistan-final-icc-world-cup-1999?innings=1')

    SCROLL_PAUSE_TIME = 2

    # find_element(By.TAG_NAME, ...) replaces find_element_by_tag_name,
    # which newer Selenium releases no longer provide
    elm = driver.find_element(By.TAG_NAME, 'html')

    # Get scroll height
    last_height = driver.execute_script("return document.body.scrollHeight")

    while True:
        # Scroll down to almost the bottom of the page
        driver.execute_script("window.scrollTo(0, (document.body.scrollHeight-400));")

        # Time taken to load the page
        time.sleep(SCROLL_PAUSE_TIME)

        # Scroll up and down to load more data
        elm.send_keys(Keys.HOME)
        time.sleep(1)
        elm.send_keys(Keys.END)
        time.sleep(1)

        # Calculate the new scroll height and compare it to the old height
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break
        last_height = new_height

So the first problem of my project is resolved: the page keeps scrolling until the data stops loading. That leads to problem two, which requires me to get two things from the scrolled webpage:
1) every instance of the text ", no run,"
2) every player that incurred this value
Every instance of this text is in the format "(player name), no run,". I need to store this data by player name, along with how many no runs each player incurred. I checked the page source after the page had completely loaded, but unfortunately it does not show all the data for the webpage, so most probably I would have to inspect the elements? If so, how do I go about it?
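To make the goal concrete, here is roughly the counting step I have in mind once the commentary text is available as a plain string. This is only a sketch under assumptions: the function name count_no_runs and the sample lines are made up for illustration, each commentary entry is assumed to sit on its own line, and if an entry reads "bowler to batsman" the name after " to " is taken as the player. The real page layout may well differ.

```python
from collections import Counter

# Hypothetical sample standing in for text scraped from the page;
# the names and wording are invented for illustration only.
sample_text = """
Bowler X to Player A, no run, defended
Bowler X to Player A, FOUR, driven
Bowler Y to Player B, no run, beaten
Bowler X to Player A, no run, left alone
"""

def count_no_runs(text):
    """Count occurrences of ', no run,' per player name in the text."""
    counts = Counter()
    for line in text.splitlines():
        # Split the line around the first ', no run,' marker, if any
        head, marker, _ = line.partition(", no run,")
        if not marker:
            continue  # this line has no ', no run,' entry
        # Assumption: when the entry reads "bowler to batsman",
        # the player of interest is the part after " to "
        name = head.split(" to ")[-1].strip()
        if name:
            counts[name] += 1
    return counts

counts = count_no_runs(sample_text)
for player, total in counts.most_common():
    print(player, total)
```

With the real page, the string passed in would be whatever text ends up being extracted from the scrolled page (for example from driver.page_source) in place of sample_text.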
Regards
Waqas