Python Forum
How to Capture Data After Selenium Scroll
#3
(Jul-30-2019, 09:07 PM)metulburr Wrote: This is a better method to scroll to the bottom. It automatically identifies the end of the page instead of scrolling an arbitrary amount (and it is much faster too).
import time

def scroll_to_bottom(driver):
    # driver is the Selenium WebDriver instance passed in by the caller
    SCROLL_PAUSE_TIME = 0.5
    # Get scroll height
    last_height = driver.execute_script("return document.body.scrollHeight")

    while True:
        # Scroll down to bottom
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

        # Wait to load page
        time.sleep(SCROLL_PAUSE_TIME)

        # Calculate new scroll height and compare with last scroll height
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            # Height stopped growing, so nothing more was loaded
            break
        last_height = new_height
When you scroll down, the page should load the HTML, but this can depend on the website. After you scroll to the bottom, you can then obtain driver.page_source.
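As a rough sketch of that last step, the loop can be wrapped so that it hands back the page source once the height stops changing. The function name scroll_and_get_source and the pause and max_rounds parameters are made up for illustration; max_rounds is only a safety cap, added on the assumption that some pages never stop growing.

```python
import time


def scroll_and_get_source(driver, pause=0.5, max_rounds=100):
    """Scroll until the page height stops growing, then return driver.page_source.

    `driver` is expected to be a Selenium WebDriver (anything that offers
    execute_script() and page_source). `pause` and `max_rounds` are
    illustrative parameters, not part of Selenium.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")

    for _ in range(max_rounds):
        # Jump to the current bottom of the page
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

        # Give the site time to load more content
        time.sleep(pause)

        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            # No growth since the last scroll, so assume the end was reached
            break
        last_height = new_height

    return driver.page_source
```

With a real browser this would be called as html = scroll_and_get_source(driver) after driver.get(...). Whether the returned source contains everything still depends on how the site renders its content.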

Hello, apologies for the delayed response. I got caught up with my day job for the last week & just made some time yesterday and today for updating you guys on this.

Before I begin, let me thank you a ton for giving me a general guideline on how scrolling works in Selenium Python. The above code didn't exactly scroll for me, but I figured it out using my amended script below. I had to add
elm.send_keys
because, for some weird reason, the page wasn't loading the data if I went straight to the end of the page. So after every scroll to the end I needed to scroll up and then down again, and finally loop it as you showed.

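from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import time


driver = webdriver.Firefox()
driver.get('https://www.espncricinfo.com/series/8039/commentary/65234/australia-vs-pakistan-final-icc-world-cup-1999?innings=1')

SCROLL_PAUSE_TIME = 2
# find_element(By.TAG_NAME, ...) also works on current Selenium 4 releases,
# where the older find_element_by_tag_name helper is no longer available
elm = driver.find_element(By.TAG_NAME, 'html')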

# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")

while True:
    # Scroll down to almost the bottom of the page
    driver.execute_script("window.scrollTo(0, (document.body.scrollHeight-400));")

    # Time Taken to Load the page
    time.sleep(SCROLL_PAUSE_TIME)

    # Scrolling Up & Down to load more Data
    elm.send_keys(Keys.HOME)
    time.sleep(1)
    elm.send_keys(Keys.END)
    time.sleep(1)

    # Calculate the new scrolling height and then compare it to old height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
So the first problem of my project is resolved: the page gets scrolled until the data stops loading. That leads us to problem two, which requires me to get two things from this scrolled webpage:

1) every instance of the text ", no run,"
2) every player that incurred this value

Every instance of this text is in the format "(player name), no run,". I need to store this data by player name, along with how many no runs he or she incurred. I checked the page source after the page had completely loaded, but unfortunately it does not show all the data for the webpage, so most probably I would have to inspect the elements? If so, how do I go about it?

Regards
Waqas
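
The counting part of problem two could be sketched roughly as below, assuming the loaded commentary is already available as one plain string (for example from driver.page_source or from an element's .text). The function name count_no_runs, the regular expression, and the sample text are all made up for illustration; the pattern assumes a name made of letters, spaces, dots, apostrophes, or hyphens directly in front of ", no run," and would need adjusting to the page's real markup.

```python
import re
from collections import Counter

# Assumed format: "<player name>, no run," with the name made of letters,
# spaces, dots, apostrophes, or hyphens. Adjust this to the real page text.
NO_RUN_PATTERN = re.compile(r"([A-Za-z][A-Za-z .'-]*), no run,")


def count_no_runs(text):
    """Return a Counter mapping each player name to its ', no run,' count."""
    return Counter(name.strip() for name in NO_RUN_PATTERN.findall(text))


# Made-up sample text, not taken from the real page
sample = (
    "Player A, no run, defended back to the bowler\n"
    "Player B, FOUR, driven through the covers\n"
    "Player A, no run, left alone outside off\n"
    "Player B, no run, beaten on the outside edge\n"
)

counts = count_no_runs(sample)
for player, total in counts.most_common():
    print(player, total)
```

A Counter keeps the per-player tally in one pass and can be sorted with most_common(). If the page puts other words in front of the name on the same line, the pattern will capture those as well, so it is worth checking a few matches by hand first.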
Messages In This Thread
RE: How to Capture Data After Selenium Scroll - by ahmedwaqas92 - Aug-05-2019, 02:08 PM
