 Web-scraping part-2
I think this should be included in the Selenium tutorial, or at least mentioned.

proper waiting instead of using time.sleep
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

WebDriverWait(browser, 3).until(EC.presence_of_element_located((By.ID, 'global-new-tweet-button')))
This waits for an element with the ID "global-new-tweet-button" to be present in the DOM. If the element is not found within 3 seconds, the wait times out and raises a TimeoutException.


You can find the definition of each expected condition in Selenium's selenium.webdriver.support.expected_conditions module.

a common request to perform key combos
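from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys

# Hold Ctrl, press "s", then release Ctrl
ActionChains(browser).key_down(Keys.CONTROL).send_keys('s').key_up(Keys.CONTROL).perform()
In Firefox, this specific example sends Ctrl+S, which brings up the "Save As" dialog.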

switching or opening tabs
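# Opens a new tab
browser.execute_script("window.open()")

# Switch to the newly opened tab
browser.switch_to.window(browser.window_handles[1])

# Navigate to new URL in new tab
browser.get(new_url)  # new_url is a placeholder for the address you want to load
# Run other commands in the new tab here

You're then able to close the original tab as follows:

# Switch to original tab
browser.switch_to.window(browser.window_handles[0])

# Close original tab
browser.close()

# Switch back to newly opened tab, which is now in position 0
browser.switch_to.window(browser.window_handles[0])

Or close the newly opened tab:

# Close current tab
browser.close()

# Switch back to original tab
browser.switch_to.window(browser.window_handles[0])
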
scrolling to the bottom of the page regardless of length
This is for pages that do not load all of their content until you scroll, such as Facebook. The function scrolls to the bottom of the page, waits for the rest to load (via time.sleep, be aware), and keeps repeating until the page height stops changing.
import time

SCROLL_PAUSE_TIME = 0.5  # seconds to wait for new content; an assumed value, tune it for the site

def scroll_to_bottom(driver):
    # Get scroll height
    last_height = driver.execute_script("return document.body.scrollHeight")

    while True:
        # Scroll down to bottom
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

        # Wait to load page
        time.sleep(SCROLL_PAUSE_TIME)

        # Calculate new scroll height and compare with last scroll height
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            # Height did not change, so we have reached the bottom
            break
        last_height = new_height

# call scroll_to_bottom(browser) when you want it to scroll to the bottom of the page
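
On feeds that never end, scrolling "until the bottom" may never finish, and a fixed sleep is either too long or too short. Below is a rough sketch of a bounded variant that polls for a height change instead of sleeping a fixed amount. The function name scroll_to_bottom_bounded and the parameters max_scrolls, load_timeout and poll are my own inventions for illustration, and the defaults are guesses.

```python
import time


def scroll_to_bottom_bounded(driver, max_scrolls=20, load_timeout=5.0, poll=0.25):
    """Scroll until the page height stops growing or max_scrolls is reached.

    Returns the number of scrolls that loaded new content.
    The name, parameters and defaults are illustrative assumptions.
    """
    get_height = "return document.body.scrollHeight"
    last_height = driver.execute_script(get_height)
    loaded = 0

    for _ in range(max_scrolls):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

        # Poll for a height change instead of sleeping a fixed amount
        deadline = time.monotonic() + load_timeout
        new_height = driver.execute_script(get_height)
        while new_height == last_height and time.monotonic() < deadline:
            time.sleep(poll)
            new_height = driver.execute_script(get_height)

        if new_height == last_height:
            # Nothing new appeared within load_timeout, so treat this as the bottom
            break
        last_height = new_height
        loaded += 1

    return loaded
```

It only needs an object with execute_script, so scroll_to_bottom_bounded(browser, max_scrolls=10) works with any Selenium driver.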
Handle exceptions with Selenium's built-in exception classes:
>>> import selenium.common.exceptions as EX
>>> help(EX)
builtins.Exception(builtins.BaseException)
    WebDriverException
        ErrorInResponseException
        ImeActivationFailedException
        ImeNotAvailableException
        InvalidArgumentException
        InvalidCookieDomainException
        InvalidElementStateException
            ElementNotInteractableException
            ElementNotSelectableException
            ElementNotVisibleException
        InvalidSwitchToTargetException
            NoSuchFrameException
            NoSuchWindowException
        MoveTargetOutOfBoundsException
        NoAlertPresentException
        NoSuchAttributeException
        NoSuchElementException
            InvalidSelectorException
        RemoteDriverServerException
        StaleElementReferenceException
        TimeoutException
        UnableToSetCookieException
        UnexpectedAlertPresentException
        UnexpectedTagNameException
An easy way to test whether JavaScript is what is blocking you in the first place is to turn off JavaScript in your browser and reload the website. If what you are parsing is missing, that is a quick way to determine that it is generated by JavaScript, which means you need Selenium. Another way is to check the JavaScript source code on the website for the element you are parsing. If there is a JavaScript call in the header that generates it, then you will need Selenium to parse it.
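
The same check can be roughed out in code by fetching the page without running any JavaScript and looking for the element in the raw HTML. This is only a sketch using the standard library: the names id_in_static_html and needs_javascript are invented here, it assumes you are looking for an element by its id, it assumes UTF-8 decoding is good enough, and some sites reject urllib's default user agent.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class _IdFinder(HTMLParser):
    """Records whether any tag carries the wanted id attribute."""

    def __init__(self, wanted_id):
        super().__init__()
        self.wanted_id = wanted_id
        self.found = False

    def handle_starttag(self, tag, attrs):
        if dict(attrs).get("id") == self.wanted_id:
            self.found = True


def id_in_static_html(html, element_id):
    """Return True if element_id appears as an id attribute in the raw HTML."""
    finder = _IdFinder(element_id)
    finder.feed(html)
    return finder.found


def needs_javascript(url, element_id):
    """Fetch url without running JavaScript; True means the id is absent."""
    with urlopen(url) as response:
        html = response.read().decode("utf-8", errors="replace")
    return not id_in_static_html(html, element_id)
```

If needs_javascript returns True for an element you can see in the browser, the element is most likely built by JavaScript after the page loads.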