Python Forum
Selenium - bypass Cloudflare bot detection
#1
Hello,

Cloudflare detects my scraper and blocks access to the site.
I have tried selenium_stealth, which seems to pass the bot detection tests at https://bot.sannysoft.com/ but does not get past Cloudflare.

Any advice please?

Here is my code:

import time
from selenium import webdriver
from selenium_stealth import stealth
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions()

options.add_argument("start-maximized")
#options.add_argument("--headless")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
options.add_argument("--disable-blink-features=AutomationControlled")
driver = webdriver.Chrome(options=options)

stealth(driver,
        languages=["en-US", "en"],
        vendor="Google Inc.",
        platform="Win32",
        webgl_vendor="Intel Inc.",
        renderer="Intel Iris OpenGL Engine",
        fix_hairline=True,
        )


driver.get('https://www.sanparks.org/reservations/accommodation/filters/parks/113/arrivalDate/2022-12-11/departureDate/2022-12-31/camps/0%7C116/types/0/features/0')
#driver.get('https://bot.sannysoft.com/')

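# Give the page (and any Cloudflare check) time to run before waiting on content.
time.sleep(10)

try:
    # Raises TimeoutException if the 'load-more' element never appears, e.g. when blocked.
    WebDriverWait(driver, 30).until(EC.presence_of_element_located((By.CLASS_NAME, 'load-more')))

    soup = BeautifulSoup(driver.page_source, 'html.parser')
    print(soup.contents)
finally:
    # Always close the browser, even if the wait above times out.
    driver.quit()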
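
To tell whether the page that came back is a Cloudflare interstitial rather than the real content, one option is to look at the page title before parsing. This is only a rough sketch: the helper name `looks_like_cloudflare_challenge` is hypothetical, and the marker strings are assumptions based on commonly seen Cloudflare challenge and block pages, not an official or complete list.

```python
import re

# Assumed title fragments of Cloudflare challenge/block pages (not exhaustive).
CHALLENGE_TITLE_MARKERS = ("just a moment", "attention required")


def looks_like_cloudflare_challenge(page_source: str) -> bool:
    """Return True if the HTML title matches one of the assumed markers."""
    match = re.search(
        r"<title[^>]*>(.*?)</title>", page_source, re.IGNORECASE | re.DOTALL
    )
    if match is None:
        # No title at all: nothing to compare against.
        return False
    title = match.group(1).strip().lower()
    return any(marker in title for marker in CHALLENGE_TITLE_MARKERS)
```

It could be called as `looks_like_cloudflare_challenge(driver.page_source)` right after `driver.get(...)`, so the script can report a block instead of waiting 30 seconds for an element that will never appear.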
#2
OK, it seems I have solved the problem.

Use undetected_chromedriver
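
For anyone landing here later, a minimal sketch of what that swap might look like. The function name `fetch_page_source` and its parameters are hypothetical, the `load-more` class name is carried over from the code in post #1, and whether this gets past Cloudflare on a given site or version is not guaranteed.

```python
def fetch_page_source(url: str, wait_class: str = "load-more", timeout: int = 30) -> str:
    """Load `url` with undetected_chromedriver and return the rendered HTML."""
    # Imported lazily so this module still imports where the
    # third-party packages are not installed.
    import undetected_chromedriver as uc
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = uc.ChromeOptions()
    options.add_argument("--start-maximized")

    # uc.Chrome is a patched replacement for webdriver.Chrome.
    driver = uc.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until the element with the given class name is in the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CLASS_NAME, wait_class))
        )
        return driver.page_source
    finally:
        # Close the browser even if the wait times out.
        driver.quit()
```

Calling `fetch_page_source(...)` with the sanparks.org URL from post #1 returns the HTML, which can then go to BeautifulSoup as before. The `stealth(...)` call and the `excludeSwitches` options from the original script are left out here on the assumption that the patched driver makes them unnecessary.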
#3
I'm trying this right now, based on something I saw in another discussion on this group, but don't know the syntax for that last line. It doesn't like just "row" in the append. The fetchall is returning a tuple.
#4
I am also facing a similar situation.
#5
Use the ScrapingBypass web scraping API, which can help you get past Cloudflare.

Quote:
import requests

url = "https://api.scrapingbypass.com/"
method = "GET"
headers = {
    "x-cb-apikey": r"your api key",
    "x-cb-host": r"www.sanparks.org",
}
response = requests.request(method, url, headers=headers)
print(response.text)
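
The same request could be wrapped with a timeout and a status check, so a failed call raises instead of printing an error page. This is a sketch only: the endpoint and the `x-cb-apikey` / `x-cb-host` header names are copied from the quote above and have not been checked against the service's documentation, and `build_headers` and `fetch_via_api` are hypothetical helper names.

```python
API_URL = "https://api.scrapingbypass.com/"  # endpoint as given in the quote above


def build_headers(api_key: str, target_host: str) -> dict:
    """Build the two headers used in the quoted example."""
    return {
        "x-cb-apikey": api_key,
        "x-cb-host": target_host,
    }


def fetch_via_api(api_key: str, target_host: str, timeout: float = 30.0) -> str:
    """Send the GET request and return the response body as text."""
    # Imported lazily so the helper above works without requests installed.
    import requests

    response = requests.get(
        API_URL,
        headers=build_headers(api_key, target_host),
        timeout=timeout,  # avoid hanging forever on a stalled connection
    )
    # Raise requests.HTTPError on 4xx/5xx instead of returning an error body.
    response.raise_for_status()
    return response.text
```

Usage would be something like `fetch_via_api("your api key", "www.sanparks.org")`.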