Mar-04-2017, 07:42 PM
(Feb-21-2017, 02:18 AM)metulburr Wrote: I had a site that I kept having trouble with, and I just ended up using selenium to bring up the browser so I could manually enter the captcha, then let my bot automate everything else. If you truly hit a roadblock with JavaScript or captchas, this will always work as a backup.
Your web crawler via BeautifulSoup would be the same; it's just grabbing the HTML with selenium instead of requests.
Since I don't need to run the script often, entering the captcha manually would be a perfect solution for me. I have tried to open the browser through selenium using the following code, but nothing happens.
import requests
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
open('output.csv', 'w').close()
import re

browser = webdriver.Firefox()

def fundaSpider(max_pages):
    page = 1
    while page <= max_pages:
        url = 'http://www.funda.nl/koop/rotterdam/p{}'.format(page)
        browser.get('url')
        source_code = selenium.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text, 'html.parser')
        ads = soup.find_all('li', {'class': 'search-result'})
        for ad in ads:
Output: Process finished with exit code 0
I have Firefox Developer Edition installed. If you can point me in the right direction on how to use selenium, that would be very helpful.
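For reference, here is a rough sketch of how I understand the approach from the quote: open the page in a selenium-driven browser, pause so the captcha can be solved by hand, then hand `browser.page_source` to BeautifulSoup instead of using requests. The names `fetch_pages`, `find_ads`, `main` and the `wait_for_user` parameter are made up for this illustration, and it assumes geckodriver is installed and on the PATH. It is not tested against funda.nl, so treat it as a starting point rather than a working solution.

```python
def fetch_pages(browser, max_pages, wait_for_user=input):
    """Yield (url, html) for each results page using an already-open browser.

    `wait_for_user` is a hypothetical hook: by default it blocks on input()
    so the captcha can be solved by hand in the browser window.
    """
    for page in range(1, max_pages + 1):
        url = 'http://www.funda.nl/koop/rotterdam/p{}'.format(page)
        browser.get(url)  # pass the variable, not the string 'url'
        if page == 1:
            wait_for_user('Solve the captcha in the browser, then press Enter...')
        # The rendered HTML comes from the driver, not from requests.
        yield url, browser.page_source


def find_ads(html):
    """Parse one page of HTML and return the search-result list items."""
    from bs4 import BeautifulSoup  # imported here so fetch_pages works without bs4
    soup = BeautifulSoup(html, 'html.parser')
    return soup.find_all('li', {'class': 'search-result'})


def main():
    from selenium import webdriver  # assumes geckodriver is on the PATH
    browser = webdriver.Firefox()
    try:
        for url, html in fetch_pages(browser, max_pages=2):
            print(url, len(find_ads(html)))
    finally:
        browser.quit()  # close the browser even if scraping fails
```

Nothing in this sketch runs until `main()` is actually called at the bottom of the script; defining the functions alone does not start the crawl.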