Python Forum

Made a very simple email grabber (scraper)
What I need to know is what this error means and how to fix it.

I have had problems installing things into Python, so I don't know if Selenium is installed correctly.
I am new, so my code may be bad too.
Here is the code:

from selenium import webdriver
import re

driver = webdriver.Chrome()
driver.get('http://www.networksecuritybybluedog.com/')


doc = driver.page_source

emails = re.findall(r'[\w\.-]+@[\w\.-]+', doc)

for email in emails:
    print(email)
You can see there is not much to the code: it makes the request, downloads the page source, and assigns it to the variable doc,
then uses re to parse doc. The for loop prints out all the emails.
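
To make the parsing step concrete, here is a small sketch that runs the same pattern on a made-up string instead of a live page. The helper name find_emails and the sample text are illustrative assumptions, not part of the original script. One thing it shows: because the character class allows dots, an address at the end of a sentence is matched together with the trailing period, so this sketch trims it.

```python
import re

# Same pattern as in the script above
EMAIL_PATTERN = re.compile(r'[\w\.-]+@[\w\.-]+')


def find_emails(text):
    """Hypothetical helper: return unique matches in the order found."""
    found = []
    for match in EMAIL_PATTERN.findall(text):
        # The pattern also swallows a trailing "." or "-", so trim those
        cleaned = match.rstrip('.-')
        if cleaned not in found:
            found.append(cleaned)
    return found


# Made-up sample text, standing in for driver.page_source
sample = "Write to bob@example.com or sales@example.org. Again: bob@example.com"
print(find_emails(sample))
# prints ['bob@example.com', 'sales@example.org']
```

The pattern is loose on purpose. It will also match things that are not real addresses, so treat the output as candidates.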

I hope I did it right.
Here is the error:

Traceback (most recent call last):
  File "C:/Users/renny and kite/Desktop/email_scraper.py", line 4, in <module>
    driver = webdriver.Chrome()
  File "C:\Python27\lib\site-packages\selenium-3.0.1-py2.7.egg\selenium\webdriver\chrome\webdriver.py", line 62, in __init__
    self.service.start()
  File "C:\Python27\lib\site-packages\selenium-3.0.1-py2.7.egg\selenium\webdriver\common\service.py", line 71, in start
    os.path.basename(self.path), self.start_error_message)
WebDriverException: Message: 'chromedriver' executable needs to be in PATH. Please see https://sites.google.com/a/chromium.org/...river/home

I went to the site, but it did not help me any.
It sounds like it can't find chromedriver. I looked for it and could not find it.

I hope someone can help; I would like to see if this works.
Thank you
Download the driver.
Give the path to chromedriver.exe;
if no path is given, chromedriver.exe needs to be on PATH or in the folder of the running script.
driver = webdriver.Chrome(executable_path="C:/driver_folder/chromedriver.exe")
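
If it is unclear whether chromedriver.exe is on PATH at all, a quick check like the one below may help before starting Selenium. This is only a sketch: find_on_path is a hypothetical helper written for this post, not something Selenium provides, and it simply looks in each PATH folder for a file with the given name.

```python
import os


def find_on_path(name, path_value=None):
    """Hypothetical helper: return the full path of the first file called
    `name` found in the PATH folders, or None if there is none."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    for folder in path_value.split(os.pathsep):
        folder = folder.strip('"')
        if not folder:
            continue  # ignore empty PATH entries
        candidate = os.path.join(folder, name)
        if os.path.isfile(candidate):
            return candidate
    return None


location = find_on_path("chromedriver.exe")
if location:
    print("chromedriver found at " + location)
else:
    print("chromedriver is not on PATH; give its full path to webdriver.Chrome")
```

If it prints the second message, either add the driver's folder to PATH or pass the full path as shown in the line above.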
Thank you,
I found out it was not installed, so I installed it and it worked fine.
Hello all,

Well, I could not get Chrome to work. It said it was installed, but I cannot find it on the computer, and it will not work.
Change of plan: a new email scraper, built from the old one.
Here it is. It works fine. I just have to use what came with Python, since I can't get anything else to install and work. New program:

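import urllib2  # part of the Python 2 standard library
import re

# raw_input returns the typed text as a string;
# Python 2's input() would try to evaluate it as code
address = raw_input("Type in the website you want to scrape: ")

# Download the page source
response = urllib2.urlopen(address)
html = response.read()

# Find everything that looks like an email address
emails = re.findall(r'[\w\.-]+@[\w\.-]+', html)

for email in emails:
    print(email)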
Thank you for all the help.
If you want to use this scraper, have at it; there are a lot of emails out there.

What I would like to know now is the best way to get my little bug to take all the links and look for more emails.
I want it to keep going as long as there are new links to follow. How do I do that? Or point me to some place where I can find the info.
I am having a great time learning this stuff.

Blue dog
Hello! 
Try BeautifulSoup.
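
As a rough sketch of the "keep following links" idea: keep a queue of pages to visit and a set of pages already seen, pull the emails and the links out of each page, and add any new link to the queue. The version below uses the standard library's html.parser in place of BeautifulSoup, since installing packages was a problem earlier in the thread, and it is written for Python 3, where urllib2 became urllib.request. The names LinkCollector, fetch_page, and crawl, the same-site restriction, and the max_pages cap are all assumptions made for this example.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
from urllib.request import urlopen

EMAIL_PATTERN = re.compile(r'[\w\.-]+@[\w\.-]+')


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_page(url):
    """Download a page and decode it as text (undecodable bytes are replaced)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_page, max_pages=20):
    """Breadth-first crawl that stays on the starting site.

    Returns the set of email-like strings found. `max_pages` is an
    assumed safety cap so the loop cannot run forever.
    """
    site = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    emails = set()
    pages_done = 0
    while queue and pages_done < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        pages_done += 1
        # Trim the trailing "." or "-" that the loose pattern can pick up
        emails.update(m.rstrip(".-") for m in EMAIL_PATTERN.findall(html))
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Make the link absolute and drop any "#fragment" part
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == site and link not in seen:
                seen.add(link)
                queue.append(link)
    return emails
```

Calling crawl("http://www.networksecuritybybluedog.com/") would start from the page used earlier in the thread. The seen set is what stops the crawler from visiting the same page twice, and the fetch argument lets you swap in a fake downloader to try the logic without touching a real site.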