Python Forum
How does this work?
#1
Here is an email scraper:

# OK, I know what line one does:
import re
# Line 2: I know what this does:
import urllib2
# Line 3: res is a variable equal to the urllib2.urlopen function
res=urllib2.urlopen('http://www.networksecuritybybluedog.com')
# Line 4: txt is a variable; res.read() reads from res and loads the result into txt, I think.
txt = res.read()
# Line 5: emails is a variable; re.findall() searches txt for a pattern, in this case email addresses.
emails=re.findall(r'[\w\.-]+@[\w\.-]+',txt)
# Line 6: a for loop going through all the emails it finds
for email in emails:
    # Line 7: this prints out the email address
    print "email"
I put comments in the code above describing what I think each line of the script does.
Could someone who knows what they are doing read the comments and let me know if I am right or wrong?
Thank you
#2
In the last line, I believe you mean
email
and not
"email"
You want to print the value of the variable email, not the literal string "email".
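
A minimal sketch of the difference, written for Python 3 (where print is a function) rather than the Python 2 of the original script. The addresses and the two list names are made up for illustration:

```python
# Hypothetical sample data standing in for the result of re.findall()
emails = ["alice@example.com", "bob@example.org"]

printed_literal = []  # what the loop yields when "email" is quoted
printed_values = []   # what the loop yields when email is the variable

for email in emails:
    printed_literal.append("email")  # the string literal, identical on every pass
    printed_values.append(email)     # the value the loop variable currently holds

print(printed_literal)
print(printed_values)
```

With the quotes, every pass produces the same five-letter word; without them, each pass produces the next address in the list.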

res is actually the file-like response object that urlopen() returns, not the function itself.
https://docs.python.org/2/library/urllib...b2.urlopen

read() is a method that reads up to the given number of bytes; with no arguments it reads the entire response.
https://docs.python.org/2/library/urllib2.html#examples
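
Here is a rough sketch of how read() behaves, assuming Python 3 and using io.BytesIO as a stand-in for the response object so that it runs without a network connection. The page content is invented. In Python 3 the function lives at urllib.request.urlopen, and read() returns bytes, so the text has to be decoded before a str pattern can search it:

```python
import io
import re

# Stand-in for the object urlopen() returns; the page content is made up.
res = io.BytesIO(b"<p>Contact: admin@example.com or sales@example.org</p>")

first = res.read(3)  # with an argument: reads at most 3 bytes
rest = res.read()    # with no argument: reads everything that is left

# Python 3 gives bytes, so decode before searching with a str pattern
txt = (first + rest).decode("utf-8")

# Same pattern as the original script
emails = re.findall(r'[\w\.-]+@[\w\.-]+', txt)
for email in emails:
    print(email)
```

Note that a second call to read() continues from where the first one stopped, which is why the two pieces are joined back together here.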
#3
   
Thank you. Yes, I see what I did wrong; that is why it printed the string "email" and not the address.
Now it prints the addresses.
I am happy.