Python Forum

Need a little help with an image scraper.
Hello one and all,
I am back with what I hope is a small problem. I found this image scraper that looks like just what I have been looking for. I have been working on this one error for over a week now. I googled it and can't find anything close to what I have here.
Here is the code. I am using Python 2.7.12 on Windows 7:



import requests
import mechanize
from lxml import html
import sys
import urlparse
import time

def grab_my_pictures():
    respones = requests.get(final_link)
    parsed_body = html.fromstring(response.text)
    images = parsed_body.xpath('//img/@src')
    if not images:
        sys.exit("Found no images")
    images = [urlparse.urljoin(response.url, url)for url in images]
    print 'Found %s images' % len(images)

    for url in images[1:50]:
        r = requests.get(url)
        r = open('pics2/%s' % url.split('/')[-1], 'w')
        f.write(r.content)
        f.close()

website_name = raw_input("Please enter the name of the website ")
list_of_links = []
br = mechanize.browser()
br.set_handle_robot(False)
final_website = "http//" + website_name
print final_website
respons = br.open("http//" + website_name)  
There is still some code here whose purpose I do not know; I have been going through it line by line, learning what each line does.
The error is this:


Traceback (most recent call last):
  File "C:/Users/renny and kite/Desktop/play_with_image/test_1.py", line 25, in <module>
    br = mechanize.browser()
AttributeError: 'module' object has no attribute 'browser'



I just don't know what attribute I should put in this function: br = mechanize.browser()

I have been to a few sites, but can't find anything about mechanize.browser()
site one http://wwwsearch.sourceforge.net/mechanize/doc.html
site two  http://stockrt.github.io/p/emulating-a-b...mechanize/
site three http://www.pythonforbeginners.com/mechan...mechanize/ (I found this code, br = mechanize.browser(), on this site, but I could not get it to work).

So no luck so far, which is why I am here again. I hope someone can help; I would like to get this piece of code to work.
Thank you
>>> 'browser' == 'Browser'
False
OK, does that replace this line: br = mechanize.browser()? If not, where do I put 'browser' == 'Browser'?
Thank you
I was trying to hint that an uppercase character is not the same as a lowercase character.
>>> import mechanize
>>> br = mechanize.browser()
Traceback (most recent call last):
 File "<interactive input>", line 1, in <module>
AttributeError: 'module' object has no attribute 'browser'

>>> br = mechanize.Browser()
>>> 
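Since the hint above is about case sensitivity, here is a minimal sketch of how one could check for a name that differs only in case. The helper name case_insensitive_matches is made up for this illustration, and the standard-library collections module stands in for mechanize so the snippet runs without installing anything.

```python
import collections


def case_insensitive_matches(obj, name):
    """Return the attribute names of obj that equal name when case is ignored."""
    wanted = name.lower()
    return [attr for attr in dir(obj) if attr.lower() == wanted]


# Attribute lookup is case-sensitive: 'ordereddict' does not exist,
# even though 'OrderedDict' does.
print(hasattr(collections, "ordereddict"))
print(case_insensitive_matches(collections, "ordereddict"))
```

Calling the same helper as case_insensitive_matches(mechanize, "browser") should, under the assumption that mechanize is installed, point at the capitalized Browser name used in the session above.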
OK, I should have seen that.
Now I get a different error:
Traceback (most recent call last):
  File "C:\Users\renny and kite\Desktop\play_with_image\test_1.py", line 26, in <module>
    br.set_handle_robot(False)
  File "build\bdist.win-amd64\egg\mechanize\_mechanize.py", line 628, in __getattr__
    ".select_form()?)" % (self.__class__, name))
AttributeError: mechanize._mechanize.Browser instance has no attribute set_handle_robot (perhaps you forgot to .select_form()?)

I talked to the guy who said he wrote this code, and he said that it will work just as it is. Yep, like most of the stuff I have found on the web, it takes a lot to make it work.
Thank you for your help
So nobody knows why this code does not work? I would think one person might know what is going on inside this code.

br = mechanize.Browser(what attribute should be here) to make this work?
thank you
So, as far as I can see, mechanize is used just to retrieve the web page. If the page is not built with JavaScript, you don't need mechanize. Use the requests module instead.
Or replace mechanize with selenium.

Also, you never call the grab_my_pictures() function, and I don't see where requests.get(final_link) gets the web address from either.

Are you sure this is the full script?
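To make the points above concrete, here is a rough sketch of how the scraper could look once the page URL is passed into the function and the function is actually called. This is not the original author's code: it assumes Python 3 rather than the 2.7 used in the question, and it swaps requests and lxml for the standard library (urllib and html.parser) so it runs without extra installs. The names ImageSrcParser and extract_image_urls are invented for this example. It also addresses a few things visible in the posted script: "http//" is missing its colon, the response variable is spelled two different ways, the file handle is assigned to r but written through f, and image files should be opened in binary mode.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class ImageSrcParser(HTMLParser):
    """Collect the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def extract_image_urls(page_html, page_url):
    """Return absolute image URLs found in page_html, resolved against page_url."""
    parser = ImageSrcParser()
    parser.feed(page_html)
    return [urljoin(page_url, src) for src in parser.sources]


def grab_my_pictures(page_url, target_dir="pics2", limit=50):
    """Download up to `limit` images from page_url into target_dir."""
    with urlopen(page_url) as response:
        page_html = response.read().decode("utf-8", errors="replace")
    image_urls = extract_image_urls(page_html, page_url)
    os.makedirs(target_dir, exist_ok=True)
    for url in image_urls[:limit]:
        # Fall back to a placeholder name if the URL ends with a slash.
        filename = url.split("/")[-1] or "unnamed"
        path = os.path.join(target_dir, filename)
        # "wb": image data is bytes, so the file must be opened in binary mode.
        with urlopen(url) as image, open(path, "wb") as f:
            f.write(image.read())
    return image_urls


# Offline demonstration of the parsing step only (no network access needed).
sample = '<html><body><img src="/a.png"><img src="b/c.jpg"><img alt="no src"></body></html>'
print(extract_image_urls(sample, "http://example.com/gallery/"))
```

To run it against a real site, one would call something like grab_my_pictures("http://" + website_name), with the colon after http. Note that the original loop used images[1:50], which skips the first image; the sketch starts from the first one instead.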
Again a typing error, the same kind as you had on the previous line.
This is a little silly; you should catch this yourself.
Just copy and paste the code, as it's not your code anyway.
>>> import mechanize
>>> br = mechanize.Browser()
>>> br.set_handle_robot(False)
Traceback (most recent call last):
 File "<interactive input>", line 1, in <module>
 File "C:\Python27\lib\site-packages\mechanize\_mechanize.py", line 628, in __getattr__
   ".select_form()?)" % (self.__class__, name))
AttributeError: mechanize._mechanize.Browser instance has no attribute set_handle_robot (perhaps you forgot to .select_form()?)
>>> # Add s
>>> br.set_handle_robots(False)
>>>
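For misspellings like the missing "s" above, a small helper built on the standard-library difflib module can list the closest real attribute names. This is only a sketch: suggest_attribute and FakeBrowser are hypothetical names made up for the example, and FakeBrowser is a stand-in so the snippet runs without mechanize installed.

```python
import difflib


def suggest_attribute(obj, misspelled, limit=3):
    """Return public attribute names of obj that closely resemble `misspelled`."""
    public_names = [name for name in dir(obj) if not name.startswith("_")]
    return difflib.get_close_matches(misspelled, public_names, n=limit)


class FakeBrowser(object):
    """Hypothetical stand-in for a browser object; not the mechanize class."""

    def set_handle_robots(self, handle):
        self.handle_robots = handle

    def open(self, url):
        raise NotImplementedError("illustration only")


# The misspelled name from the traceback; the best match is listed first.
print(suggest_attribute(FakeBrowser(), "set_handle_robot"))
```

With mechanize installed, passing a real mechanize.Browser() instance in place of FakeBrowser() should work the same way, since the helper only relies on dir().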
Thanks wavic, I will try that. snippsat, I changed it to br.set_handle_robots a while back; it does not help, I still get the same error.
I will start working on this later today.
Thank you all for all the help