Thread: Need a little help with an image scraper. (Python Forum, Web Scraping & Web Development)
Need a little help with an image scraper. - Blue Dog - Dec-21-2016

Hello one and all, I am back with what I hope is a small problem. I found this image scraper that looks like just what I have been looking for. I have been working on this one error for over a week now. I googled it and I can't find anything that is close to what I have here. Here is the code; I am using Python 2.7.12 on Win 7:

    import requests
    import mechanize
    from lxml import html
    import sys
    import urlparse
    import time

    def grab_my_pictures():
        respones = requests.get(final_link)
        parsed_body = html.fromstring(response.text)
        images = parsed_body.xpath('//img/@src')
        if not images:
            sys.exit("Found no images")
        images = [urlparse.urljoin(response.url, url)for url in images]
        print 'Found %s images' % len(images)
        for url in images[1:50]:
            r = requests.get(url)
            r = open('pics2/%s' % url.split('/')[-1], 'w')
            f.write(r.content)
            f.close()

    website_name = raw_input("Please enter the name of the website ")
    list_of_links = []
    br = mechanize.browser()
    br.set_handle_robot(False)
    final_website = "http//" + website_name
    print final_website
    respons = br.open("http//" + website_name)

There is still some code that I do not understand; I have been taking it line by line, learning what each line does. The error is this:

    Traceback (most recent call last):
      File "C:/Users/renny and kite/Desktop/play_with_image/test_1.py", line 25, in <module>
        br = mechanize.browser()
    AttributeError: 'module' object has no attribute 'browser'

I just don't know what attribute I should put in this function: br = mechanize.browser(). I have been to a few sites, but can't find anything about mechanize.browser():

site one: http://wwwsearch.sourceforge.net/mechanize/doc.html
site two: http://stockrt.github.io/p/emulating-a-browser-in-python-with-mechanize/
site three: http://www.pythonforbeginners.com/mechanize/browsing-in-python-with-mechanize/

I found this code [br = mechanize.browser()] on this site, but I could not get it to work.
So no luck so far, which is why I am here again. I hope someone can help; I would like to get this piece of code to work. Thank you

RE: Need a little help with an image scraper. - snippsat - Dec-21-2016

    >>> 'browser' == 'Browser'
    False

RE: Need a little help with an image scraper. - Blue Dog - Dec-21-2016

OK, does that replace this: br = mechanize.browser()? If not, where do I put 'browser' == 'Browser'? Thank you

RE: Need a little help with an image scraper. - snippsat - Dec-21-2016

I tried to give a hint that a capital letter is not the same as a lowercase letter.

    >>> import mechanize
    >>> br = mechanize.browser()
    Traceback (most recent call last):
      File "<interactive input>", line 1, in <module>
    AttributeError: 'module' object has no attribute 'browser'
    >>> br = mechanize.Browser()
    >>>

RE: Need a little help with an image scraper. - Blue Dog - Dec-21-2016

OK, I should have seen that. Now I get a lot of errors:

    Traceback (most recent call last):
      File "C:\Users\renny and kite\Desktop\play_with_image\test_1.py", line 26, in <module>
        br.set_handle_robot(False)
      File "build\bdist.win-amd64\egg\mechanize\_mechanize.py", line 628, in __getattr__
        ".select_form()?)" % (self.__class__, name))
    AttributeError: mechanize._mechanize.Browser instance has no attribute set_handle_robot (perhaps you forgot to .select_form()?)

I talked to the guy who said he wrote this code; he said that it will work just as it is. Yep, like most of the stuff I have found on the web, it takes a lot to make it work. Thank you for your help

RE: Need a little help with an image scraper. - Blue Dog - Dec-23-2016

So nobody knows why this code does not work? I would think one person might know what is going on inside this code. br = mechanize.Browser(what attribute should be here) to make this work? Thank you

RE: Need a little help with an image scraper. - wavic - Dec-23-2016

So, as far as I can see, mechanize is used just to retrieve the web page. If the page is not built with JavaScript, you don't need mechanize.
Use the requests module instead. Or replace mechanize with selenium. Also, you never call the grab_my_pictures() function. I don't see where requests.get(final_link) gets the web address from either. Are you sure this is the full script?

RE: Need a little help with an image scraper. - snippsat - Dec-23-2016

Again a typing error, the same kind as you had on the previous line. This is a little silly; you should catch this yourself. Just copy and paste the code, as it's not your code anyway.

    >>> import mechanize
    >>> br = mechanize.Browser()
    >>> br.set_handle_robot(False)
    Traceback (most recent call last):
      File "<interactive input>", line 1, in <module>
      File "C:\Python27\lib\site-packages\mechanize\_mechanize.py", line 628, in __getattr__
        ".select_form()?)" % (self.__class__, name))
    AttributeError: mechanize._mechanize.Browser instance has no attribute set_handle_robot (perhaps you forgot to .select_form()?)
    >>> # Add s
    >>> br.set_handle_robots(False)
    >>>

RE: Need a little help with an image scraper. - Blue Dog - Dec-24-2016

Thanks wavic, I will try that. snippsat, I changed it to br.set_handle_robots a while back; it does not help, I still get the same error. I will start to work on this later today. Thank you all for all the help
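
For anyone landing on this thread later, here is a rough sketch of wavic's suggestion: drop mechanize and fetch the page with requests alone. It is not the original author's script, only one way it could look under some assumptions. It is written for Python 3 (the thread used 2.7). The helper names build_url, find_image_urls, filename_for and ImgSrcParser, and the folder and limit parameters, are made up for illustration. The standard library's html.parser stands in for lxml so the helpers run without extra installs. It also sidesteps a few things visible in the posted code: "http//" is missing its colon, final_link is never defined, the response is stored as respones but read as response, the opened file is assigned to r but written through f, and the file is opened in text mode 'w' rather than binary.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ImgSrcParser(HTMLParser):
    """Collect the src attribute of every <img> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def build_url(website_name):
    """Prefix a bare host name with a scheme; note the colon in 'http://'."""
    if website_name.startswith(("http://", "https://")):
        return website_name
    return "http://" + website_name


def find_image_urls(page_html, base_url):
    """Return absolute URLs for all <img src=...> values in page_html."""
    parser = ImgSrcParser()
    parser.feed(page_html)
    return [urljoin(base_url, src) for src in parser.sources]


def filename_for(url):
    """Use the last path segment of the URL as the local file name."""
    return os.path.basename(urlparse(url).path) or "unnamed"


def grab_my_pictures(final_link, folder="pics2", limit=50):
    """Download up to `limit` images found on final_link into `folder`."""
    # Third-party package; imported here so the helpers above need only the stdlib.
    import requests

    response = requests.get(final_link)
    images = find_image_urls(response.text, response.url)
    if not images:
        raise SystemExit("Found no images")
    print("Found %s images" % len(images))
    os.makedirs(folder, exist_ok=True)
    # The posted code sliced images[1:50], which skips the first image;
    # this sketch starts from the first one instead.
    for url in images[:limit]:
        r = requests.get(url)
        # Binary mode, because r.content is bytes.
        with open(os.path.join(folder, filename_for(url)), "wb") as f:
            f.write(r.content)
```

To try it, call grab_my_pictures(build_url(website_name)) with the site name typed in at the prompt. Unlike the posted script, the function is actually called and receives its address as an argument.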