Python Forum
Webscraping BeautifulSoup - Insecam
#1
Hey,

I'm having trouble getting something to work with the insecam.org site. I am able to use the BeautifulSoup examples.
But when I try the same thing with insecam.org, I get 403 errors. I then tried something with user-agent headers (I understand the concept, but not how to use them).
Still nothing works.

Does anybody have some tips on how to do this? I want to take every src="..." from all the a="..." entries for all the views in Japan.

Any help would be appreciated, because at the moment I'm stuck...
#2
>>> import requests
>>> url_get = requests.get('http://insecam.org/')
>>> url_get.status_code
403
>>> 
>>> user_agent = {'User-agent': 'Mozilla/5.0'}
>>> url_get = requests.get('http://insecam.org/', headers=user_agent)
>>> url_get.status_code
200
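A minimal sketch of the same idea for several requests in a row, assuming you want the header sent every time: set it once on a requests.Session. The helper names make_session and fetch are made up for this illustration, and the 'Mozilla/5.0' value is simply the one used above.

```python
import requests


def make_session(user_agent="Mozilla/5.0"):
    """Return a Session that sends the given User-Agent on every request."""
    session = requests.Session()
    # Session-level headers are merged into each request made through it
    session.headers.update({"User-Agent": user_agent})
    return session


def fetch(session, url):
    """Download a page and return its HTML, raising on 4xx/5xx statuses."""
    response = session.get(url, timeout=10)
    response.raise_for_status()  # a 403 becomes an exception, not a silent failure
    return response.text


session = make_session()
print(session.headers["User-Agent"])
```

Calling fetch(session, 'http://insecam.org/') would then perform the actual download; it is left out of the sketch so the block runs without network access.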
#3
Ok no more 403 errors! Thanks.

Next, I'm trying to get the element with id=image0, which is the only object that gets returned.
I'm using this, adapted from a Kochi Coders example:

from bs4 import BeautifulSoup
import requests
 
user_agent = {'User-agent': 'Mozilla/5.0'}
url_get = requests.get('http://insecam.org/', headers=user_agent)

soup = BeautifulSoup(page.read())
nofollow = soup.find_all('a',id_='image0')
for all image0 in nofollow:
print(nofollow['src']+","+nofollow.string)
When I use the class (img-responsive img-rounded detailimage) instead, I get this error:

Error:
  File "<stdin>", line 1
    for all img-responsive img-rounded detailimage in nofollow:
              ^
SyntaxError: invalid syntax
When I use the id, as in the example above, I get the following error:

Error:
  File "<stdin>", line 1
    for all image0 in nofollow:
                 ^
SyntaxError: invalid syntax
I find it hard to find comprehensive beginner examples for this, and the documentation is a little overwhelming at the moment.
#4
Look at the BBCode help; I have fixed your post now.
Post code with correct indentation.

You are making a basic error in the loop.
A for loop cannot take two names written like that.
E.g.:
nofollow = ['pic1', 'pic2']
for image in nofollow:
    print(image) 
Output:
pic1
pic2
It can be two values, but then you must use enumerate():
nofollow = ['pic1', 'pic2']
for number,image in enumerate(nofollow, 1):
    print('{} --> {}'.format(number, image))
Output:
1 --> pic1
2 --> pic2
#5
Not:
for all image0 in ...
But:
for image0 in ...
(i.e., no "all")
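A small sketch of the corrected loop shape. The list of dicts is made-up stand-in data playing the role of whatever find_all() returns, and the 'title' key is an assumption for illustration. Note that inside the loop you index the loop variable (one item), not the whole list as print(nofollow['src'] ...) did.

```python
# Stand-in data: plain dicts take the place of the tags returned by find_all()
nofollow = [
    {"src": "cam1.jpg", "title": "Camera 1"},
    {"src": "cam2.jpg", "title": "Camera 2"},
]

lines = []
for image in nofollow:  # exactly one loop variable, no "all"
    # Use the loop variable here; nofollow["src"] would index the list itself
    lines.append(image["src"] + "," + image["title"])

print("\n".join(lines))
```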
#6
Hello! There is no Response object named page in your code. Even if you were using the real one from your code, url_get, that object has no read method. Instead of read() (which comes from the urllib module), you can get the page from the Response object with text or content: url_get.text, url_get.content
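Putting that together, here is a rough sketch of the parsing step. The sample HTML is invented so the block runs offline; the real markup on insecam.org is an assumption, as is the helper name image_sources. Two details worth knowing: in find_all() only class needs the trailing underscore (class_), while an id is matched with plain id='image0', and src is normally an attribute of img tags rather than a tags.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the downloaded page (url_get.text)
SAMPLE_HTML = """
<div>
  <a href="/view/1/"><img id="image0" class="img-responsive img-rounded detailimage" src="http://example.com/cam1.jpg"></a>
  <a href="/view/2/"><img id="image1" class="img-responsive img-rounded detailimage" src="http://example.com/cam2.jpg"></a>
</div>
"""


def image_sources(html):
    """Return the src of every <img> that carries the 'detailimage' class."""
    soup = BeautifulSoup(html, "html.parser")
    # class_ matches a single CSS class even when the tag has several
    images = soup.find_all("img", class_="detailimage")
    return [img["src"] for img in images if img.has_attr("src")]


for src in image_sources(SAMPLE_HTML):
    print(src)
```

With the live page you would pass url_get.text in place of SAMPLE_HTML, and adjust the tag and class names to whatever the real page uses.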