Webscraping BeautifulSoup - Insecam

blckpstv · Feb-06-2017, 08:06 PM

Hey,

I'm having trouble getting something to work with the insecam.org site. The BeautifulSoup examples work fine for me.
But when I try them with insecam.org, I get 403 errors. I then tried something with user-agent headers (I understand the concept, but not how to use them).
Still nothing works.

Does anybody have tips on how to do this? I want to take all the src='...' values in all the a tags from all the views in Japan.

Any help would be appreciated, because at the moment I'm stuck...

***snippsat*** · Feb-06-2017, 08:34 PM

>>> import requests
>>> url_get = requests.get('http://insecam.org/')
>>> url_get.status_code
403
>>> 
>>> user_agent = {'User-agent': 'Mozilla/5.0'}
>>> url_get = requests.get('http://insecam.org/', headers=user_agent)
>>> url_get.status_code
200
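
If several pages will be requested from the same site, the header can be set once on a session instead of being passed on every call. A minimal sketch, assuming the same Mozilla/5.0 value keeps working for the site; the fetch helper name is made up for illustration:

```python
import requests

# One session reused for all requests, so the header is set only once.
session = requests.Session()
session.headers.update({'User-Agent': 'Mozilla/5.0'})


def fetch(url):
    """Return the page HTML, raising an exception on 4xx/5xx (e.g. 403)."""
    response = session.get(url, timeout=10)
    response.raise_for_status()
    return response.text
```

Calling fetch('http://insecam.org/') would then send the header automatically; raise_for_status() turns a 403 into an exception instead of silently returning an error page.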

blckpstv · (This post was last modified: Feb-06-2017, 11:41 PM by snippsat.)

Ok no more 403 errors! Thanks.

Next, I'm trying to get the element with id=image0, which is the only object that is returned.
I adapted this from a Kochi Coders example:

from bs4 import BeautifulSoup
import requests
 
user_agent = {'User-agent': 'Mozilla/5.0'}
url_get = requests.get('http://insecam.org/', headers=user_agent)

soup = BeautifulSoup(page.read())
nofollow = soup.find_all('a',id_='image0')
for all image0 in nofollow:
print(nofollow['src']+","+nofollow.string)

I get errors when using class = img-responsive img-rounded detailimage:

Error:
File "<stdin>", line 1
    for all img-responsive img-rounded detailimage in nofollow:
              ^
SyntaxError: invalid syntax

When using the id, as in the example above, I get the following error.

Error:
for all image0 in nofollow:
  File "<stdin>", line 1
    for all image0 in nofollow:
                 ^
SyntaxError: invalid syntax

I find it hard to find comprehensive beginner examples for this, and the documentation is a little overwhelming at the moment.

***snippsat*** · (This post was last modified: Feb-06-2017, 11:53 PM by snippsat.)

Look at the BBCode help; I have fixed it now.
Post code with correct indentation.

You are making a basic error in the loop.
There cannot be two values after for.
E.g.:

nofollow = ['pic1', 'pic2']
for image in nofollow:
    print(image)

Output:
pic1
pic2

It can be two values, but then you must use enumerate():

nofollow = ['pic1', 'pic2']
for number,image in enumerate(nofollow, 1):
    print('{} --> {}'.format(number, image))

Output:
1 --> pic1
2 --> pic2
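
enumerate() is one way to get two loop variables. More generally, the loop target can be two names whenever each item is itself a pair. A small sketch with made-up file names and titles (they are placeholders, not data from the site):

```python
# Each item is a (src, title) pair, so the loop can unpack it into two names.
cameras = [('pic1.jpg', 'Tokyo'), ('pic2.jpg', 'Osaka')]
lines = []
for src, title in cameras:
    lines.append('{},{}'.format(src, title))

# zip() pairs up two separate lists in the same way.
sources = ['pic1.jpg', 'pic2.jpg']
titles = ['Tokyo', 'Osaka']
zipped = ['{},{}'.format(src, title) for src, title in zip(sources, titles)]

print(lines)
```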

***Ofnuts*** · Feb-07-2017, 07:49 AM

Not:

for all image0 in ...

But:

for image0 in ...

(i.e., no all)

wavic · Feb-07-2017, 09:40 AM

Hello! There is no page object in your code. Even if you were using the real Response object from your code, url_get, it has no read() method. Instead of read() (which comes from the urllib module), you can get the page from the Response object with text or content: url_get.text, url_get.content
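
Putting the points from this thread together, the parsing step might look roughly like the sketch below. The HTML string is a stand-in so the example runs offline; it assumes the id and class values mentioned earlier sit on img tags, which may not match the real page. With requests, url_get.text would be passed to BeautifulSoup instead.

```python
from bs4 import BeautifulSoup

# Stand-in HTML (an assumption, not the real insecam.org markup).
html = """
<a href="/view/1/">
<img id="image0" class="img-responsive img-rounded detailimage" src="http://example.com/cam1.jpg" title="Camera 1">
</a>
<a href="/view/2/">
<img id="image1" class="img-responsive img-rounded detailimage" src="http://example.com/cam2.jpg" title="Camera 2">
</a>
"""

soup = BeautifulSoup(html, 'html.parser')

# id is a plain keyword argument; only class needs the trailing underscore.
first = soup.find('img', id='image0')
print(first['src'])

# One loop variable per item; read attributes from the item, not from the list.
images = soup.find_all('img', class_='detailimage')
sources = [image['src'] for image in images]
for number, image in enumerate(images, 1):
    print('{} --> {},{}'.format(number, image['src'], image.get('title', '')))
```

Note that id_='image0' in the original code would look for an attribute literally named id_, so it would match nothing; class_ is the only keyword that BeautifulSoup treats specially in this way.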
