Python Forum
Thread Rating:
  • 1 Vote(s) - 3 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Web Crawler help
#31
Since today I have an issue. The BeautifulSoup is not showing all the HTML code of the requested page anymore. 

If i print 
print(soup)
I am not getting all the code that I see when I "Inspect source code" of the web page. Before today this was the same and i had no problem running my code. If i now run the code this is the result:

import requests
from bs4 import BeautifulSoup
import re
 
def fundaSpider(max_pages):
    page = 1
    while page <= max_pages:
        url = 'http://www.funda.nl/koop/rotterdam/p{}'.format(page)
        source_code = requests.get(url)
        plain_text = source_code.text
        soup = BeautifulSoup(plain_text, 'html.parser')
        ads = soup.find_all('li', {'class': 'search-result'})
 
        print(ads)
 
        page += 1
 
fundaSpider(1)
Output:
[]   Process finished with exit code 0
In my web browser I have no problem accessing the web page. Is it possible that the website is blocking the crawler, but not me as a person? Is there any way I can keep running the crawler? (just for the record, I use this crawler only for personal use and run it a few times per week).
Reply


Messages In This Thread
Web Crawler help - by takaa - Feb-06-2017, 06:57 PM
RE: Web Crawler help - by wavic - Feb-06-2017, 08:53 PM
RE: Web Crawler help - by metulburr - Feb-06-2017, 08:57 PM
RE: Web Crawler help - by takaa - Feb-07-2017, 08:46 AM
RE: Web Crawler help - by wavic - Feb-07-2017, 09:46 AM
RE: Web Crawler help - by takaa - Feb-07-2017, 05:17 PM
RE: Web Crawler help - by snippsat - Feb-07-2017, 05:45 PM
RE: Web Crawler help - by metulburr - Feb-07-2017, 05:53 PM
RE: Web Crawler help - by takaa - Feb-07-2017, 10:12 PM
RE: Web Crawler help - by metulburr - Feb-08-2017, 02:33 AM
RE: Web Crawler help - by takaa - Feb-08-2017, 12:22 PM
RE: Web Crawler help - by takaa - Feb-08-2017, 01:31 PM
RE: Web Crawler help - by wavic - Feb-08-2017, 01:47 PM
RE: Web Crawler help - by snippsat - Feb-08-2017, 02:19 PM
RE: Web Crawler help - by takaa - Feb-09-2017, 11:16 AM
RE: Web Crawler help - by metulburr - Feb-09-2017, 12:07 PM
RE: Web Crawler help - by takaa - Feb-09-2017, 12:08 PM
RE: Web Crawler help - by Larz60+ - Feb-09-2017, 12:10 PM
RE: Web Crawler help - by metulburr - Feb-09-2017, 12:14 PM
RE: Web Crawler help - by takaa - Feb-10-2017, 12:24 PM
RE: Web Crawler help - by metulburr - Feb-10-2017, 01:06 PM
RE: Web Crawler help - by takaa - Feb-14-2017, 01:49 PM
RE: Web Crawler help - by metulburr - Feb-14-2017, 02:43 PM
RE: Web Crawler help - by takaa - Feb-14-2017, 02:54 PM
RE: Web Crawler help - by takaa - Feb-15-2017, 11:02 AM
RE: Web Crawler help - by metulburr - Feb-15-2017, 01:18 PM
RE: Web Crawler help - by takaa - Feb-15-2017, 01:46 PM
RE: Web Crawler help - by snippsat - Feb-15-2017, 03:48 PM
RE: Web Crawler help - by takaa - Feb-15-2017, 04:01 PM
RE: Web Crawler help - by metulburr - Feb-15-2017, 06:03 PM
RE: Web Crawler help - by takaa - Feb-20-2017, 03:10 PM
RE: Web Crawler help - by metulburr - Feb-20-2017, 05:52 PM
RE: Web Crawler help - by takaa - Feb-20-2017, 07:56 PM
RE: Web Crawler help - by metulburr - Feb-21-2017, 02:18 AM
RE: Web Crawler help - by takaa - Mar-04-2017, 07:42 PM
RE: Web Crawler help - by metulburr - Mar-05-2017, 01:12 AM
RE: Web Crawler help - by Stoss - Jan-28-2019, 12:39 PM
RE: Web Crawler help - by takaa - Jan-30-2019, 08:35 AM
RE: Web Crawler help - by metulburr - Jan-30-2019, 06:23 PM
RE: Web Crawler help - by stateitreal - Apr-26-2019, 12:14 PM

Possibly Related Threads…
Thread Author Replies Views Last Post
  stucked with my crawler jviure 0 1,770 Apr-15-2020, 10:47 AM
Last Post: jviure
  Web Crawler help Mr_Mafia 2 2,669 Apr-04-2020, 07:20 PM
Last Post: Mr_Mafia
  Crawler Telegram sysx90 0 2,967 Nov-30-2019, 04:32 PM
Last Post: sysx90
  web crawler problems kid_with_polio 3 2,996 Jun-15-2019, 12:33 AM
Last Post: metulburr
  Web Crawler Not Working chrisdas 13 13,965 Feb-06-2017, 10:45 PM
Last Post: scriptso

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020