Python Forum
Web Crawler help
#23
I'm not really sure what your current code looks like. When you are collecting sub-URLs, it's often best to clean the code up so you don't get confused.
Quote:In the link with the houses sold, "http://www.funda.nl/nl/koop/verkocht/rotterdam/p1", each property is listed under a different name (even/uneven and the name of the real estate broker, often "nvm" but it can also be another one)
There are a couple of ways. The root search for that is going to be
ul = soup.find('ul', {'class':'object-list'})
Now you can list the ul's li tags via ul.find_all('li') and just go through that list. Or, if you need to, you can go through them one by one via find_next_sibling() to get the next li tag, such as

from bs4 import BeautifulSoup
import requests

url = 'http://www.funda.nl/koop/verkocht/rotterdam/p1/'


req = requests.get(url)
soup = BeautifulSoup(req.text, 'html.parser')
ul = soup.find('ul', {'class':'object-list'}) 
print(ul.li)  # first li; even nvm sold class
li2 = ul.li.find_next_sibling()
print(li2)  # second li; odd nvm sold class
ul.li is pretty much the same as ul.find('li').
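For the find_all route mentioned above, here is a minimal sketch of looping over every li tag. It runs against a small made-up HTML string rather than the live page, so the markup and the class names in SAMPLE_HTML are assumptions for illustration only.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; class names are assumed.
SAMPLE_HTML = """
<ul class="object-list">
  <li class="even nvm sold">House A</li>
  <li class="odd nvm sold">House B</li>
  <li class="even nvm sold">House C</li>
  <li class="ad">Advertisement</li>
  <li class="odd nvm sold">House D</li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
ul = soup.find('ul', {'class': 'object-list'})

# find_all returns every li under the ul as a list, in document order
items = ul.find_all('li')
for index, li in enumerate(items):
    # li.get('class') is a list of class names, e.g. ['even', 'nvm', 'sold']
    print(index, li.get('class'), li.get_text(strip=True))
```

Looping over the list is usually easier to follow than chaining sibling calls when you want every entry.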

So if you did li2.find_next_sibling().find_next_sibling(), it would actually be the li with the ad class. Or, if you used find_all, that would be the fourth li tag in the list (index 3).
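A small sketch of that equivalence, again on made-up markup (the HTML and its class names are assumptions, not the real page). It passes 'li' to find_next_sibling to make explicit which tag is wanted.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the fourth li is the ad entry, as described above.
SAMPLE_HTML = """
<ul class="object-list">
  <li class="even nvm sold">House A</li>
  <li class="odd nvm sold">House B</li>
  <li class="even nvm sold">House C</li>
  <li class="ad">Advertisement</li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
ul = soup.find('ul', {'class': 'object-list'})

first = ul.li                              # same as ul.find('li')
second = first.find_next_sibling('li')     # second li
# two more sibling hops from the second li land on the fourth li
fourth = second.find_next_sibling('li').find_next_sibling('li')

all_items = ul.find_all('li')
# both routes reach the very same tag object in the tree
same_tag = fourth is all_items[3]
print(fourth.get('class'), same_tag)
```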

EDIT:
If you just want a list of the li tags with *sold* in their class, you can use a regex (this needs import re):
import re
li_sold = ul.find_all('li', class_=re.compile('sold'))
This will grab everything except the ad class one.

'sold' would have to be the keyword, as everything else changes (if you're trying to get them all). If you're trying to get only the even classes, swap sold for even in the regex.
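Here is a rough sketch of both filters on made-up markup. The HTML is an assumption, and so is the "uneven" class name, which is borrowed from the quoted question rather than confirmed from the page. One thing to watch: a compiled regex is applied with search, so a pattern of 'even' would also match a class named 'uneven'. Passing the plain string 'even' matches only that exact class name.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup; "uneven" is an assumed class name for illustration.
SAMPLE_HTML = """
<ul class="object-list">
  <li class="even nvm sold">House A</li>
  <li class="uneven nvm sold">House B</li>
  <li class="ad">Advertisement</li>
  <li class="even other sold">House C</li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
ul = soup.find('ul', {'class': 'object-list'})

# every li whose class contains 'sold'; the ad entry is skipped
li_sold = ul.find_all('li', class_=re.compile('sold'))

# regex 'even' also hits 'uneven', because the pattern is searched, not fully matched
li_even_regex = ul.find_all('li', class_=re.compile('even'))

# a plain string matches one class name exactly, so 'uneven' is excluded
li_even_exact = ul.find_all('li', class_='even')

print(len(li_sold), len(li_even_regex), len(li_even_exact))
```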


