Feb-07-2017, 10:12 PM
Thanks a million!
I have added a new function that goes through the page of each individual listing:

import requests
from bs4 import BeautifulSoup

def get_single_item_data(item_url):
    # Download the listing page and parse it
    source_code = requests.get(item_url)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text, 'html.parser')
    # Print the link found inside each breadcrumb item
    for item in soup.find_all('li', {'class': 'breadcrumb-listitem'}):
        area = item.find('a')
        print(area)

I am trying to get one specific piece of information (the neighborhood). The info I am looking for does show up in the output, but along with a lot of stuff that I don't want. I only want the "title" (in the first example, "Lombardijen").
The output for each listing is as follows (I only copied the results for the first 3 listings).
Output:
<a href="/koop/" title="Home">Home</a>
<a href="/koop/rotterdam/" title="Rotterdam">Rotterdam</a>
<a href="/koop/rotterdam/lombardijen/" title="Lombardijen">Lombardijen</a>
None
<a href="/koop/" title="Home">Home</a>
<a href="/koop/rotterdam/" title="Rotterdam">Rotterdam</a>
<a href="/koop/rotterdam/s-gravenland/" title="'s-Gravenland">'s-Gravenland</a>
None
<a href="/koop/" title="Home">Home</a>
<a href="/koop/rotterdam/" title="Rotterdam">Rotterdam</a>
<a href="/koop/rotterdam/pendrecht/" title="Pendrecht">Pendrecht</a>
None
The HTML for each listing looks as follows. The part I am trying to extract is the title of the neighborhood link ("Lombardijen" in this example).
<div class="breadcrumb">
<ol class="container breadcrumb-list">
<li class="breadcrumb-listitem">
<a href="/koop/" title="Home">Home</a>
<span class="icon-arrow-right-grey"></span>
</li>
<li class="breadcrumb-listitem">
<a href="/koop/rotterdam/" title="Rotterdam">Rotterdam</a>
<span class="icon-arrow-right-grey"></span>
</li>
<li class="breadcrumb-listitem">
<a href="/koop/rotterdam/lombardijen/" title="Lombardijen">Lombardijen</a>
<span class="icon-arrow-right-grey"></span>
</li>
<li class="breadcrumb-listitem">
<span title="Scottstraat 3">Scottstraat 3</span>
</li>
</ol>
</div>
I hope this part of my puzzle can also be solved. Thanks again for the help!
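
For reference, here is a minimal sketch of one way this could be approached. It rests on two assumptions: the neighborhood is always the last link in the breadcrumb, and the final item (the street address) never contains a link. The names `get_neighborhood` and `SAMPLE_HTML` are made up for illustration, and the sample markup is copied from the HTML above so the sketch runs without a network request. It also shows where the `None` lines in the output come from: the last `<li>` holds a `<span>` rather than an `<a>`, so `find('a')` returns `None` for it.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the listing HTML shown in the post.
SAMPLE_HTML = """
<div class="breadcrumb">
<ol class="container breadcrumb-list">
<li class="breadcrumb-listitem">
<a href="/koop/" title="Home">Home</a>
<span class="icon-arrow-right-grey"></span>
</li>
<li class="breadcrumb-listitem">
<a href="/koop/rotterdam/" title="Rotterdam">Rotterdam</a>
<span class="icon-arrow-right-grey"></span>
</li>
<li class="breadcrumb-listitem">
<a href="/koop/rotterdam/lombardijen/" title="Lombardijen">Lombardijen</a>
<span class="icon-arrow-right-grey"></span>
</li>
<li class="breadcrumb-listitem">
<span title="Scottstraat 3">Scottstraat 3</span>
</li>
</ol>
</div>
"""


def get_neighborhood(html):
    """Return the title of the last breadcrumb link, or None if there is none."""
    soup = BeautifulSoup(html, 'html.parser')
    items = soup.find_all('li', {'class': 'breadcrumb-listitem'})
    # find('a') returns None for the address item, which has a <span> only,
    # so those entries are filtered out here.
    links = [item.find('a') for item in items]
    links = [link for link in links if link is not None]
    if not links:
        return None
    # Assumption: the neighborhood is the last link in the breadcrumb.
    # .get('title') reads the attribute value instead of printing the whole tag.
    return links[-1].get('title')


print(get_neighborhood(SAMPLE_HTML))  # prints: Lombardijen
```

Inside `get_single_item_data`, the same idea would mean passing `plain_text` to a helper like this instead of printing every `area`. If the breadcrumb depth differs between listings, picking the last link may not always give the neighborhood, so that is worth checking against a few pages.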