Python Forum
Web scraping data within tags
#1
Hi, I wrote the following code to extract property details.

At the moment I am trying to extract the area.

import requests
from bs4 import BeautifulSoup

# Load the web page
r = requests.get("https://www.century21.com/for-sale-homes/Westport-CT-20647c", headers={'User-agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0'})
# Grab the content of this page
c = r.content

# Crude check for a block page
if "blocked" in r.text:
    print("we've been blocked")



# Parse the HTML into a searchable tree
soup = BeautifulSoup(c, "html.parser")

# Print out the content
# print(soup.prettify())

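# Find all property cards listed on the page
# (named "cards" to avoid shadowing the built-in all())
cards = soup.find_all("div", {"class": "sr-card js-safe-link"})

# Take the first property card
x = cards[0]

# Print every <li> detail of that property
for li in x.find_all("li"):
    print(li)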
Executing the code above gives me the following printout:

<li class="test-beds">6 beds</li>
<li class="test-baths">9 baths</li>
<li>8,511 sq ft</li>
<li>$370 / sq ft</li>
<li>On Site 2 days</li>
<li>Single Family Residence</li>
My question is: how do I extract the data "8,511 sq ft"?

I tried
print(li[2])
but unfortunately it did not work.

Can someone please point me in the right direction?

Thanks
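
One possible direction, sketched against the printout above rather than the live page (the inline html string and the names items, area_by_position and area_by_text are assumptions made for illustration). After the for loop finishes, li refers to a single tag (the last one), so li[2] looks up an attribute on that tag instead of picking the third list item. Indexing the list returned by find_all() and calling get_text() on the result should give the text:

```python
from bs4 import BeautifulSoup

# Sample markup copied from the printout above, standing in for the live page.
html = """
<ul>
<li class="test-beds">6 beds</li>
<li class="test-baths">9 baths</li>
<li>8,511 sq ft</li>
<li>$370 / sq ft</li>
<li>On Site 2 days</li>
<li>Single Family Residence</li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")
items = soup.find_all("li")

# Index the list returned by find_all(), not the loop variable.
area_by_position = items[2].get_text(strip=True)

# Alternative that does not depend on position: pick the <li> whose text
# ends with "sq ft" but is not the price-per-area entry (which contains "/").
area_by_text = next(
    (li.get_text(strip=True) for li in items
     if li.get_text(strip=True).endswith("sq ft") and "/" not in li.get_text()),
    None,
)

print(area_by_position)
print(area_by_text)
```

In the original script the same idea would be x.find_all("li")[2].get_text(). The position-based lookup assumes every card lists its details in the same order; the text-based lookup is a bit more tolerant if some cards omit beds or baths.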