Python Forum

Webscraping data within tags
Hi, I wrote the following code to extract property details.

At the moment I am trying to extract the area.

import requests
from bs4 import BeautifulSoup

# Load the webpage
r = requests.get("https://www.century21.com/for-sale-homes/Westport-CT-20647c", headers={'User-agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0'})
# Grab the content of this page
c = r.content

if "blocked" in r.text:
    print("we've been blocked")

# Parse the content so it can be searched
soup = BeautifulSoup(c, "html.parser")

# Print out the content
# print(soup.prettify())

# Find all the listed property cards (named "cards" to avoid shadowing the built-in all())
cards = soup.find_all("div", {"class": "sr-card js-safe-link"})

# Take the first property card
x = cards[0]

# Print every <li> item in that card
for li in x.find_all("li"):
    print(li)
Executing the code above, I get the following printout:

<li class="test-beds">6 beds</li>
<li class="test-baths">9 baths</li>
<li>8,511 sq ft</li>
<li>$370 / sq ft</li>
<li>On Site 2 days</li>
<li>Single Family Residence</li>
My question is: how do I extract the data "8,511 sq ft"?

I tried
print(li[2])
but unfortunately it did not work.

Can someone please point me in the right direction?

Thanks
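
One possible direction, as a minimal sketch rather than a definitive answer: inside the loop, li is a single tag, and indexing a tag looks up an HTML attribute, so li[2] does not pick the third item. The list returned by find_all("li") is what can be indexed. The snippet below uses a static copy of the printout above instead of the live page (the surrounding <ul> and the names card, items, area_by_position and area_by_text are assumptions for illustration), and it also shows a text-based match in case the position differs between listings.

```python
from bs4 import BeautifulSoup

# Static copy of the <li> items from the printout, so no network request is needed.
# The wrapping <ul> is an assumption; the real card markup may differ.
html = """
<ul>
<li class="test-beds">6 beds</li>
<li class="test-baths">9 baths</li>
<li>8,511 sq ft</li>
<li>$370 / sq ft</li>
<li>On Site 2 days</li>
<li>Single Family Residence</li>
</ul>
"""

card = BeautifulSoup(html, "html.parser")

# find_all returns a list-like result, so index that list, not a single <li> tag.
items = card.find_all("li")
area_by_position = items[2].get_text(strip=True)

# Position may shift between listings, so matching on the text is less brittle:
# take the first item that ends with "sq ft" and is not the price-per-area line.
area_by_text = next(
    (
        li.get_text(strip=True)
        for li in items
        if li.get_text(strip=True).endswith("sq ft") and "/" not in li.get_text()
    ),
    None,
)

print(area_by_position)
print(area_by_text)
```

In the original script the same idea would apply to x.find_all("li") in place of items, assuming the live page still returns the markup shown in the printout.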