(Apr-30-2021, 02:49 PM)spacedog Wrote: but those use bueatifulsoup which is too slow.
Can use
lxml
as parser in BS,i always do that.
soup = BeautifulSoup(response.content, 'lxml')
For your task if find
ul
XPath under products then can iterate over all
li
which will be children of that tag.
from lxml import html
import requests
url = 'https://www.yellowpages.com/nationwide/mip/toxglobal-diagnostics-llc-556885209?lid=1002050417627'
resonse = requests.get(url)
tree = html.fromstring(resonse.content)
prod = tree.xpath('//*[@id="business-info"]/dl/dd[3]/ul')
for tag in prod[0].getchildren():
print(tag.text)
Output:
Walk In Clinic
DNA Paternity Testing
Genetic Testing
Wellness Programs
Senior Citizen Wellness
Blood Tests
Free Clinics
Health Coaches
Employee Health Programs
Toxicology Labs
Flu Shots