Python Forum

Full Version: Helping out a friend - simple question
Hi,

I'm scraping a website and I don't know how to print only part of the text.

from bs4 import BeautifulSoup
import requests

source = requests.get('http://www.website.com').text

soup = BeautifulSoup(source, 'lxml')

title = soup.find('div', class_='pr_title')
print(title.text)

sku = soup.find('div', class_='pr_infos')
print(sku.text)
The text that comes out looks like this:

2008 xxxxx 7xxxxx



2 xxxx, xxx-xxx"
Sxxxxxx
ENGINE : Diesel
CAP: x,xxx lbs SKU: 1234

I just want the "SKU: 1234" part.

Thanks
You probably need to drill down on the div.
Change your code by commenting out the two print statements (lines 9 and 12 of your script)
and add this line after line 12:
print(sku.prettify())
Then rerun it and post the results in output tags.
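For anyone following along, here is a minimal sketch of what that inspection step shows. The HTML below is made up for illustration (the real page's markup was never posted in this thread), and it uses the built-in 'html.parser' rather than 'lxml' only so that it runs without an extra install.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the structure of the real page is unknown.
html = """
<div class="pr_infos">
  <p>ENGINE : Diesel</p>
  <p>CAP: x,xxx lbs <span>SKU: 1234</span></p>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')
sku = soup.find('div', class_='pr_infos')

# prettify() returns the tag and its children as indented HTML,
# which makes the nesting under the div visible.
pretty = sku.prettify()
print(pretty)

# stripped_strings yields each piece of text under the div separately,
# with surrounding whitespace removed.
pieces = list(sku.stripped_strings)
print(pieces)
```

With the real page, the printed HTML would show which child tag (if any) wraps the SKU, and that tag is what you would pass to find().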
Thanks.
How about this part?
Quote:And post results in output tags
Sorry, I don't understand. So I subscribed to:

- https://www.reddit.com/r/learnpython/
- and I'm continuing my education here: https://automatetheboringstuff.com/chapter11/

Regards
Let me explain:
  • In post 2 I asked you to insert a print statement on a particular line,
  • to rerun your code,
  • and to post the results of the print statement so I could see what was under the div tag.

You are getting all that text because your div element has children, each with its own text,
which is why you see multiple lines of output. If I could see what the div is composed of, it would be possible
to drill down to the actual 'sku' text.
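Until the div's structure is known, one possible workaround is to pull the SKU out of the combined text with a regular expression instead of drilling down through the tags. This is only a sketch: the helper name extract_sku is hypothetical, the sample text is copied from the output posted above, and it assumes the SKU value contains no spaces.

```python
import re

# Sample text copied from the output earlier in the thread.
div_text = """2 xxxx, xxx-xxx"
Sxxxxxx
ENGINE : Diesel
CAP: x,xxx lbs SKU: 1234
"""

def extract_sku(text):
    """Return the 'SKU: ...' fragment from text, or None if it is absent."""
    # Match 'SKU:' followed by optional whitespace and one run of
    # non-whitespace characters (assumed to be the SKU value).
    match = re.search(r'SKU:\s*\S+', text)
    return match.group(0) if match else None

sku_line = extract_sku(div_text)
print(sku_line)
```

In the original script this would be called as extract_sku(sku.text). It is more brittle than selecting the right child tag, since it depends on the wording of the page text rather than its markup.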