Python Forum
Getting a specific text inside an html with soup
#1
Hi, I apologize for the question, but I am new to scraping in Python and I am struggling to access text inside an HTML document. I passed the article's HTML through BeautifulSoup, but I haven't succeeded in getting the text (in bold). I tried children, comments, and different types of navigable strings, but the best I could get was "Google" when using the code below:

link = soup.find_all('p')[i]
article_body.append(link.string)
Thanks in advance for the help. Any suggestions would be very much appreciated.

The HTML code is below:

<div class="o-teaser o-teaser--article o-teaser--small o-teaser--has-image js-teaser" data-id="3bbb6fec-88c5-11e9-a028-86cea8523dc2">
<div class="o-teaser__content">
<div class="o-teaser__meta">
<div class="o-teaser__meta-tag">
<a class="o-teaser__tag" data-trackable="teaser-tag" href="/stream/254cd19f-4724-4c89-9230-926e8201a823">Huawei Technologies Co Ltd</a>
</div>
</div>
<div class="o-teaser__heading">
<a class="js-teaser-heading-link" data-trackable="heading-link" href="/content/3bbb6fec-88c5-11e9-a028-86cea8523dc2">
<span>
<mark class="search-item__highlight">Google</mark> warns of US national security risks from Huawei ban
</span>
</a>
</div>
<p class="o-teaser__standfirst">
<a class="js-teaser-standfirst-link" data-trackable="standfirst-link" href="/content/3bbb6fec-88c5-11e9-a028-86cea8523dc2" tabindex="-1">
<span>
...
<mark class="search-item__highlight">Google</mark> has warned the Trump administration it risks compromising US national security if it pushes ahead with sweeping export restrictions on Huawei, as the technology group seeks to continue doing...
</span>
</a></p><div class="o-teaser__timestamp">
<time class="o-teaser__timestamp-date" datetime="2019-06-07T03:36:51+0000">June 7, 2019</time>
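
One possible approach, sketched under assumptions: `.string` only returns text when a tag has exactly one child, so on a `<p>` that wraps an `<a>`, a `<span>` and a `<mark>` it returns None, and on the `<mark>` alone it returns just "Google". `get_text()` collects every nested string instead. The `html` variable below is a trimmed copy of the standfirst paragraph from the snippet above, the `html.parser` backend is an arbitrary choice, and `article_body` follows the name used in the question.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the standfirst paragraph from the post (assumed input).
html = """
<p class="o-teaser__standfirst">
<a class="js-teaser-standfirst-link" data-trackable="standfirst-link" href="/content/3bbb6fec-88c5-11e9-a028-86cea8523dc2" tabindex="-1">
<span>
...
<mark class="search-item__highlight">Google</mark> has warned the Trump administration it risks compromising US national security if it pushes ahead with sweeping export restrictions on Huawei, as the technology group seeks to continue doing...
</span>
</a></p>
"""

soup = BeautifulSoup(html, "html.parser")

article_body = []
for p in soup.find_all("p", class_="o-teaser__standfirst"):
    # get_text() concatenates all nested strings, including the <mark> text;
    # split()/join() collapses the newlines and extra spaces left by the markup.
    text = " ".join(p.get_text().split())
    article_body.append(text)

print(article_body[0])
```

Filtering on `class_="o-teaser__standfirst"` avoids indexing into every `<p>` on the page. The headline could be read the same way from the `a.js-teaser-heading-link` element, if that is the text wanted.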


Messages In This Thread
Getting a specific text inside an html with soup - by mathieugrimbert - Jul-08-2019, 01:19 PM
