Python Forum
how to print out all the link <a> under each h2 section using beautifulsoup
#1
Hi all,

I am new to Python, so forgive me if my question sounds silly. I am trying to print out all the <a> links under each h2 section.
The HTML structure looks like this:

<div>
<h2>
<a href='xxxx'>
The content I want to print out 1
</a>
</h2>
</div>

<div>
<h2>
<a href='xxxx'>
The content I want to print out 2
</a>
</h2>
</div>



And the code I am using looks like this:

import requests
from bs4 import BeautifulSoup

r = requests.get("http://xxxxxxxxx/")
source_code = r.text
soup = BeautifulSoup(source_code, "html.parser").find_all("h2")
for link in soup:
    print(link)


But how can I print out the results like:

The content I want to print out 1
The content I want to print out 2

Thanks a lot for the help!

BR,
Henry
#2
You've got it most of the way. find_all("h2") returns the whole h2 tags, so you need to get the a tag inside each one with h2.a or h2.find('a'). Then take its text and strip the whitespace from both ends.

from bs4 import BeautifulSoup

html = '''
<div>
<h2>
<a href='xxxx'>
The content I want to print out 1
</a>
</h2>
</div>

<div>
<h2>
<a href='xxxx'>
The content I want to print out 2
</a>
</h2>
</div>
'''

soup = BeautifulSoup(html, "html.parser")
h2s = soup.find_all("h2")
for h2 in h2s:
    # h2.a is the first <a> inside the heading; strip() trims the surrounding newlines
    print(h2.a.text.strip())
Output:
The content I want to print out 1
The content I want to print out 2
If you want the actual link instead, read the href attribute inside the same loop:
    print(h2.a['href'])
Output:
xxxx
xxxx
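
One thing to watch for: if a page has an h2 with no link inside it, h2.a is None and h2.a.text raises an AttributeError. Below is a rough sketch of one way around that, using the CSS selector "h2 a" so only links nested in an h2 are matched. The helper name h2_links and the extra link-less heading in the sample are made up for illustration; they are not from the original page.

```python
from bs4 import BeautifulSoup

# Sample markup based on the question, plus one hypothetical heading without a link.
SAMPLE_HTML = """
<div>
<h2>
<a href='xxxx'>
The content I want to print out 1
</a>
</h2>
</div>

<div>
<h2>No link in this heading</h2>
</div>

<div>
<h2>
<a href='xxxx'>
The content I want to print out 2
</a>
</h2>
</div>
"""


def h2_links(html):
    """Return (text, href) pairs for every <a> nested inside an <h2>."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    # "h2 a" is a descendant selector: only <a> tags inside an <h2> match,
    # so headings without a link are skipped instead of raising an error.
    for a in soup.select("h2 a"):
        text = a.get_text(strip=True)  # text with surrounding whitespace removed
        href = a.get("href")  # None if the attribute is missing
        pairs.append((text, href))
    return pairs


for text, href in h2_links(SAMPLE_HTML):
    print(text, href)
```

For the real page you would pass r.text from the requests call into h2_links instead of the sample string.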
#3
(Feb-02-2018, 02:51 AM)metulburr Wrote: print(h2.a.text.strip())
Thank you so much for your kind help. It works!

Best Regards,
Henry