You've got it most of the way. find_all("h2") returns the h2 tags themselves, so you need to go one step further to the a tag inside each one:
h2.a
or
h2.find('a')
Then you need to get the text and strip the whitespace from the outer edges of it.

    from bs4 import BeautifulSoup

    html = '''
    <div>
        <h2>
            <a href='xxxx'>
                The content I want to print out 1
            </a>
        </h2>
    </div>
    <div>
        <h2>
            <a href='xxxx'>
                The content I want to print out 2
            </a>
        </h2>
    </div>
    '''

    soup = BeautifulSoup(html, "html.parser")
    h2s = soup.find_all("h2")
    for h2 in h2s:
        # h2.a is the first <a> inside the h2; strip() trims the surrounding whitespace
        print(h2.a.text.strip())
Output:
The content I want to print out 1
The content I want to print out 2
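
One thing to watch for: h2.a is None when an h2 has no a tag in it, so h2.a.text would raise an AttributeError on a real page with mixed headings. Here is a rough sketch of a guarded version. The function name heading_link_texts and the extra heading in the sample are made up for illustration, and get_text(strip=True) is used in place of .text.strip().

```python
from bs4 import BeautifulSoup

def heading_link_texts(html):
    """Return the stripped text of the first <a> inside each <h2> (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    texts = []
    for h2 in soup.find_all("h2"):
        a = h2.find("a")  # None when this h2 contains no link
        if a is None:
            continue  # skip headings without a link instead of crashing
        texts.append(a.get_text(strip=True))
    return texts

# Sample markup: the middle heading is an assumed case, not from the original page
sample = """
<div><h2><a href='xxxx'> The content I want to print out 1 </a></h2></div>
<div><h2>No link in this heading</h2></div>
<div><h2><a href='xxxx'> The content I want to print out 2 </a></h2></div>
"""

for text in heading_link_texts(sample):
    print(text)
```

The heading without a link is simply skipped, so only the two link texts are printed.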
If you wanted to get the actual link:
print(h2.a['href'])
Output:
xxxx
xxxx
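
Note that indexing with ['href'] raises a KeyError if an a tag has no href attribute. If that can happen in your page, a.get('href') returns None instead. A minimal sketch, assuming you want the text and the link together; heading_links and the sample markup are invented for the example.

```python
from bs4 import BeautifulSoup

def heading_links(html):
    """Return (text, href) pairs for the first <a> in each <h2> (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for h2 in soup.find_all("h2"):
        a = h2.find("a")
        if a is None:
            continue  # heading has no link at all
        # .get() returns None for a missing attribute instead of raising KeyError
        pairs.append((a.get_text(strip=True), a.get("href")))
    return pairs

# Assumed sample: the second <a> deliberately has no href
sample_links = """
<h2><a href='xxxx'> With href </a></h2>
<h2><a name='anchor'> Without href </a></h2>
"""

for text, href in heading_links(sample_links):
    print(text, href)
```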