Python Forum
BeautifulSoup4 plugin help
#1
I'm new to Python and am having difficulty understanding some code.

When using the BeautifulSoup4 library, I want to search through the source text of a page, so I have:
for a in soup.find_all('a', {'class': 'item-name'}):
where soup is the variable holding the soup made from the source text.

What does ('a',{'class':'item-name'}) exactly mean?
#2
Quote:What does ('a',{'class':'item-name'}) exactly mean?
This means: find all anchor (<a>) tags whose class is item-name and return them in a list.
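To make that concrete, here is a minimal sketch. The HTML is made up for illustration (it is not the page from the question): the first argument is the tag name, and the dictionary lists attributes the tag must have.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this example
html = '''
<a class="item-name" href="/a">First</a>
<a class="other" href="/b">Second</a>
<span class="item-name">Not a link</span>
<a class="item-name" href="/c">Third</a>
'''
soup = BeautifulSoup(html, 'html.parser')

# 'a' -> only <a> tags; {'class': 'item-name'} -> only those with that class
links = soup.find_all('a', {'class': 'item-name'})

# The result behaves like a list of tags, so it can be looped over
texts = [a.text for a in links]
hrefs = [a['href'] for a in links]
print(texts)
print(hrefs)
```

The <span> is skipped because it is not an <a> tag, and the second link is skipped because its class does not match.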
#3
Look at this CodePen
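The class and id attributes are what the CSS file uses to refer to tags.
When parsing, we use the same references to find the tags we need.
It's also easier to skip the dictionary argument: just copy the CSS class and pass it with the class_ keyword (with a trailing underscore, because class is a reserved word in Python).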
soup.find_all('a', {'class': 'item-name'})

# Better
soup.find_all('a', class_='item-name')
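A quick way to check that the two spellings are interchangeable is to run both on the same soup. This is only a sketch on made-up HTML. It also shows that a tag carrying several classes still matches when one of them is item-name.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for this example
html = (
    '<a class="item-name" href="/a">First</a>'
    '<a class="item-name sale" href="/b">Second</a>'
    '<a href="/c">Third</a>'
)
soup = BeautifulSoup(html, 'html.parser')

by_dict = soup.find_all('a', {'class': 'item-name'})
by_keyword = soup.find_all('a', class_='item-name')

# Compare by href so the check does not depend on how tags are compared
hrefs_dict = [a['href'] for a in by_dict]
hrefs_keyword = [a['href'] for a in by_keyword]
print(hrefs_dict == hrefs_keyword)
print(hrefs_keyword)
```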
Using the HTML from the CodePen:
from bs4 import BeautifulSoup

# Simulate a web page
html = '''\
<body>
  <div id='images'>
    <a href='image1.html'>My image 1 <br/><img src='https://i.picsum.photos/id/237/200/300.jpg'/></a>
  </div>
  <div>
    <p class="car">
      <a class="color_black" href="Link to bmw">BMW black model</a>
      <a class="color_red" href="Link to opel">Opel red model</a>
    </p>
  </div>
</body>
'''
soup = BeautifulSoup(html, 'html.parser')
Test usage in two ways: find/find_all, or select/select_one, which take a CSS selector.
>>> soup.find('a', class_="color_red")
<a class="color_red" href="Link to opel">Opel red model</a>
>>> soup.find('a', class_="color_red").text
'Opel red model'
>>> 
>>> # Using CSS selector
>>> soup.select('.color_red')
[<a class="color_red" href="Link to opel">Opel red model</a>]
>>> soup.select_one('.color_red').text
'Opel red model'
>>> 
>>> # id 
>>> soup.select_one('#images')
<div id="images">
<a href="image1.html">My image 1 <br/><img src="https://i.picsum.photos/id/237/200/300.jpg"/></a>
</div>
>>> soup.select_one('#images').img.get('src')
'https://i.picsum.photos/id/237/200/300.jpg'
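For comparison, here is a small sketch that runs find_all and select side by side on a trimmed copy of the sample HTML. The variable names are only for illustration. The 'p.car a' selector is an extra example of what CSS selectors allow: every <a> nested inside <p class="car">.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the sample HTML used in this post
html = '''\
<div>
  <p class="car">
    <a class="color_black" href="Link to bmw">BMW black model</a>
    <a class="color_red" href="Link to opel">Opel red model</a>
  </p>
</div>
'''
soup = BeautifulSoup(html, 'html.parser')

# Same query written two ways; both return list-like results
found = [a.text for a in soup.find_all('a', class_='color_red')]
selected = [a.text for a in soup.select('a.color_red')]

# Descendant selector: all <a> tags inside <p class="car">
in_car = [a['href'] for a in soup.select('p.car a')]

print(found)
print(selected)
print(in_car)
```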
Tutorial part-1, part-2.