Python Forum

Trouble selecting attribute with Beautiful Soup
I'm writing my first program to download images from a website and have run into a roadblock.
I can't seem to get the src attribute from the variable 'img_url', which holds the div that contains the image.

After running the code, I get None in the printed result instead of the URL.

import requests
from bs4 import BeautifulSoup

count = 1
url = f'https://xkcd.com/{count}/'
file_name = f'savedimage0{count}.png'

while True:
    page = requests.get(url)
    print(f"Status code: {page.status_code} - page read successfully!")
    soup = BeautifulSoup(page.content, 'html.parser')
    img_url = soup.find(id='comic')
    img_url = img_url.get('src')
    print(img_url)
When I print img_url before calling img_url.get('src'), this is what displays:

<div id="comic">
<img alt="Barrel - Part 1" src="//imgs.xkcd.com/comics/barrel_cropped_(1).jpg" style="image-orientation:none" title="Don't we all."/>
</div>
I figured it out. .get('src') only reads the attributes of the tag it is called on, and the variable held the div, which has no src attribute of its own.
In my case the img element was nested inside the div.

I ended up selecting the div, then the img inside it, and after that I could read the attribute.

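while True:
    page = requests.get(url)
    print(f"Status code: {page.status_code} - page read successfully!")
    soup = BeautifulSoup(page.content, 'html.parser')
    # The div with id="comic" wraps the image; it has no src of its own
    comic_container = soup.find(id='comic')
    # The nested img tag is the element that carries the src attribute
    img_container = comic_container.find('img')
    img_url = img_container.get('src')
    print(img_url)
    break  # url never changes inside the loop, so stop after one page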
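For anyone who wants to see the difference without hitting the site, here is a minimal sketch that parses the HTML printed earlier in the thread. The names html, div_src, and img_src are only for illustration.

```python
from bs4 import BeautifulSoup

# HTML copied from the printed output earlier in the thread
html = '''<div id="comic">
<img alt="Barrel - Part 1" src="//imgs.xkcd.com/comics/barrel_cropped_(1).jpg" style="image-orientation:none" title="Don't we all."/>
</div>'''

soup = BeautifulSoup(html, 'html.parser')
comic_container = soup.find(id='comic')

# .get() looks only at the attributes of the tag it is called on.
# The div has an id attribute but no src, so this is None.
div_src = comic_container.get('src')

# The nested img tag is the one that carries src.
img_src = comic_container.find('img').get('src')

print(div_src)  # None
print(img_src)  # //imgs.xkcd.com/comics/barrel_cropped_(1).jpg
```

Note that the src value is protocol-relative (it starts with //), so it would need a scheme such as https: in front before it could be passed to requests.get.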
FYI
Searching by id only may get you in trouble if more than one tag has the same id.
It's better to use: comic_container = soup.find('div', {'id': 'comic'})
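As a small sketch of that tip, assume a made-up page where a span and a div share the same id. The HTML below is hypothetical and not taken from xkcd.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two different tags reuse id="comic"
html = '''<span id="comic">placeholder</span>
<div id="comic"><img src="//example.com/a.png"/></div>'''

soup = BeautifulSoup(html, 'html.parser')

# Searching by id alone returns the first match in document order: the span
by_id = soup.find(id='comic')

# Adding the tag name narrows the search to div elements
by_tag_and_id = soup.find('div', {'id': 'comic'})

print(by_id.name)          # span
print(by_tag_and_id.name)  # div
print(by_tag_and_id.find('img').get('src'))  # //example.com/a.png
```

If two divs shared the id, both calls would still return the first div, so the tag name only helps when the duplicates are different kinds of tag.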
(Jan-29-2022, 08:03 PM) Larz60+ wrote: FYI
Searching by id only may get you in trouble if more than one tag has the same id.
It's better to use: comic_container = soup.find('div', {'id': 'comic'})

OK, thanks for the tip :)