Python Forum

Trouble selecting attribute with beautiful soup - bananatoast - Jan-29-2022

I'm writing my first program to download images from a website and have run into a roadblock.
I can't seem to get the src attribute out of the variable 'img_url', which holds the div that contains the image.

After running the code, the print shows None instead of the URL.

import requests
from bs4 import BeautifulSoup

count = 1
url = f'https://xkcd.com/{count}/'
file_name = f'savedimage0{count}.png'

while True:
    page = requests.get(url)
    print(f"Status code: {page.status_code} - page read successfully!")
    soup = BeautifulSoup(page.content, 'html.parser')
    img_url = soup.find(id='comic')
    img_url = img_url.get('src')
    print(img_url)

When I print img_url before calling img_url.get('src'), this is what displays:

<div id="comic">
<img alt="Barrel - Part 1" src="//imgs.xkcd.com/comics/barrel_cropped_(1).jpg" style="image-orientation:none" title="Don't we all."/>
</div>


RE: Trouble selecting attribute with beautiful soup - bananatoast - Jan-29-2022

I figured it out. .get('src') only reads attributes of the element it is called on.
In my case the img element was nested inside the div, and the div itself has no src attribute, which is why I got None.

I ended up selecting the div, then the img inside it, and after that it let me read the attribute.

while True:
    page = requests.get(url)
    print(f"Status code: {page.status_code} - page read successfully!")
    soup = BeautifulSoup(page.content, 'html.parser')
    # the div with id="comic" wraps the image
    comic_container = soup.find(id='comic')
    # the img tag nested inside that div
    img_container = comic_container.find('img')
    # src is an attribute of the img tag, not of the div
    img_url = img_container.get('src')
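
For anyone landing here later, a minimal sketch of the difference, run against a static copy of the markup from the first post rather than a live request. The page_url value and the urljoin step are assumptions added for illustration: the src in that markup starts with //, so it probably needs a scheme before it can be downloaded.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Static copy of the markup shown in the first post, so no network access is needed.
html = '''
<div id="comic">
<img alt="Barrel - Part 1" src="//imgs.xkcd.com/comics/barrel_cropped_(1).jpg" style="image-orientation:none" title="Don't we all."/>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')
comic_container = soup.find(id='comic')

# The div has an id attribute but no src, so .get('src') returns None.
div_src = comic_container.get('src')

# The src attribute lives on the nested img tag.
img_src = comic_container.find('img').get('src')

# The src is protocol-relative (starts with //), so resolve it against the page URL.
page_url = 'https://xkcd.com/1/'  # assumption: the page the markup came from
full_url = urljoin(page_url, img_src)

print(div_src)
print(img_src)
print(full_url)
```

Tag.get() behaves like dict.get() here: a missing attribute gives None instead of raising an error, which is why the original code failed silently.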



RE: Trouble selecting attribute with beautiful soup - Larz60+ - Jan-29-2022

FYI
Searching by id alone may get you in trouble if more than one tag has the same id.
It's better to use: comic_container = soup.find('div', {'id': 'comic'})
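
A small sketch of the situation described above. The markup is hypothetical (it is not taken from xkcd) and deliberately repeats an id on two different tags to show which one each call returns.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: ids should be unique, but real pages sometimes repeat them.
html = '''
<span id="comic">not the comic</span>
<div id="comic"><img src="//example.com/comic.png"/></div>
'''

soup = BeautifulSoup(html, 'html.parser')

# Matching on id alone returns the first tag in document order, here the span.
first_match = soup.find(id='comic')

# Adding the tag name narrows the search to div elements with that id.
div_match = soup.find('div', {'id': 'comic'})

print(first_match.name)
print(div_match.name)
print(div_match.find('img').get('src'))
```

With the span matched first, a later .find('img') on it would return None, so naming the tag makes the lookup less fragile.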


RE: Trouble selecting attribute with beautiful soup - bananatoast - Jan-30-2022

(Jan-29-2022, 08:03 PM)Larz60+ Wrote: FYI
Searching by id alone may get you in trouble if more than one tag has the same id.
It's better to use: comic_container = soup.find('div', {'id': 'comic'})

Ok Thanks for the tip :)