There are several problems here, so this is not even close to working.
Before writing more code, you must test that what you get back is actually usable.
print() always works as a fast test, or here I use pprint(), which makes it easier to look at the content.

    import requests
    import bs4 as bs
    import urllib.request
    from pprint import pprint

    url = 'http://legacy.lib.utexas.edu/maps/topo/indiana/'
    # This urllib opener has no effect on requests.get() below
    opener = urllib.request.build_opener()
    opener.add_headers = [{'User-Agent' : 'Mozilla'}]
    urllib.request.install_opener(opener)
    raw = requests.get(url).text
    soup = bs.BeautifulSoup(raw, 'html.parser')
    imgs = soup.find_all('img')
    # Quick check of what was actually found
    pprint(imgs)
Output:
[<img alt="The University of Texas" src="/images/globalHeaderFooter/university_seal_informal.png"/>,
<img alt="The University of Texas" src="/images/globalHeaderFooter/UT_Libraries_RGB_inf_brand_b2-ac.svg"/>,
<img alt="" height="3" src="http://legacy.lib.utexas.edu/graphics/orange.gif" width="5"/>,
<img alt="" height="3" src="http://legacy.lib.utexas.edu/graphics/orange.gif" width="5"/>,
<img alt="" height="3" src="http://legacy.lib.utexas.edu/graphics/orange.gif" width="5"/>,
<img alt="" height="3" src="http://legacy.lib.utexas.edu/graphics/orange.gif" width="5"/>,
<img alt="" height="3" src="http://legacy.lib.utexas.edu/graphics/orange.gif" width="5"/>,
..... etc
As you can see, these are not the image links of the maps that you want. So I can help write the start, as this is not usable.
I am going to throw away urllib, as that should not be used anyway.

    import requests
    from bs4 import BeautifulSoup

    url = 'http://legacy.lib.utexas.edu/maps/topo/indiana/'
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    # The map links are in the <ul> directly under #actualcontent
    maps = soup.select_one('#actualcontent > ul')
    map_link = maps.find_all('a')
    for link in map_link:
        print(link.get('href'))
Output:
http://legacy.lib.utexas.edu/maps/topo/indexes/txu-pclmaps-topo-in-index-1925.jpg
http://legacy.lib.utexas.edu/maps/topo/indiana/txu-pclmaps-topo-in-bedford-1934.jpg
http://legacy.lib.utexas.edu/maps/topo/illinois/txu-pclmaps-topo-il-birds-1914.jpg
http://legacy.lib.utexas.edu/maps/topo/indiana/txu-pclmaps-topo-in-bloomington-1908.jpg
..... etc
So now you can try to figure out how to download these image links, and you do not need to import urllib for this.
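
As a hint for that next step, here is one possible sketch, not the only way to do it. The names filename_from_url and download_images, the 'maps' folder and the get parameter are my own assumptions for illustration; get is only there so a stand-in can be passed when testing without hitting the site. It assumes each link is a full URL ending in a file name, as in the output above.

```python
from pathlib import Path


def filename_from_url(url):
    """Return the last path segment of a URL, e.g. 'map.jpg'."""
    return url.rstrip('/').rsplit('/', 1)[-1]


def download_images(links, dest_dir='maps', get=None):
    """Save each URL in `links` into `dest_dir` and return the saved paths."""
    if get is None:
        # Imported here so a stand-in `get` can be passed without requests
        import requests
        get = requests.get
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in links:
        response = get(url)
        # Stop on 404 and similar instead of saving an error page as .jpg
        response.raise_for_status()
        path = dest / filename_from_url(url)
        # Images are binary, so write response.content (bytes), not .text
        path.write_bytes(response.content)
        saved.append(path)
    return saved
```

With the map_link list from the code above, it could be called as download_images([link.get('href') for link in map_link]).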