Web scraping: os.path.basename

Truman · (This post was last modified: Aug-23-2018, 11:22 PM by Truman.)

Thank you. Now I'm trying a more advanced download from a number of pages...

import requests 
from bs4 import BeautifulSoup
import os
import webbrowser 
browser_path = r"C:\Program Files (x86)\Mozilla Firefox\firefox.exe"
webbrowser.register('mozzila', None, webbrowser.BackgroundBrowser(browser_path))

def image_down(start_img, stop_img):
	for numb in range(start_img, stop_img):
		url = f'http://xkcd.com/{numb}'
		url_get = requests.get(url)
		soup = BeautifulSoup(url_get.content, 'html.parser')
		link = soup.find('div', id='comic').find('img').get('src')
		link = link.replace('//', 'http://')
		img_name = os.path.basename(link)
		webbrowser.get('mozzila').open_new_tab(img_name)
		#try:
			#img = requests.get(link)
			#with open(img_name, 'wb') as f_out:
				#f_out.write(img.content)
		#except:
			# Just want images don't care about errors
			#pass
			
if __name__ == '__main__':
	start_img = 1
	stop_img = 5
	image_down(start_img, stop_img)

It opens only the first image in the first tab; the other 3 tabs say that the server was not found.

Solved it. I just changed line 16 to

webbrowser.get('mozzila').open_new_tab(link)

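OK, now it's all clear. os.path.basename(link) keeps only the file name, which is what you want for saving the file, but it is not a URL the browser can load.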
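For anyone who hits the same thing, here is a minimal sketch of the difference between the two values. It is not the code from the post: the helper name split_image_link and the sample file name are made up for illustration, and it uses urllib.parse.urljoin instead of link.replace('//', 'http://') so that links which already start with http:// are left alone.

```python
import os
from urllib.parse import urljoin


def split_image_link(page_url, src):
    """Return (full_url, file_name) for an <img src> value found on page_url.

    Hypothetical helper, for illustration only.
    """
    # urljoin resolves protocol-relative ('//host/path') and relative links
    # against the page URL, and returns absolute URLs unchanged.
    full_url = urljoin(page_url, src)
    # basename keeps only the part after the last slash: a local file name.
    file_name = os.path.basename(full_url)
    return full_url, file_name


# Sample values in the shape xkcd uses; the file name itself is an assumption.
full_url, file_name = split_image_link(
    'http://xkcd.com/1/', '//imgs.xkcd.com/comics/example.png')
print(full_url)   # pass this to open_new_tab() or requests.get()
print(file_name)  # use this only as the name of the saved file
```

So the full URL goes to the browser (or to requests.get), and the basename is only for naming the file you write to disk.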
