Dec-29-2017, 07:50 PM
Hello, I am new to coding and have tried to make a web crawler that retrieves the URLs from a web page of my choosing. However, when I run the program it does not print any URLs, and it does not report any error either. If someone could take a look at my code and tell me what I have done wrong, your help would be greatly appreciated. (Note: the code is being run in the latest version of the PyCharm IDE, if that is relevant.)
import requests
from bs4 import BeautifulSoup

def spider(max_pages):
    page = 1
    while page <= max_pages:
        url = "https://www.google.com/search?q=making+a+clock+in+python&ie=utf-8&oe=utf-8&client=firefox-b-1" + str(page)  #url here
        source_code = requests.geturl
        plain_text = source_code.txt
        soup = BeautifulSoup(plain_text)
        for link in soup.findAll("a", {"class": "rc"}):
            print(href)

spider(1)
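For comparison, here is a minimal sketch of how the same crawler might look with the calls corrected. It is only one possible fix and rests on a few assumptions: `requests.get(url)` is called as a function (the `requests` module has no `geturl` attribute), the response body is read from `.text` rather than `.txt`, each link's address is read with `link.get("href")` instead of the undefined name `href`, and `page` is incremented so the loop can end. The `{"class": "rc"}` filter is dropped on the assumption that the `<a>` tags on the target page do not carry that class; when no tag matches, the loop body never runs and nothing is printed. The helper name `extract_links` and the `base_url` parameter are hypothetical, added only for illustration.

```python
import requests
from bs4 import BeautifulSoup


def extract_links(html):
    """Return the href of every <a> tag found in the given HTML text."""
    # Naming the parser explicitly avoids BeautifulSoup's "no parser specified" warning.
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for link in soup.find_all("a"):
        href = link.get("href")  # None when the tag has no href attribute
        if href:
            links.append(href)
    return links


def spider(base_url, max_pages):
    """Fetch base_url + page number for pages 1..max_pages and print each link."""
    page = 1
    while page <= max_pages:
        url = base_url + str(page)
        response = requests.get(url, timeout=10)   # call get() with the URL
        for href in extract_links(response.text):  # the attribute is .text
            print(href)
        page += 1  # without this increment the loop never finishes
```

Calling something like `spider("https://example.com/?page=", 1)` (a placeholder address, not a real target) would fetch one page and print whatever links it contains. Keeping the parsing in its own function also makes it possible to try it on a short HTML string before pointing the crawler at a live site.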