Thread: get function returns None from Beautifulsoup object (/thread-20266.html)
Python Forum (https://python-forum.io) - Forum: Web Scraping & Web Development (https://python-forum.io/forum-13.html)

get function returns None from Beautifulsoup object - DeanAseraf1 - Aug-02-2019

I'm trying to list all titles from a specific Wikipedia page. For some reason, when I apply the .get function on the BeautifulSoup object to get all the 'id's, it returns None.

This is my code:

    import requests
    from bs4 import BeautifulSoup


    def spider(max_pages):
        page = 1
        while page <= max_pages:
            main_page = 'https://wikipedia.org/wiki/'
            search = input("Enter your search: ")
            page_to_search = main_page + str(search)
            source_code = requests.get(page_to_search)
            plain_text = source_code.text
            soup = BeautifulSoup(plain_text, features="html.parser")
            for title in soup.findAll('h2'):
                print(title)
                ids = link.get('id')
                print(ids)
            page += 1


    spider(1)

This is the output:

I've tried to get the links instead and it works the same way. I also searched online and tried changing the code a little, and it doesn't seem to change anything.

Python 3.7
Windows 10
PyCharm Community Edition 2019.2
BeautifulSoup4
requests

RE: get function returns None from Beautifulsoup object - Larz60+ - Aug-03-2019

As shown, this code will not run: link is not defined.
What ids are you looking for, and in what tags? The only element that you are finding is the title.

RE: get function returns None from Beautifulsoup object - DeanAseraf1 - Aug-03-2019

I'm trying to make a program where you can input what you want to search for on Wikipedia and it prints out all of the titles on the page.
The ids are the names of the titles in the Wikipedia source, in the class called "mw-headline".
I also tried this instead:

    for title in soup.findAll('h2', {"class": "mw-headline"}):
        print(title)
        ids = title.get('id')
        print(ids)

Example of what I want:

RE: get function returns None from Beautifulsoup object - DeanAseraf1 - Aug-03-2019

I managed to fix the code. Here is the new one:

    import requests
    from bs4 import BeautifulSoup


    def spider(max_pages):
        page = 1
        while page <= max_pages:
            main_page = 'https://wikipedia.org/wiki/'
            search = input("Enter your search: ")
            page_to_search = main_page + str(search)
            source_code = requests.get(page_to_search)
            plain_text = source_code.text
            soup = BeautifulSoup(plain_text, features="html.parser")
            for menu in soup.findAll('span', class_='toctext'):
                print(menu.text)


    spider(1)
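
The thread never spells out why `.get('id')` printed None, so here is a small offline sketch of the likely cause. It assumes the headings look like the Wikipedia markup of that period, where the `id` and the `mw-headline` class sit on a `<span>` nested inside each `<h2>` rather than on the `<h2>` itself. `SAMPLE_HTML`, `heading_ids`, and `toc_titles` are made up for illustration, and the live page markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for a Wikipedia article (an assumption,
# not the real page source): the id lives on the inner span, not on the h2.
SAMPLE_HTML = """
<div id="toc">
  <ul>
    <li><a href="#History"><span class="tocnumber">1</span>
        <span class="toctext">History</span></a></li>
    <li><a href="#Usage"><span class="tocnumber">2</span>
        <span class="toctext">Usage</span></a></li>
  </ul>
</div>
<h2><span class="mw-headline" id="History">History</span></h2>
<h2><span class="mw-headline" id="Usage">Usage</span></h2>
"""


def heading_ids(html):
    """Return the id of the mw-headline span inside each h2."""
    soup = BeautifulSoup(html, features="html.parser")
    ids = []
    for h2 in soup.find_all("h2"):
        span = h2.find("span", class_="mw-headline")
        if span is not None:  # skip any h2 without a headline span
            ids.append(span.get("id"))
    return ids


def toc_titles(html):
    """Return the table-of-contents entries, as in the accepted fix."""
    soup = BeautifulSoup(html, features="html.parser")
    return [span.get_text() for span in soup.find_all("span", class_="toctext")]


soup = BeautifulSoup(SAMPLE_HTML, features="html.parser")

# Tag.get() returns None when the tag itself lacks the attribute.
h2_ids = [h2.get("id") for h2 in soup.find_all("h2")]
print(h2_ids)  # [None, None]

# Filtering h2 tags by a class that belongs to the inner span matches nothing.
h2_with_class = soup.find_all("h2", {"class": "mw-headline"})
print(len(h2_with_class))  # 0

print(heading_ids(SAMPLE_HTML))  # ['History', 'Usage']
print(toc_titles(SAMPLE_HTML))   # ['History', 'Usage']
```

Under that assumption, the first attempt asks the `<h2>` for an attribute it does not have, and the second attempt finds no tags at all, so its loop body never runs. Reading the `toctext` spans, as in the fix above, sidesteps both problems by taking the titles from the table of contents instead.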