Aug-15-2018, 12:22 AM
module...:
#! python 3
# I wonder who the five most popular mathematicians are?

from requests import get
from requests.exceptions import RequestException
from contextlib import closing
from bs4 import BeautifulSoup


def simple_get(url):
    """
    Attempts to get the content at `url` by making an HTTP GET request.
    If the content-type of response is some kind of HTML/XML, return the
    text content, otherwise return None.
    """
    try:
        with closing(get(url, stream=True)) as resp:
            if is_good_response(resp):
                return resp.content
            else:
                return None

    except RequestException as e:
        log_error('Error during requests to {0} : {1}'.format(url, str(e)))
        return None


def is_good_response(resp):
    """
    Returns True if the response seems to be HTML, False otherwise.
    """
    content_type = resp.headers['Content-Type'].lower()
    return (resp.status_code == 200
            and content_type is not None
            and content_type.find('html') > -1)


def log_error(e):
    """
    It is always a good idea to log errors.
    This function just prints them, but you can
    make it do anything.
    """
    print(e)


def get_names():
    """
    Downloads the page where the list of mathematicians is found
    and returns a list of strings, one per mathematician
    """
    url = 'http://www.fabpedigree.com/james/mathmen.htm'
    response = simple_get(url)

    if response is not None:
        html = BeautifulSoup(response, 'html.parser')
        names = set()  # set ensures that you don’t end up with duplicate names.
        for li in html.select('li'):
            for name in li.text.split('\n'):
                if len(name) > 0:
                    names.add(name.strip())
        return list(names)

    # Raise an exception if we failed to get any data from the url
    raise Exception(f'Error retrieving contents at {format(url)}')

...used to run program...:
from bs4 import BeautifulSoup
from mathematicians import get_names

raw_html = get_names('http://www.fabpedigree.com/james/mathmen.htm')
html = BeautifulSoup(raw_html, 'html.parser')
for i, li in enumerate(html.select('li')):
    print(i, li.text)

...but error appears:
Error:
Traceback (most recent call last):
  File "C:\Python37\kodovi\mathlist1.py", line 4, in <module>
    raw_html = get_names('http://www.fabpedigree.com/james/mathmen.htm')
TypeError: get_names() takes 0 positional arguments but 1 was given
I tried adding a parameter to the definition of get_names() in the module, but then a different error appears. I don't know how to fix the code so that the program runs. The desired outcome is a list of strings, one per mathematician.
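
For reference, here is a minimal sketch of what the traceback seems to point at. It assumes get_names() keeps the zero-argument signature shown in the module and, as its docstring says, already returns a list of strings rather than raw HTML. The get_names() below is only a hypothetical stand-in with made-up sample data so the snippet runs without network access; it is not the real function from mathematicians.py.

```python
# Hypothetical stand-in for mathematicians.get_names(): like the real
# function it takes no arguments and returns a list of strings.
# The names below are sample data for illustration only.
def get_names():
    return ['Isaac Newton', 'Archimedes', 'Carl F. Gauss']


# Passing the URL raises the same kind of TypeError as in the traceback,
# because the function is defined without parameters.
try:
    get_names('http://www.fabpedigree.com/james/mathmen.htm')
except TypeError as e:
    error_message = str(e)
    print(error_message)

# Calling it with no arguments returns the finished list, so the result
# can be enumerated directly and does not need to be parsed again.
names = get_names()
for i, name in enumerate(names):
    print(i, name)
```

If this reading is right, the calling script would not need its own BeautifulSoup step: the URL and the HTML parsing already live inside get_names(), and the script would only loop over the returned list.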