Hi. I'm using Python 3.10 on Windows.
I have a script like this:
src_url = get_url_by_pagenumber(page)
res = requests.get(src_url)
soup = BeautifulSoup(res.text,'lxml')
internal_pages = soup.select('h4.title a')
records = []
for internal_page in internal_pages:
    page_url = get_url_by_href(internal_page['href'])
    s.title = sanitize_filename(internal_page['title'])
    headers = { 'user-agent':'*user used*' }
    res2 = requests.get(page_url,headers=headers)
    soup2 = BeautifulSoup(res2.text,'lxml')
    try:
        s.title = sanitize_filename(internal_page['title'])
I didn't attach the whole script, just the relevant part.
What I want this script to do is take the URL of a homepage, visit all of its internal pages, and extract the title of each page.
The problem is that if I run the script as it is, I get a KeyError for the title. This is the full error:
Error:
Traceback (most recent call last):
  File "C:\Users\Administrator\Desktop\script.py", line 175, in <module>
    s.title = internal_page['title']
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\bs4\element.py", line 1486, in __getitem__
    return self.attrs[key]
KeyError: 'title'
If I replace internal_page['title'] with internal_page['href'], the whole script works fine. Any idea why title gives me this error?
EDIT: I forgot to add that the title is present in the HTML, but it sits below all the meta property tags and not at the top of the page as usual.
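To make the symptom easier to reproduce, here is a minimal sketch of what I think is going on. It is only a guess under assumptions: both HTML snippets are made up (my real pages are not shown here), and it uses the built-in "html.parser" instead of lxml so it runs without extra installs. As far as I understand, internal_page['title'] looks up a title attribute on the a tag from the listing page, which is separate from the title element inside the internal page itself.

```python
from bs4 import BeautifulSoup

# Hypothetical listing page: the link has an href but no title attribute.
listing_html = '<h4 class="title"><a href="/page-1">Page one</a></h4>'

# Hypothetical internal page: <title> comes after the meta property tags.
page_html = (
    '<html><head>'
    '<meta property="og:type" content="article">'
    '<title>Page one - Example</title>'
    '</head><body></body></html>'
)

link = BeautifulSoup(listing_html, "html.parser").select_one("h4.title a")

# link['title'] would raise KeyError here because the attribute is missing;
# .get() returns None instead of raising.
attr_title = link.get("title")

# The visible link text is still available as a fallback.
link_text = link.get_text(strip=True)

# The <title> element lives in the internal page's own soup and is found
# regardless of where it sits inside <head>.
page_soup = BeautifulSoup(page_html, "html.parser")
title_tag = page_soup.title
page_title = title_tag.get_text(strip=True) if title_tag else None

print(attr_title, link_text, page_title)
```

If this guess is right, then in my script the page title would have to come from soup2 (the internal page) rather than from internal_page (the link), but I have not confirmed that against the real pages.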