Python Forum
Problem with scraping the Title from a web page
Hi. I'm using Python 3.10 on Windows.

I have a script like this:


    src_url = get_url_by_pagenumber(page)
    res = requests.get(src_url)
    soup = BeautifulSoup(res.text,'lxml')
    internal_pages = soup.select('h4.title a')
    records = []
    for internal_page in internal_pages:
      page_url = get_url_by_href(internal_page['href'])
      s.title = sanitize_filename(internal_page['title'])
      headers = {
        'user-agent':'*user used*'
      }
      res2 = requests.get(page_url,headers=headers)
      soup2 = BeautifulSoup(res2.text,'lxml')
      try:
        s.title = sanitize_filename(internal_page['title'])
I didn't attach the whole script, just the relevant part.
What I want to do with this script is enter the URL of a homepage and have the script visit all the internal pages and extract the title of each page.
The problem is that if I run the script as it is, I get a KeyError for 'title'. This is the full error:

Error:
Traceback (most recent call last):
  File "C:\Users\Administrator\Desktop\script.py", line 175, in <module>
    s.title = internal_page['title']
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\bs4\element.py", line 1486, in __getitem__
    return self.attrs[key]
KeyError: 'title'
If I replace internal_page['title'] with internal_page['href'], the whole script works fine. Any idea why 'title' gives me this error?

EDIT: I forgot to add that the title is present in the HTML, but it sits below all the meta property tags rather than at the top of the page as usual.
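For reference, the traceback ends in `return self.attrs[key]`, so `internal_page['title']` is looking up a `title` attribute on the `<a>` tag. It is not reading the page's `<title>` element. A minimal sketch of the difference is below. The HTML snippets and the helper names `link_title` and `page_title` are made up for illustration, and `html.parser` is used instead of `lxml` only so the example has one fewer dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical listing page: the link has an href but no title attribute.
LISTING_HTML = '<h4 class="title"><a href="/page-1">First page</a></h4>'

# Hypothetical internal page: <title> appears after the meta property tags.
PAGE_HTML = """
<html><head>
<meta property="og:type" content="article">
<meta property="og:url" content="https://example.com/page-1">
<title>First page | Example</title>
</head><body></body></html>
"""


def link_title(link):
    """Return the link's title attribute, or its visible text if absent."""
    # .get() returns None instead of raising KeyError for a missing attribute.
    return link.get('title') or link.get_text(strip=True)


def page_title(html):
    """Return the text of the page's <title> element, or None if missing."""
    soup = BeautifulSoup(html, 'html.parser')
    if soup.title is None:
        return None
    return soup.title.get_text(strip=True)


link = BeautifulSoup(LISTING_HTML, 'html.parser').select_one('h4.title a')
print(link['href'])           # the attribute exists, so subscripting works
print(link.get('title'))      # None, where link['title'] would raise KeyError
print(link_title(link))       # falls back to the link text
print(page_title(PAGE_HTML))  # <title> is found wherever it sits in <head>
```

If this matches the real pages, the title would probably be read from `soup2` (the parsed internal page) rather than from `internal_page`, but that is a guess without seeing the actual HTML.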