Python Forum
BeautifulSoup can't get page title
#1
Hi,
I can't get the title of the following page:
http://cheirabem.com/mostraperfumes.aspx...Balenciaga
Here is my code:
from bs4 import BeautifulSoup
import urllib.request
page3 = urllib.request.urlopen("http://cheirabem.com/mostraperfumes.aspx?marca=Balenciaga").read()
soup3 = BeautifulSoup(page3, "lxml")
titulo=soup3.findAll(attrs={"name":"title"})
print (titulo[0]['content'])
Any help with this issue?
Thank you
#2
from bs4 import BeautifulSoup
import requests

url = 'http://cheirabem.com/mostraperfumes.aspx?marca=Balenciaga'
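# Fetch the page; .content holds the raw response bytes
url_get = requests.get(url)
soup = BeautifulSoup(url_get.content, 'lxml')
# Read the text of the <title> tag directly
print(soup.find('title').text.strip())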
Output:
Balenciaga loja online perfumes, os seus perfumes mais baratos
So use Requests rather than urllib. There is more info in the tutorial here.
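If it helps, here is a minimal sketch of a more defensive lookup that works on an HTML string, so it can be tried without hitting the site. The function name extract_title and the sample HTML are made up for illustration, and it uses the built-in "html.parser" instead of "lxml" only to avoid the extra dependency. It tries the <title> tag first and falls back to a meta tag with name="title" (what the code in #1 was searching for), returning None rather than raising when neither exists.

```python
from bs4 import BeautifulSoup


def extract_title(html):
    """Return the page title, or None if no title can be found.

    Hypothetical helper for illustration; not part of bs4.
    """
    soup = BeautifulSoup(html, "html.parser")

    # Prefer the <title> element, as in the answer above.
    if soup.title is not None and soup.title.string:
        return soup.title.string.strip()

    # Fall back to <meta name="title" content="...">.
    # find() returns None when nothing matches, so nothing is indexed blindly.
    meta = soup.find("meta", attrs={"name": "title"})
    if meta is not None and meta.get("content"):
        return meta["content"].strip()

    return None


# Made-up sample page, not the real site.
sample = "<html><head><title> Example page </title></head><body></body></html>"
print(extract_title(sample))
```

With a live page you would pass the body of the response from requests.get(url) to the same function.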
#3
Thank you. Problem solved.