Python Forum
Beautifulsoup don't get me the page
The first step is to try a browser user-agent with this site:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36
That does not work; I tested it.
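For reference, sending that user-agent could look roughly like the sketch below. It uses the standard-library urllib as an assumption, since the thread excerpt does not show which HTTP client was used, and the helper names build_request and fetch_html are made up for illustration. As noted above, this approach was not enough for this particular site.

```python
import urllib.request

# User-Agent string quoted in the post above
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36"
)


def build_request(url):
    """Return a Request carrying the browser User-Agent (no network I/O yet)."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_html(url, timeout=10):
    """Download the page; raises urllib.error.HTTPError or URLError on failure."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling fetch_html(url) and passing the result to BeautifulSoup is the quick check to run before reaching for a real browser. If the content is missing or the request is refused, the page most likely needs JavaScript or blocks non-browser clients.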
The next step is to use Selenium.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
import time

#--| Setup
options = Options()
#options.add_argument("--headless")
# executable_path is the Selenium 3 form; Selenium 4 takes service=Service('chromedriver.exe') instead
browser = webdriver.Chrome(executable_path=r'chromedriver.exe', options=options)
#--| Parse or automation
browser.get('https://www.fragrantica.com/perfume/Chanel/Coco-Eau-de-Parfum-609.html')
# Give the page time to finish loading before reading its source
time.sleep(5)
soup = BeautifulSoup(browser.page_source, 'lxml')
# CSS selector for the perfume title
parfum = soup.select('#col1 > div > div > h1 > span')
Now it works. Here I use a CSS selector to get the perfume title.
The result:
>>> parfum
[<span itemprop="name">Coco Eau de Parfum Chanel for women</span>]
>>> parfum[0].text
'Coco Eau de Parfum Chanel for women'
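
The selector step can be tried without a browser. The sketch below runs the same CSS selector against a small hand-written HTML fragment that is only modelled on the output above; it is not the real page, and the function name title_from_html is hypothetical. It uses select_one and the built-in html.parser, and returns None instead of raising IndexError when nothing matches.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment modelled on the output above, not the real page
html = """
<div id="col1"><div><div>
  <h1><span itemprop="name">Coco Eau de Parfum Chanel for women</span></h1>
</div></div></div>
"""


def title_from_html(page_source):
    """Return the perfume title text, or None if the selector finds nothing."""
    soup = BeautifulSoup(page_source, "html.parser")
    tag = soup.select_one("#col1 > div > div > h1 > span")
    return tag.get_text(strip=True) if tag is not None else None


title = title_from_html(html)
missing = title_from_html("<p>blocked</p>")
```

With Selenium, browser.page_source would be passed in place of the sample string. The None check matters because a blocked or half-loaded page gives an empty match, and parfum[0] would then fail.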

Messages In This Thread
Beautifulsoup don't get me the page - by mariolopes - Oct-20-2019, 05:55 PM
RE: Beautifulsoup don't get me the page - by snippsat - Oct-22-2019, 06:18 PM
