(Apr-12-2021, 03:06 PM)jimsxxl Wrote: So basically requests_html is the same as Selenium with the headless option (as far as getting the HTML code)?

Resource-wise it will be the same, as requests_html uses pyppeteer (headless) Chrome/Chromium browser automation.
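To make the comparison concrete, here is a rough sketch of the requests_html route. The function name fetch_rendered_title and the 2 second sleep are my own choices for illustration, and the selector is borrowed from the Selenium example in this post. Note that the first call to render() downloads Chromium, which is why the resource use ends up similar to headless Selenium.

```python
def fetch_rendered_title(url, selector="#video-title"):
    """Return the text of the first element matching selector after JS rendering.

    Hypothetical helper for illustration; returns None if nothing matches.
    """
    # Imported inside the function so the helper can be defined
    # even where requests_html is not installed.
    from requests_html import HTMLSession

    session = HTMLSession()
    try:
        response = session.get(url)
        # render() runs the page's JavaScript in headless Chromium via pyppeteer.
        # sleep=2 is an assumed pause to give the page time to fill in content.
        response.html.render(sleep=2)
        element = response.html.find(selector, first=True)
        return element.text if element is not None else None
    finally:
        # Closes the HTTP session and the Chromium instance if one was started.
        session.close()
```

Calling fetch_rendered_title("https://www.youtube.com/channel/UCwTrHPEglCkDz54iSg9ss9Q/videos") should then behave much like the Selenium version, though I have not timed the two against each other.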
(Apr-12-2021, 03:06 PM)jimsxxl Wrote: If I choose to use Selenium this time, would BeautifulSoup be unnecessary then?
I wanted to learn BS4 in this project, would it be foolish to combine Selenium and BS4?

It's fine to send browser.page_source to BS4 and then do the parsing with BS4.
Example:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
import time

#--| Setup
options = Options()
options.add_argument("--headless")
#options.add_argument("--window-size=1980,1020")
browser = webdriver.Chrome(executable_path=r'C:\cmder\bin\chromedriver.exe', options=options)

#--| Parse or automation
url = "https://www.youtube.com/channel/UCwTrHPEglCkDz54iSg9ss9Q/videos"
browser.get(url)

# Send to BS
soup = BeautifulSoup(browser.page_source, 'lxml')
title = soup.select_one('#video-title')
print(title.text)
Output:
WE FISH THE SAME SPOT FOR 12 HOURS - Amazing Results!! | Team Galant
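If you want every title and not just the first one, the parsing step can be sketched on its own like this. The page_source string here is made-up markup standing in for browser.page_source (it is not real YouTube HTML), extract_titles is a hypothetical helper name, and I use the built-in 'html.parser' so the snippet runs without lxml installed.

```python
from bs4 import BeautifulSoup

# Stand-in for browser.page_source (hypothetical markup, not real YouTube HTML).
page_source = """
<div id="contents">
  <a id="video-title" href="/watch?v=aaa">First video</a>
  <a id="video-title" href="/watch?v=bbb">Second video</a>
</div>
"""


def extract_titles(html):
    """Return (title, href) pairs for every element matching #video-title."""
    soup = BeautifulSoup(html, "html.parser")
    # select() returns a list of all matches, so an empty page gives [].
    return [
        (tag.get_text(strip=True), tag.get("href"))
        for tag in soup.select("#video-title")
    ]


for text, href in extract_titles(page_source):
    print(text, href)
```

Using select() also sidesteps a problem with select_one(): it returns None when nothing matches, so title.text would raise AttributeError on a page that has not finished loading.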