(Apr-11-2021, 01:38 PM)jimsxxl Wrote: Hello guys,
I'm messing around a bit with bs4, and I'm trying to parse some data from YouTube as a "learning project".
What I'm finding difficult to understand is, when searching for an element to parse (for example the video title)...
what should I be looking at? What is the key to getting the video title extracted from the HTML code?
How should I think when I inspect an object in my browser?
Which piece of code am I interested in?
Thank you in advance!
(Apr-12-2021, 10:51 AM)snippsat Wrote: If it works with requests_html, then that's okay.
I have only tested requests_html briefly (one problem is that its GitHub repo is not updated regularly). You can also use Selenium and load the browser with the --headless option.
requests_html uses pyppeteer, which is headless by default.
Sometimes it is useful to see the browser before going headless, for example to check whether a button gets pushed or text gets entered into a field;
then Selenium can be the better choice.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

#--| Setup
options = Options()
options.add_argument("--headless")
#options.add_argument("--window-size=1980,1020")
browser = webdriver.Chrome(executable_path=r'C:\cmder\bin\chromedriver.exe', options=options)
#--| Parse or automation
url = "https://www.youtube.com/channel/UCwTrHPEglCkDz54iSg9ss9Q/videos"
browser.get(url)
title = browser.find_elements_by_css_selector('#text-container')[0]
print(title.text)
Output:
kanalgratisdotse
The fastest way is using the YouTube API.
import requests

channel_id = 'UCwTrHPEglCkDz54iSg9ss9Q'
api_key = 'xxxxxxxxxxxxxxxxxxx'
url = f'https://www.googleapis.com/youtube/v3/channels?id={channel_id}&part=snippet&key={api_key}'
response = requests.get(url).json()
print(response['items'][0]['snippet']['title'])
Output:
kanalgratisdotse
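The lookup response['items'][0]['snippet']['title'] raises an exception if the reply has no items, for example with a wrong channel id or a rejected key. Here is a minimal sketch of a more forgiving lookup. The helper name channel_title and both sample payloads are made up for illustration, and the error payload shape is only an assumption.

```python
def channel_title(payload):
    """Return the first channel title from a channels-style payload, or None."""
    # Treat a missing or empty 'items' list as "no channel found".
    items = payload.get("items") or []
    if not items:
        return None
    # Each level may be absent, so fall back step by step instead of indexing.
    return items[0].get("snippet", {}).get("title")


# Hypothetical payloads standing in for requests.get(url).json()
ok_payload = {"items": [{"snippet": {"title": "kanalgratisdotse"}}]}
error_payload = {"error": {"code": 403}}  # assumed shape, for illustration only

print(channel_title(ok_payload))
print(channel_title(error_payload))
```

In the script above it would be called as channel_title(response) in place of the chained indexing.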
Hi again snippsat!
Yeah, I've tried the --headless option in Selenium.
So basically requests_html is the same as Selenium with the headless option (as far as getting the HTML code goes)?
I thought requests_html was "lighter" than Selenium for some reason, that's why I chose it.
If I choose to use Selenium this time, would BeautifulSoup be unnecessary then?
I wanted to learn BS4 in this project, would it be foolish to combine Selenium and BS4?
Thanks a lot for your replies, snippsat!
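On the question of combining Selenium and BS4, here is a minimal sketch, assuming Selenium is used only to load and render the page and BeautifulSoup then does the parsing. The '#text-container' selector is taken from the Selenium example earlier in the thread. The helper name channel_name_from_html and the sample markup are invented for illustration and are not YouTube's real HTML.

```python
from bs4 import BeautifulSoup


def channel_name_from_html(html):
    """Return the text of the first '#text-container' element, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # select_one takes a CSS selector, the same kind of selector that
    # the browser's inspector shows for an element.
    node = soup.select_one("#text-container")
    return node.get_text(strip=True) if node else None


# Stand-in for the rendered page; this markup is hypothetical.
sample_html = '<div id="text-container"><span>kanalgratisdotse</span></div>'
print(channel_name_from_html(sample_html))
```

With Selenium, the rendered HTML is available as browser.page_source after browser.get(url), so the call would be channel_name_from_html(browser.page_source).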