Apr-11-2021, 05:25 PM
Thanks a lot snippsat for the links... it clarified a couple of things for me!
I'm using requests_html instead of requests because I noticed that requests got stuck on the "Agree to continue" page YouTube has.
In my last Betfair project I fixed that with .click, but I wanted to see if it could be done without Selenium and loading a browser into the program.
Here is the code so far:
I would really appreciate some feedback from experienced Python coders!
from bs4 import BeautifulSoup as bs
from requests_html import HTMLSession

tempfile = "/home/xxx/projects/jims-youtube_scraper/tempvideofile.html"
channels = [
    'https://www.youtube.com/channel/UCwTrHPEglCkDz54iSg9ss9Q/videos'  # KanalGratis
    #'https://www.youtube.com/user/svartzonker/videos'  # Svartzonker
]
title = []
link = []

session = HTMLSession()
for c in channels:
    get_response = session.get(c)
    get_response.html.render(sleep=1)  # render the JavaScript so the video grid is in the HTML
    # Write the rendered page to disk, then read it back for parsing
    with open(tempfile, "w", encoding='utf8') as f:
        f.write(get_response.html.html)
    with open(tempfile, 'r', encoding='utf8') as f:
        soup = bs(f, 'html.parser')
    #name = soup.find('yt', class_='style-scope ytd-channel-name')
    #print(name.get('text'))
    # Each video anchor carries both the title and the href, so one pass collects both
    for t in soup.find_all('a', class_='yt-simple-endpoint style-scope ytd-grid-video-renderer'):
        title.append(t.get('title'))
        link.append(t.get('href'))

for t, l in zip(title, link):
    print("Title:", t, "URL: www.youtube.com" + l)

Please let me know if I could have done it in a better way, or if something looks funny to you.
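One thought I had while writing it (just a side note, not tested against the live page): the temp file round-trip may not be needed at all, since the rendered page is already available as a string and BeautifulSoup accepts strings directly. A minimal sketch of that idea, with a hard-coded snippet standing in for the rendered YouTube response so it runs offline:

```python
from bs4 import BeautifulSoup as bs

# Stand-in for get_response.html.html -- the rendered page as a string.
# Hard-coded here so the sketch runs without hitting YouTube.
rendered_html = (
    '<a class="yt-simple-endpoint style-scope ytd-grid-video-renderer" '
    'title="Example video" href="/watch?v=abc123">Example video</a>'
)

# BeautifulSoup parses the string directly; no temp file needed.
soup = bs(rendered_html, 'html.parser')
videos = [
    (a.get('title'), a.get('href'))
    for a in soup.find_all('a', class_='yt-simple-endpoint style-scope ytd-grid-video-renderer')
]
print(videos)  # [('Example video', '/watch?v=abc123')]
```

In the real script the same thing would mean passing get_response.html.html straight to bs() instead of writing and re-reading tempfile.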