Python Forum

Full Version: Youtube page scraping
Hello all,

The page I would like to scrape is a user's videos page on YouTube. All of the video titles on the page sit under the same XPath, which is: //*[@id='video-title']. I would like my script to build a list of the titles on the loaded page and then print it, but it does not seem to work. Any advice?

from lxml import html
import requests

page = requests.get('https://www.youtube.com/user/numberphile/videos')

tree = html.fromstring(page.content)

# xpath() already returns a list of the matching text nodes,
# so no extra loop is needed to build the list of titles.
titles = tree.xpath("//*[@id='video-title']/text()")

print(titles)
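Turn off JavaScript in your browser and see how many videos you see.
If you had looked at what page.content returns,
you would have seen that there is no id='video-title' in it at all. The titles are added by JavaScript after the page loads, and requests does not run JavaScript.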
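
A quick way to confirm this is to count how many elements with that id exist in the raw HTML before blaming the XPath. The sketch below uses only the standard library for the counting; the helper names (IdCounter, count_ids, count_ids_at) are made up for illustration and are not part of the original script.

```python
from html.parser import HTMLParser


class IdCounter(HTMLParser):
    """Count start tags whose id attribute equals a target value."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag
        if dict(attrs).get("id") == self.target_id:
            self.count += 1


def count_ids(markup, target_id="video-title"):
    """Return how many elements in the markup carry the given id."""
    parser = IdCounter(target_id)
    parser.feed(markup)
    return parser.count


def count_ids_at(url, target_id="video-title"):
    """Fetch a page with requests (no JavaScript is run) and count the id."""
    import requests  # third-party; imported here so count_ids works without it
    return count_ids(requests.get(url, timeout=10).text, target_id)
```

Calling count_ids_at('https://www.youtube.com/user/numberphile/videos') shows what requests actually received. If the count is zero, the XPath has nothing to match, whatever it looks like in the browser's inspector.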

Selenium drives a real browser, so the JavaScript gets run; there is more about it here.
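
A minimal sketch of the Selenium route, assuming Selenium 4 with Chrome and a matching driver available. It reuses the XPath from the question; the function names (clean_titles, scrape_titles) are made up for illustration. YouTube may show a consent page first or change its markup, so treat this as a starting point rather than a finished scraper.

```python
def clean_titles(elements):
    """Return the stripped, non-empty .text of each element."""
    return [e.text.strip() for e in elements if e.text and e.text.strip()]


def scrape_titles(url, timeout=15):
    """Load the page in a real browser and return the rendered video titles."""
    # Selenium is third-party; imported here so clean_titles has no dependencies.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumption: Chrome is installed locally
    try:
        driver.get(url)
        # Wait until JavaScript has added at least one matching element.
        elements = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located(
                (By.XPATH, "//*[@id='video-title']")
            )
        )
        return clean_titles(elements)
    finally:
        driver.quit()  # always close the browser, even on a timeout
```

Usage would be print(scrape_titles('https://www.youtube.com/user/numberphile/videos')). Only the videos loaded so far are returned; the page loads more as you scroll.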
Also look at the YouTube Data API v3 and see if you can get the data that way.
Here is a post about its usage.
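
A rough sketch of the API route, using only the standard library. It assumes you have created an API key in the Google developer console and that the channel can still be looked up by its legacy username. The helper names (api_url, fetch_json, titles_from_playlist_items, upload_titles) are made up for illustration; the channels and playlistItems resources and their parameters are from the Data API v3.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "https://www.googleapis.com/youtube/v3"


def api_url(resource, **params):
    """Build a Data API v3 request URL from a resource name and parameters."""
    return f"{API_ROOT}/{resource}?{urlencode(params)}"


def titles_from_playlist_items(payload):
    """Pull the video titles out of a playlistItems response dict."""
    return [item["snippet"]["title"] for item in payload.get("items", [])]


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def upload_titles(username, api_key):
    """Return up to 50 titles from a channel's uploads playlist."""
    # Step 1: find the id of the channel's "uploads" playlist.
    channel = fetch_json(api_url(
        "channels", part="contentDetails", forUsername=username, key=api_key))
    uploads = channel["items"][0]["contentDetails"]["relatedPlaylists"]["uploads"]
    # Step 2: list the items in that playlist (50 is the per-request maximum).
    page = fetch_json(api_url(
        "playlistItems", part="snippet", playlistId=uploads,
        maxResults=50, key=api_key))
    return titles_from_playlist_items(page)
```

Usage would be print(upload_titles('numberphile', 'YOUR_API_KEY')). To go past the first 50 videos, pass the nextPageToken from each response as pageToken in the next request.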