Jul-11-2020, 02:04 PM
(Jul-11-2020, 11:52 AM)j.crater Wrote: Thank you both for your answers.
@HarleyQuin
The code I ran months ago was the same as what I posted here, but the result was not. As stated, on my first attempt I got all the HTML contents, while this time I didn't. Also, swapping the parser for the lxml parser didn't make a difference. Do you have any idea, from experience, why there would be such a difference?
Hey again,
From experience, I have noticed that sending no User-Agent header makes it very easy for YouTube to identify you as a web scraper straight away and to handle your request differently from how it would treat a conventional user. That is something that made a difference when I first started scraping.
e.g. I use this in my code:
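import requests

# Browser-like headers so the request does not announce itself as a script.
# Content-Type is not needed for a plain GET, but it is harmless.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36',
    'Content-Type': 'application/x-www-form-urlencoded',
}

# whatsmyua.info echoes back the User-Agent it received.
url = "https://whatsmyua.info/"
webpage = requests.get(url, headers=headers).text
print(webpage)

Sorry if I have been of no use!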
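One more thing that might help: if you want to see what requests sends when you don't set a header, here is a rough sketch. It only builds the request and never sends it, so it runs offline. The names BROWSER_UA and make_session are just ones I made up for illustration, and I can't promise this is the reason YouTube gave you different HTML, only that the default User-Agent is easy to spot.

```python
import requests

# Browser-like User-Agent string, same one as in my snippet above.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36"
)


def make_session(user_agent: str = BROWSER_UA) -> requests.Session:
    """Hypothetical helper: a Session that sends `user_agent` on every request."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


# What requests identifies itself as when you set nothing.
default_ua = requests.utils.default_user_agent()

# Prepare (but do not send) a GET to inspect the headers that would go out.
session = make_session()
prepared = session.prepare_request(
    requests.Request("GET", "https://whatsmyua.info/")
)

print("default User-Agent:", default_ua)
print("User-Agent we send:", prepared.headers["User-Agent"])
```

Using a Session also means the header is set once and reused for every later session.get(...), so you can't forget it on one of the requests.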
I hope you solve your issue, buddy,
Regards,
Harley