Python Forum

Full Version: 403 Forbidden Error
How to avoid this error while crawling any website?

from bs4 import BeautifulSoup
import requests

source = requests.get("https://www.hltv.org/")
print(source.status_code)
Output:
403
(Jun-20-2020, 06:19 AM) Evil_Patrick wrote: How to avoid this error while crawling any website?
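Sending a browser-like User-Agent header is one way. By default Requests identifies itself as python-requests/<version>, which some sites block.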
from bs4 import BeautifulSoup
import requests

user_agent = {'User-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36'}
source = requests.get("https://www.hltv.org/", headers=user_agent)
print(source.status_code)
Output:
200
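If you plan to make more than one request, a requests.Session can carry the header for you. This is only a minimal sketch: it reuses the User-Agent string from the code above, and build_session and fetch_status are hypothetical helper names, not part of Requests. A site with stricter bot checks may still answer 403.

```python
import requests

# Same browser-like User-Agent string as in the example above.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"
)


def build_session(user_agent=BROWSER_UA):
    """Return a Session that sends the given User-Agent on every request."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def fetch_status(url, session=None, timeout=10):
    """GET the URL and return the HTTP status code (hypothetical helper)."""
    session = session or build_session()
    response = session.get(url, timeout=timeout)
    return response.status_code


# Usage (performs a real network request):
#   print(fetch_status("https://www.hltv.org/"))
```

The timeout is there so a slow server does not hang the script; 10 seconds is an arbitrary choice.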
Sites like this one rely heavily on JavaScript, so Selenium may be needed to get the result without too much work.
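A rough sketch of the Selenium route, assuming Selenium 4 and a local Chrome install. fetch_rendered_html is a hypothetical helper name, and content that loads late may still need an explicit wait before you read the page.

```python
def fetch_rendered_html(url):
    """Load the URL in headless Chrome and return the current page source."""
    # Imported here so the module still loads when Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")  # run Chrome without opening a window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)            # blocks until the page load event fires
        return driver.page_source  # HTML of the DOM at this moment
    finally:
        driver.quit()              # always close the browser


# Usage (starts a real browser and makes a network request):
#   html = fetch_rendered_html("https://www.hltv.org/")
```

The returned HTML can then be handed to BeautifulSoup(html, "html.parser") for parsing, as with a normal Requests response.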