 403 Forbidden Error
#1
How to avoid this error while crawling any website?


from bs4 import BeautifulSoup
import requests

source = requests.get("https://www.hltv.org/")
print(source.status_code)

Output:
403
#2
(Jun-20-2020, 06:19 AM)Evil_Patrick Wrote: How to avoid this error while crawling any website?
Sending a browser User-Agent header is one way. By default Requests identifies itself as python-requests, which some sites reject.
from bs4 import BeautifulSoup
import requests

user_agent = {'User-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36'}
source = requests.get("https://www.hltv.org/", headers=user_agent)
print(source.status_code)
Output:
200
Sites like this rely heavily on JavaScript, so Selenium may be needed to get results without too much work.
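If you request more than one page, the header approach above can be set once on a requests.Session instead of being passed to every call. This is only a rough sketch: make_session and fetch are helper names made up for illustration, the Accept-Language header is an extra assumption that the site may not need, and a site can still answer 403 if it checks more than the User-Agent.

```python
import requests

# User-Agent copied from the reply above. Accept-Language is an extra,
# assumed header that makes the request look more like a real browser.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}


def make_session():
    """Return a Session that sends the browser-like headers on every request."""
    session = requests.Session()
    session.headers.update(BROWSER_HEADERS)
    return session


def fetch(url, session=None, timeout=10):
    """GET url; raise requests.HTTPError on any 4xx/5xx status, including 403."""
    session = session or make_session()
    response = session.get(url, timeout=timeout)
    response.raise_for_status()
    return response
```

Calling fetch("https://www.hltv.org/") returns the response object, or raises requests.HTTPError if the site still blocks the request, so a 403 cannot go unnoticed the way it can when only status_code is printed.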