Python Forum
Help with basic webscraping
#1
Hi guys,
I recently started learning Python and have now moved on to web scraping with BeautifulSoup. The following code is not working for me in PyCharm. I hope someone can help me with my problem. It would be greatly appreciated :)

import requests
from bs4 import BeautifulSoup
data=requests.get('https://umggaming.com/leaderboards')
soup=BeautifulSoup(data.text, 'html.parser')

leaderboard=soup.find('div', { 'id': 'leaderboard-table' })

tbody=leaderboard.find('tbody')

for tr in tbody.find_all('tr'):
    place=tr.find_all('td')[0].text.strip()
    username=tr.find_all('td')[1].find_all('a')[1].text.strip()
    xp=tr.find_all('td')[3].text.strip()
    print(place, username, xp)
I get the following error note:
 tbody=leaderboard.find('tbody')
AttributeError: 'NoneType' object has no attribute 'find'
#2
Quote:I get the following error note:
Please always post complete and unaltered error traceback (in code tags), it contains very valuable debugging information.

That website requires browser access, and is returning a 503 error when called from code.

I think you might be able to scrape this using selenium.
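To see this failure for yourself before reaching for selenium, here is a minimal sketch that checks the HTTP status and then checks whether the div was found at all. The helper names find_leaderboard and fetch_leaderboard and the 10-second timeout are my own choices for illustration, not anything the site or the original code defines.

```python
import requests
from bs4 import BeautifulSoup


def find_leaderboard(html):
    """Return the leaderboard <div>, or None if the HTML does not contain it."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find("div", {"id": "leaderboard-table"})


def fetch_leaderboard(url):
    """Download the page and return the leaderboard <div>, failing loudly otherwise."""
    response = requests.get(url, timeout=10)
    # Turns a 4xx/5xx reply (such as the 503 mentioned above) into an exception.
    response.raise_for_status()
    leaderboard = find_leaderboard(response.text)
    if leaderboard is None:
        # find() returns None when nothing matches; calling .find() on that
        # None is what produces the AttributeError from post #1.
        raise ValueError("no element with id 'leaderboard-table' in the returned HTML")
    return leaderboard
```

Calling fetch_leaderboard('https://umggaming.com/leaderboards') should then stop with an HTTP error or a clear ValueError instead of the AttributeError.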
#3
I'm not sure exactly what you are trying to do, but here's working code.

Your first error occurs because soup.find() did not find the div, so leaderboard is None, and calling .find() on None raises the AttributeError. Search from soup instead of from leaderboard.

You did:
tbody=leaderboard.find('tbody')
You need:
tbody = soup.find('tbody')
You will then get a second error, "list index out of range", unless you change the [1] indexes to [0], as in the code below.
Also, please put spaces around = and after commas.
Hope it helps.

import requests
from bs4 import BeautifulSoup

data = requests.get('https://umggaming.com/leaderboards')
soup = BeautifulSoup(data.text, 'html.parser')

# Returns None when the page has no element with this id.
leaderboard = soup.find('div', {'id': 'leaderboard-table'})

# Search the whole document rather than the (possibly missing) div.
tbody = soup.find('tbody')

for tr in soup.find_all('tr'):
    tds = tr.find_all('td')
    place = tds[0].text.strip()
    username = tds[0].find_all('a')[0].text.strip()
    xp = tds[0].text.strip()
    print(place, username, xp)
I'm assuming the [1] in your code is meant to select the second <td> element? If so, note that [1] raises "list index out of range" whenever the list has fewer than two items. I use re.findall instead.
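If you would rather stay with BeautifulSoup indexing, one option is to check the list lengths before indexing. This is only a sketch: SAMPLE_HTML is made-up markup (the real page's structure is not shown in this thread), parse_rows is a hypothetical helper, and taking the last link in the cell as the username is an assumption.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page may differ.
SAMPLE_HTML = """
<table>
  <thead><tr><th>Place</th><th>User</th><th>Wins</th><th>XP</th></tr></thead>
  <tbody>
    <tr><td>1</td><td><a href="#">avatar</a><a href="#">alice</a></td><td>10</td><td>900</td></tr>
    <tr><td>2</td><td><a href="#">bob</a></td><td>8</td><td>750</td></tr>
  </tbody>
</table>
"""


def parse_rows(html):
    """Return (place, username, xp) tuples, skipping rows that lack the expected cells."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        tds = tr.find_all("td")
        if len(tds) < 4:
            continue  # e.g. header rows, which use <th> and have no <td> cells
        links = tds[1].find_all("a")
        if not links:
            continue  # no link in the user cell, so nothing to read
        place = tds[0].text.strip()
        username = links[-1].text.strip()  # assumption: the last link holds the name
        xp = tds[3].text.strip()
        rows.append((place, username, xp))
    return rows


for row in parse_rows(SAMPLE_HTML):
    print(*row)
```

Rows that do not have the expected shape are skipped instead of raising "list index out of range".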

