Help with basic webscraping

Captain_Snuggle · (This post was last modified: Nov-07-2019, 02:37 PM by Captain_Snuggle.)

Hi guys,
I recently started learning Python and have now moved on to webscraping with BeautifulSoup. The following code is not working for me in PyCharm. I hope someone can help me with my problem. It would be greatly appreciated :)

import requests
from bs4 import BeautifulSoup
data=requests.get('https://umggaming.com/leaderboards')
soup=BeautifulSoup(data.text, 'html.parser')

leaderboard=soup.find('div', { 'id': 'leaderboard-table' })

tbody=leaderboard.find('tbody')

for tr in tbody.find_all('tr'):
    place=tr.find_all('td')[0].text.strip()
    username=tr.find_all('td')[1].find_all('a')[1].text.strip()
    xp=tr.find_all('td')[3].text.strip()
    print(place, username, xp)

I get the following error note:

 tbody=leaderboard.find('tbody')
AttributeError: 'NoneType' object has no attribute 'find'

Larz60+ · (This post was last modified: Nov-07-2019, 03:15 PM by Larz60+.)

Quote: I get the following error note:

Please always post the complete, unaltered error traceback (in code tags); it contains very valuable debugging information.

That website requires browser access, and is returning a 503 error when called from code.

I think you might be able to scrape this using selenium.
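To make a failure like the 503 visible before any parsing happens, the status code and the result of each find() call can be checked explicitly. The following is only a rough sketch: the helper name find_leaderboard_body is made up for illustration, and it assumes the page really does contain a div with id "leaderboard-table" when it loads successfully.

```python
from bs4 import BeautifulSoup


def find_leaderboard_body(status_code, html):
    """Return the leaderboard <tbody>, or raise a descriptive error.

    Hypothetical helper for illustration; not part of requests or bs4.
    """
    # A non-200 status (such as the 503 mentioned above) means the body is
    # not the leaderboard page, so there is nothing useful to parse.
    if status_code != 200:
        raise RuntimeError(f"request failed with HTTP status {status_code}")

    soup = BeautifulSoup(html, "html.parser")

    # find() returns None when nothing matches, so check before chaining.
    leaderboard = soup.find("div", {"id": "leaderboard-table"})
    if leaderboard is None:
        raise RuntimeError("no div with id 'leaderboard-table' in the response")

    tbody = leaderboard.find("tbody")
    if tbody is None:
        raise RuntimeError("leaderboard div contains no <tbody>")
    return tbody
```

With the code from the first post this would be called as find_leaderboard_body(data.status_code, data.text), which turns the confusing AttributeError on None into a message that names the actual cause.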

kozaizsvemira · (This post was last modified: Nov-07-2019, 08:07 PM by kozaizsvemira.)

I'm not sure what you are trying to do, but here's working code.

Your first error comes from calling find on the leaderboard variable, which is None here, instead of on soup.

You did:

tbody=leaderboard.find('tbody')

You need:

tbody = soup.find('tbody')

You will then get a second error, "list index out of range", unless you change [1] and [1] to [0] and [0], as in the code below.
And please put spaces around = and after commas.
Hope it helps.

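import requests
from bs4 import BeautifulSoup

data = requests.get('https://umggaming.com/leaderboards')
soup = BeautifulSoup(data.text, 'html.parser')

# Kept from the original; this is None when the div is missing from the response.
leaderboard = soup.find('div', {'id': 'leaderboard-table'})

# Search the whole document instead of the leaderboard div.
tbody = soup.find('tbody')

for tr in soup.find_all('tr'):
    # Index [0] selects the first <td> in the row and the first <a> inside it.
    place = tr.find_all('td')[0].text.strip()
    username = tr.find_all('td')[0].find_all('a')[0].text.strip()
    xp = tr.find_all('td')[0].text.strip()
    print(place, username, xp)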

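I'm assuming that by using [1] in your code you are trying to select the second <td> element? If so, [1] only works when the row actually contains that many cells; otherwise it raises "list index out of range". I use re.findall instead.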
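For what it's worth, find_all returns a list-like result, so [1] does select the second match by position; an IndexError generally means a row has fewer cells than expected, for example a header row that only contains <th> cells. Below is a minimal sketch of guarding against that. The sample markup and the extract_rows helper are invented for illustration, and the real page's columns may be laid out differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for illustration; the real page may differ.
SAMPLE_HTML = """
<table>
  <thead><tr><th>Place</th><th>User</th><th>Games</th><th>XP</th></tr></thead>
  <tbody>
    <tr><td>1</td><td><a href="/u/1">avatar</a><a href="/u/1">alice</a></td><td>10</td><td>900</td></tr>
    <tr><td>2</td><td><a href="/u/2">avatar</a><a href="/u/2">bob</a></td><td>8</td><td>750</td></tr>
  </tbody>
</table>
"""


def extract_rows(html):
    """Return (place, username, xp) tuples for rows that have enough cells."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = tr.find_all("td")  # look the cells up once per row
        if len(cells) < 4:
            continue  # skip header rows and anything else that is too short
        links = cells[1].find_all("a")
        if len(links) < 2:
            continue  # the second link is assumed to hold the username
        rows.append((
            cells[0].text.strip(),
            links[1].text.strip(),
            cells[3].text.strip(),
        ))
    return rows


for place, username, xp in extract_rows(SAMPLE_HTML):
    print(place, username, xp)
```

Checking the lengths first keeps the positional indexes from the original question ([0], [1] and [3]) while skipping rows that would otherwise raise an error.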
