*Beginner* web scraping/Beautiful Soup help

7ken8 · (This post was last modified: Jan-28-2021, 11:50 AM by buran.)

Hello all!

I am trying to scrape a table of reviews from an album's Wikipedia page, using Beautiful Soup and requests. I have become stuck trying to visualise this.

It is the "Critical reception" table on the page for the Ed Sheeran 2017 album "÷". When I inspect it, it says it is a 'wikitable floatright', but I cannot understand what kind of data the words are. https://en.wikipedia.org/wiki/%C3%B7_(album)

My code so far has been

import requests
from bs4 import BeautifulSoup

soup = BeautifulSoup(response.text)

url1 = "÷ (album) - Wikipedia"
s = requests.Session()
response = s.get(url1, timeout = 10)
response


right_table = soup.find('table', {"class": 'wikitablefloatright'})


header = [th.text.rstrip() for th in right_table [0].find_all('th')]
print(header)
print('------')
print(len(header))

The final cell writes ‘NoneType’ object is not subscriptable.

Here is the inspection for the table. Let me know if anything is unclear - I am a beginner.

Many thanks,

buran wrote Jan-28-2021, 11:50 AM:
Please use proper tags when posting code, tracebacks, output, etc. This time I have added the tags for you.
See BBcode help for more info.

**buran** · (This post was last modified: Jan-28-2021, 12:01 PM by buran.)

There are multiple issues with the code you posted, to the extent that it will never run, nor produce the error you describe.

import requests
from bs4 import BeautifulSoup
url = "https://en.wikipedia.org/wiki/%C3%B7_(album)"
response = requests.get(url, timeout = 10)
soup = BeautifulSoup(response.text, 'html.parser')
right_table = soup.find('table', {'class': 'wikitable floatright'})
header = [th.text.rstrip() for th in right_table.find_all('th')]
print(header)
print('------')
print(len(header))

Output:
['Aggregate scores', 'Source', 'Rating', 'Review scores', 'Source', 'Rating']
------
6
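
For anyone comparing the two versions: the original looked up the class 'wikitablefloatright' (no space), so find() returned None, and it then indexed the result with [0] even though find() returns a single tag rather than a list. The following is only a minimal sketch of that behaviour. SAMPLE_HTML is a made-up stand-in for the Wikipedia markup so it runs offline, and header_texts is a hypothetical helper, not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the Wikipedia markup, so the sketch runs offline.
SAMPLE_HTML = """
<table class="wikitable floatright">
  <tr><th>Source</th><th>Rating</th></tr>
  <tr><td>Example Review</td><td>4/5</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# No table carries the single class "wikitablefloatright", so find() returns None.
missing = soup.find("table", {"class": "wikitablefloatright"})

# find() returns one Tag (not a list), so it must not be indexed with [0].
table = soup.find("table", {"class": "wikitable floatright"})

# A CSS selector matches both classes regardless of their order in the attribute.
same_table = soup.select_one("table.wikitable.floatright")


def header_texts(tag):
    """Return the stripped text of every <th>, or [] if the table was not found."""
    if tag is None:
        return []
    return [th.get_text(strip=True) for th in tag.find_all("th")]


headers = header_texts(table)
print(missing)
print(headers)
```

Checking for None before calling find_all() turns the 'NoneType' error into an empty result that is easier to debug.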

7ken8 · Jan-28-2021, 04:26 PM

Hi Buran,

Thank you for your help. I was just wondering how I can present the td values for the scores in that output. Some publications' ratings are shown as images, but on inspection the stars out of five do appear in the image's description. How can I present these as figures? Thanks

PS thank you for the tag notes.
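
One possible approach to the question above, as a rough sketch only: read each rating cell's text first, then fall back to the title or alt attribute of any span or img inside it. It assumes the star images carry text such as "4/5 stars" in those attributes, which should be checked against the real page. SAMPLE_HTML, RATING and cell_rating are hypothetical names made up for this illustration.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup: assumes star ratings carry text such as "4/5 stars"
# in a title or alt attribute; the real page may differ.
SAMPLE_HTML = """
<table class="wikitable floatright">
  <tr><th>Source</th><th>Rating</th></tr>
  <tr><td>Paper A</td><td><span title="4/5 stars"><img alt="4/5 stars" src="s.png"></span></td></tr>
  <tr><td>Paper B</td><td>7/10</td></tr>
  <tr><td>Paper C</td><td><img alt="2.5/5 stars" src="s.png"></td></tr>
</table>
"""

# Matches "4/5", "2.5/5", "7/10" and similar score/maximum pairs.
RATING = re.compile(r"(\d+(?:\.\d+)?)\s*/\s*(\d+)")


def cell_rating(td):
    """Return (score, maximum) from a rating cell, or None if nothing matches."""
    # Try the visible text first, then the title/alt text of nested tags.
    candidates = [td.get_text(" ", strip=True)]
    for tag in td.find_all(["span", "img"]):
        candidates.append(tag.get("title", ""))
        candidates.append(tag.get("alt", ""))
    for text in candidates:
        match = RATING.search(text)
        if match:
            return float(match.group(1)), int(match.group(2))
    return None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table", {"class": "wikitable floatright"})

ratings = {}
for row in table.find_all("tr"):
    cells = row.find_all("td")
    # Header rows contain only <th> cells and are skipped here.
    if len(cells) == 2:
        ratings[cells[0].get_text(strip=True)] = cell_rating(cells[1])

print(ratings)
```

Cells without a recognisable score (for example a text-only verdict) come back as None, so they can be filtered out or handled separately.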
