Python Forum

Want to extract 4 tables from webpage - Newbie Stuck :(
Hi, I am trying to extract the 4 tables of cases and deaths (new+total) from this page:

https://coronavirus.data.gov.uk/#local-authorities

I have read numerous websites about how BeautifulSoup works, but I seem to be failing at each stage. It is very frustrating, but I'm sure that is down to my own inabilities!

So I am hoping I can get some help here.

The following seems to be the code that works least badly:

import requests
from bs4 import BeautifulSoup

import pandas as pd
import numpy as np

url = "https://coronavirus.data.gov.uk/#local-authorities"
r = requests.get(url)

soup = BeautifulSoup(r.content, 'html.parser')

table = soup.find_all('table')
rows = table.find('cell')
row_list = list()

for tr in rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    row_list.append(row)

df_bs = pd.DataFrame(row_list,columns=['City','Country','Notes'])
df_bs.set_index('Country',inplace=True)
df_bs.to_csv('beautifulsoup.csv')

It gives me the following error on the line "rows = table.find('cell')":

"ResultSet object has no attribute '%s'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?" % key

AttributeError: ResultSet object has no attribute 'find'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?

I am tearing my hair out!

Thanks, Andrew
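
The error message above already names the likely cause: find_all() returns a ResultSet, which behaves like a list of tags, so find() or find_all() has to be called on each table in turn rather than on the list itself. Below is a minimal sketch of that pattern. It assumes a small made-up HTML snippet in place of the real page (the area names and numbers are illustrative only), and it looks for 'tr' rows instead of 'cell', which is not an HTML tag. It does not check whether the HTML that requests receives from the live page contains any table elements; if the page builds its tables with JavaScript, soup.find_all('table') would simply return an empty list.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for the real page; names and numbers are illustrative only.
html = """
<table>
  <tr><th>Area</th><th>Total cases</th></tr>
  <tr><td>Area A</td><td>10</td></tr>
  <tr><td>Area B</td><td>20</td></tr>
</table>
<table>
  <tr><th>Area</th><th>Total deaths</th></tr>
  <tr><td>Area A</td><td>1</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns a ResultSet (a list of Tag objects), so loop over it
# and search inside each individual table.
tables = []
for table in soup.find_all("table"):
    rows = []
    for tr in table.find_all("tr"):
        # Header rows use <th> and data rows use <td>; collect both kinds of cell.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    tables.append(rows)

for number, rows in enumerate(tables, start=1):
    print(f"Table {number}: {rows}")
```

Each entry in tables is a list of rows, so one table could then be turned into a DataFrame with pd.DataFrame(rows[1:], columns=rows[0]). The column names then come from the table's own header row instead of a hard-coded list such as ['City','Country','Notes'], which has to match the number of cells in every row.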