Need help scraping wikipedia table

bborusz2 · (This post was last modified: Dec-01-2020, 04:59 PM by bborusz2.)

Hey guys,

I am fairly new to Python and still learning how to use it. I have been trying to use Beautiful Soup to scrape a Wikipedia table, https://en.wikipedia.org/wiki/List_of_ne...in_Chicago, and am having a lot of difficulty. I keep getting an empty DataFrame with just column headers. Could someone please walk me through the code needed to scrape this table? I would greatly appreciate it. Again, I don't need someone to do it for me, but I would like someone to talk me through it.

Here is what I've tried so far

Thanks! Hoping this site works!

**Larz60+** · Dec-01-2020, 05:21 PM

Please do not post links to code.
Post the code within the thread, using bbcode tags

***snippsat*** · Dec-01-2020, 05:27 PM

Pandas can scrape tables directly into a DataFrame, so you don't need BS for this.
Example

bborusz2 · Dec-01-2020, 05:34 PM

Sorry about that, still learning. Here is the code:

from bs4 import BeautifulSoup
import numpy as np 
import requests
import pandas as pd 

list_url = "https://en.wikipedia.org/wiki/List_of_neighborhoods_in_Chicago"
source = requests.get(list_url)

soup = BeautifulSoup(source.text, 'html.parser')

neighborhood_table=soup.find('table')

df=pd.read_html(str(neighborhood_table))

df.head()
[error]---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-62-c42a15b2c7cf> in <module>
----> 1 df.head()

AttributeError: 'list' object has no attribute 'head'[/error]

bborusz2 · (This post was last modified: Dec-01-2020, 11:03 PM by bborusz2.)

(Dec-01-2020, 05:21 PM)Larz60+ Wrote: Please do not post links to code.
Post the code within the thread, using bbcode tags

I followed those instructions, did that work better?

bborusz2 · Dec-01-2020, 11:03 PM

(Dec-01-2020, 05:27 PM)snippsat Wrote: Pandas can scrape tables directly into a DataFrame, so you don't need BS for this.
Example

I clicked on the link and was told I need authorization. Can you please recommend next steps? Thank you.

***snippsat*** · Dec-01-2020, 11:31 PM

Try the link now.

import pandas as pd

df = pd.read_html("https://en.wikipedia.org/wiki/List_of_neighborhoods_in_Chicago")
df = df[0]
print(df.head())

Output:
      Neighborhood  Community area
0      Albany Park     Albany Park
1  Altgeld Gardens       Riverdale
2    Andersonville       Edgewater
3   Archer Heights  Archer Heights
4    Armour Square   Armour Square
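
As for the AttributeError in the earlier attempt: pd.read_html() returns a list of DataFrames, one per table it finds, so .head() has to be called on an element of that list and not on the list itself. Here is a minimal sketch of that behaviour. The HTML string is a made-up stand-in for the Wikipedia page (two rows taken from the output above), so it runs without a network connection; it assumes an HTML parser such as lxml is installed.

```python
from io import StringIO

import pandas as pd

# Made-up stand-in for the Wikipedia page: a single small table.
HTML = """
<table>
  <thead>
    <tr><th>Neighborhood</th><th>Community area</th></tr>
  </thead>
  <tbody>
    <tr><td>Albany Park</td><td>Albany Park</td></tr>
    <tr><td>Altgeld Gardens</td><td>Riverdale</td></tr>
  </tbody>
</table>
"""

# read_html returns a list of DataFrames, even when there is only one table.
tables = pd.read_html(StringIO(HTML))
print(type(tables), len(tables))

# Pick the table you want by index, then use it like any other DataFrame.
df = tables[0]
print(df.head())
```

The same indexing step applies to the Beautiful Soup version: pd.read_html(str(neighborhood_table)) also gives back a list, so take element [0] before calling .head().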
