Oct-09-2018, 08:45 PM
Hi All,
I'm trying to perform what I thought would be a simple web scraping task, but am running into an issue I am unable to figure out. The page is an '.aspx' page, which I suspect has something to do with it.
The task:
(i) Get the name of the school + all the links for elementary schools on this webpage - http://www.yrdsb.ca/Schools/Pages/default.aspx
(ii) On an individual school's webpage, retrieve the grades, address, phone, fax, email, and bell times. Example page for a school: http://www.yrdsb.ca/Schools/Pages/School...hoolID=227
I am able to load the webpage into bs4, but from that point I am unable to use CSS selectors or anything else I can think of to locate the data I am looking for in the response page.
Does anyone have any clever ideas???
from bs4 import BeautifulSoup
import urllib3

url = "http://www.yrdsb.ca/Schools/Pages/School-Profile.aspx?SchoolID=227"
http = urllib3.PoolManager()
r = http.request('get', url)
soup = BeautifulSoup(r.data, 'lxml')
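For task (i), a rough way to check whether the school links are present in the raw response at all is sketched below. It uses only the standard library so it can be run without the network. The class and function names, the sample HTML, and the assumption that every school link contains "SchoolID=" in its href (guessed from the example URL above) are all hypothetical and not taken from the real page.

```python
from html.parser import HTMLParser


class SchoolLinkParser(HTMLParser):
    """Collect (link text, href) pairs for anchors whose href contains a marker."""

    def __init__(self, marker="SchoolID="):
        super().__init__()
        self.marker = marker      # assumed substring that identifies a school link
        self.links = []           # collected (name, href) pairs
        self._href = None         # href of the matching anchor currently open, if any
        self._text = []           # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if self.marker in href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text while inside a matching anchor.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def find_school_links(html, marker="SchoolID="):
    """Return a list of (name, href) pairs found in an HTML string."""
    parser = SchoolLinkParser(marker)
    parser.feed(html)
    parser.close()
    return parser.links


# Invented sample markup, only to show the shape of the result.
sample = """
<ul>
  <li><a href="/Schools/Pages/School-Profile.aspx?SchoolID=227">Example P.S.</a></li>
  <li><a href="/About/Pages/default.aspx">About</a></li>
</ul>
"""
print(find_school_links(sample))
```

Feeding it the decoded response body, for example find_school_links(r.data.decode("utf-8", errors="replace")), would show whether any such links exist in what the server actually returned. If the list comes back empty, one possible explanation is that the page fills in its content with JavaScript after loading, in which case no selector applied to the raw response would find it.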