Python Forum

Scraping pages created by JavaScript
Hi all, thanks in advance for any help or advice.

I am trying to create a script to scrape data from a website that lists the status of childcare facilities in Oregon. This site is useful for parents who want to check whether their daycare is licensed, what its capacity is, etc. As the site is extremely slow to search and browse, I wanted to automatically scrape data for a set of facilities.

Here is the starting page:

Here is my dilemma:
1. Type "little birds" in for facility name and click the search button.
2. When the results are returned, click the "Select" button for the 2nd option, which should be "Little Birds Playschool" with a Licensed status of "Closed - License Type Changed"
3. This opens a new page with the URL of: https://childcaresafetyportal.ode.state....ils/202022
4. If I try to scrape that URL from step 3 or open it in a separate browser tab, I get a 404 error. It appears this page is created by JavaScript, so you can't just bookmark a URL and go back to it.

I'm just trying to figure out if there is a way to scrape data from this portal/database. I don't know JavaScript well enough to decipher what is being called or to tell whether I can pass the right info to scrape the data.

Again, any tips, tricks, or past examples you could point me to would be great. THANK YOU!!!

This type of web page is best scraped using Selenium.

You can quickly bring yourself up to speed by running through snippsat's web scraping tutorials:
Web-Scraping part-1
Web-Scraping part-2
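
To make the Selenium suggestion concrete, here is a rough sketch of steps 1-3 from the question: drive the search form in a real browser, pick the matching result row, and click its "Select" button instead of requesting the details URL directly. Everything specific to the portal is an assumption, because the thread does not show the page's markup: START_URL is a placeholder for the starting page, and the element IDs, CSS selector, and column order (name first, status second) are hypothetical and need to be replaced after inspecting the real page. It also assumes Chrome and the selenium package are installed.

```python
# Placeholder: the starting page URL is not included in the thread.
START_URL = "PUT-THE-PORTAL-SEARCH-PAGE-URL-HERE"


def find_row_index(rows, name, status):
    """Return the index of the first (name, status) pair matching both, or -1."""
    for i, (row_name, row_status) in enumerate(rows):
        if (row_name.strip().lower() == name.strip().lower()
                and row_status.strip().lower() == status.strip().lower()):
            return i
    return -1


def scrape_facility(search_text, name, status, timeout=30):
    """Search, select the matching result, and return the details text."""
    # Imported here so find_row_index can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(START_URL)
        wait = WebDriverWait(driver, timeout)

        # Hypothetical locators: replace with the real ones from the page.
        box = wait.until(EC.presence_of_element_located((By.ID, "facility-name")))
        box.send_keys(search_text)
        driver.find_element(By.ID, "search-button").click()

        # Wait for the JavaScript-rendered result rows to appear.
        result_rows = wait.until(EC.presence_of_all_elements_located(
            (By.CSS_SELECTOR, "table.results tbody tr")))

        # Assumed column order: facility name, then license status.
        pairs = []
        for row in result_rows:
            cells = row.find_elements(By.TAG_NAME, "td")
            pairs.append((cells[0].text, cells[1].text) if len(cells) >= 2 else ("", ""))

        idx = find_row_index(pairs, name, status)
        if idx < 0:
            raise LookupError(f"No result for {name!r} with status {status!r}")

        # Click that row's "Select" button, then wait for the details view.
        result_rows[idx].find_element(
            By.XPATH, ".//button[normalize-space()='Select']").click()
        details = wait.until(
            EC.presence_of_element_located((By.ID, "facility-details")))
        return details.text
    finally:
        driver.quit()
```

Once the placeholders are filled in, a call such as scrape_facility("little birds", "Little Birds Playschool", "Closed - License Type Changed") would follow the same path as the manual steps. Matching on name and status rather than clicking "the 2nd option" is a design choice so the script does not break if the result order changes.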