Python Forum
Webscraping homework
#1
Hi all, I really need some help with a Python assignment. The assignment consists of writing Python code to web scrape data and assemble a dataset.

The goal of the Python code is to automatically download and save to local disk all .xls files available at http://www.cde.ca.gov/ds/sp/ai/ . These are clustered in three groups:
• SAT test results 1998-2016
• ACT test results 1998-2016
• AP test results 1998-2016
each consisting of 18 individual .xls files. The scraper should fetch the webpage (with requests.get()), identify the links to the files (with BeautifulSoup()), and save the contents of each individual URL to a distinct file (with requests.get() and .write()).
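
For reference, the three steps described above might be sketched roughly like this. It is only a starting point under some assumptions: that the .xls links appear directly in the HTML of that page (they may actually sit on sub-pages, and the site layout may have changed), and that the file name at the end of each URL is unique. The names find_xls_links, download_all and the xls_files output folder are made up for illustration and are not part of the assignment.

```python
import os
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

PAGE_URL = "http://www.cde.ca.gov/ds/sp/ai/"  # page named in the assignment


def find_xls_links(html, base_url):
    """Return absolute URLs of all links in `html` whose path ends in .xls."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for a in soup.find_all("a", href=True):
        # Resolve relative hrefs against the page URL.
        url = urljoin(base_url, a["href"])
        # Compare only the path so a query string does not hide the extension.
        if urlparse(url).path.lower().endswith(".xls") and url not in links:
            links.append(url)
    return links


def download_all(page_url=PAGE_URL, out_dir="xls_files"):
    """Fetch the page, then save each linked .xls file into out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    page = requests.get(page_url, timeout=30)
    page.raise_for_status()
    for url in find_xls_links(page.text, page_url):
        # Assumption: the last path component is a usable, unique file name.
        filename = os.path.basename(urlparse(url).path)
        response = requests.get(url, timeout=60)
        response.raise_for_status()
        # .xls files are binary, so write bytes rather than text.
        with open(os.path.join(out_dir, filename), "wb") as f:
            f.write(response.content)
```

Calling download_all() would run the whole thing. Keeping the link extraction in its own function means it can be tried on a saved copy of the page first, without hitting the site on every run.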
#2
What do you have so far? What errors are you getting?

We can help, but we won't just do it for you lol