Python Forum
Can't figure out how to scrape grid
#3
(Jun-13-2024, 09:54 PM)Larz60+ Wrote: You will need to use a scraper that can recognize and click on the audit checkbox, then wait until the new page loads before downloading it. Here are some links that will help:

how to locate elements
Click on Checkbox
pageLoadStrategy

Thanks! I used those sources to write the code below, but I'm getting a blank CSV and I'm not sure what I'm doing wrong.

import requests
from bs4 import BeautifulSoup
import csv

url = "https://oig.hhs.gov/reports-and-publications/all-reports-and-publications/"
response = requests.get(url)
soup = BeautifulSoup(response.content, "html.parser")

reports = soup.find_all("div", class_="media")

report_data = []

def text_or_na(element):
    """Return the element's stripped text, or "N/A" when the lookup found nothing."""
    return element.get_text(strip=True) if element else "N/A"

# Build one row per report block; missing fields fall back to "N/A"
for report in reports:
    report_data.append({
        "Title": text_or_na(report.find("h3")),
        "Audit": text_or_na(report.find("span", class_="audit")),
        "Agency": text_or_na(report.find("span", class_="agency")),
        "Date": text_or_na(report.find("span", class_="date"))
    })

# Export to CSV
csv_file = "reports_data.csv"
with open(csv_file, mode='w', newline='', encoding='utf-8') as file:
    writer = csv.DictWriter(file, fieldnames=["Title", "Audit", "Agency", "Date"])
    writer.writeheader()
    for data in report_data:
        writer.writerow(data)

print(f"Data exported to {csv_file}")
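
One way to narrow this down, as a rough sketch rather than a fix: if the CSV contains only the header row, then report_data was empty, which means the selector ("div" with class "media") matched nothing in the HTML that requests downloaded. Note that requests only fetches the raw HTML; it does not click the audit checkbox or run any JavaScript. The snippet below counts selector matches using only the standard library so it runs offline. ClassCounter, count_matches, the sample HTML and the "example-card" class are all made up for illustration and are not the real page markup.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags of a given name that carry a given CSS class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a value may be None
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self.count += 1


def count_matches(html, tag, css_class):
    """Return how many <tag class="... css_class ..."> elements appear in html."""
    counter = ClassCounter(tag, css_class)
    counter.feed(html)
    return counter.count


# Hypothetical stand-in for response.text; NOT the real page markup.
sample_html = '<div class="example-card"><h3>Example report</h3></div>'

print(count_matches(sample_html, "div", "media"))         # 0, so the for loop would never run
print(count_matches(sample_html, "div", "example-card"))  # 1
```

Running count_matches(response.text, "div", "media") against the real response would show whether the selector matches anything at all. A result of 0 would point to either the wrong tag/class or content that only appears after the page's scripts run.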

Messages In This Thread
Can't figure out how to scrape grid - by templeowls - Jun-13-2024, 05:14 PM
RE: Can't figure out how to scrape grid - by templeowls - Jun-18-2024, 12:57 PM
