(Feb-26-2024, 08:16 PM)templeowls Wrote: It works well but I also want it to open each url and scrape the full text on the page for each. Any suggestions on how to modify this code to achieve?
It would be messy to try to integrate this into the code you already have.
Tip: make it work separately first, or keep everything separate and then add it all to a CSV at the end.
So as a first example, I start here by making complete links for all the articles.
import requests
from bs4 import BeautifulSoup

page_nr = 1
url = f"https://www.eeoc.gov/newsroom/search?page={page_nr}"
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')
all_link = soup.select('article > h2 > a')
all_href = [a['href'] for a in all_link]
# Make complete links
base_url = 'https://www.eeoc.gov'
news_links = []
for link in all_href:
    print(f'{base_url}{link}')
    news_links.append(f'{base_url}{link}')
Output:
https://www.eeoc.gov/newsroom/tc-wheelers-pay-25000-settle-eeoc-sex-harassment-lawsuit
https://www.eeoc.gov/newsroom/trinity-health-michigan-pay-50000-settle-eeoc-religious-discrimination-lawsuit
https://www.eeoc.gov/newsroom/cash-depot-pays-55000-settle-eeoc-disability-discrimination-lawsuit
https://www.eeoc.gov/newsroom/nebraska-court-orders-trucking-company-pay-deaf-driver-punitive-damages-lost-wages-after
......
Now that this is done, you can iterate over news_links,
open each link with requests/BeautifulSoup, and parse the articles.
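A minimal sketch of that next step could look like this. It assumes the article text sits inside the page's <main> tag (inspect a real article page first to confirm the right container), and the eeoc_news.csv filename is just an example; everything is collected first and written to CSV at the end, as mentioned above.

import csv
import time

import requests
from bs4 import BeautifulSoup

articles = []
for news_url in news_links:
    response = requests.get(news_url)
    soup = BeautifulSoup(response.content, 'html.parser')
    # 'main' is an assumption, check the page structure and adjust selector
    content = soup.find('main')
    text = content.get_text(separator=' ', strip=True) if content else ''
    articles.append((news_url, text))
    # Be polite to the server between requests
    time.sleep(1)

# Keep it all separate, then add to csv at the end
with open('eeoc_news.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerow(['url', 'text'])
    writer.writerows(articles)

Test it on a couple of links first (e.g. news_links[:2]) before running it over all pages.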