Jun-07-2018, 10:07 PM
Hello All,
I have a website that has 26 pages, one for each letter from 'a' to 'z'.
This is the URL of the first page: https://www.usa.gov/federal-agencies/a
I have a scraper that does what I want for a single page. I know that to all of you Python kings
it will look crude. What I need help with is how to scrape all 26 pages.
I have been all over the net looking for how to do it, and there is just not much out there.
I have found a few ways of doing it, but none of them work. So here I am, hoping someone can help.
Here is my code:
    #Python 3.7
    from html.parser import HTMLParser
    import requests
    from bs4 import BeautifulSoup

    r = requests.get('https://www.usa.gov/federal-agencies/a')
    first_page = r.text
    soup = BeautifulSoup(first_page, 'html.parser')
    page_soup = soup
    #page_soup.h1
    #page_soup.p
    boxes = page_soup.find_all('ul', {'class' : 'one_column_bullet'})
    boxes[0].text.strip()
    print(boxes)

I tried everything I could think of, mostly a lot of for loops.
Here is one that works a bit: it prints out the same page 26 times.

    #Python 3.7
    from html.parser import HTMLParser
    import requests
    from bs4 import BeautifulSoup
    from string import ascii_lowercase

    for letter in ascii_lowercase:
        r = requests.get('https://www.usa.gov/federal-agencies/' + letter +' ')
        first_page = r.text
        soup = BeautifulSoup(first_page, 'html.parser')
        page_soup = soup.find('h1')
        print(page_soup)

So if someone knows how to use my code to scrape all 26 pages, please let me know.
Thank you
renny
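
A note on the second script: it prints only the `<h1>` of each page, which may well be identical on all 26 pages, and it appends a trailing space to every URL. Below is a minimal sketch of one way to loop over the pages, assuming each letter page uses the same `ul` / `one_column_bullet` markup that the first script looks for. The names `page_urls`, `agency_names`, and `scrape_all` are hypothetical helpers made up for this illustration, not part of any library.

```python
from string import ascii_lowercase

BASE_URL = "https://www.usa.gov/federal-agencies/"


def page_urls():
    """Map each letter a-z to its index page URL (no trailing space)."""
    return {letter: BASE_URL + letter for letter in ascii_lowercase}


def agency_names(html):
    """Return the text of every <li> inside ul.one_column_bullet.

    Assumes the markup matches what the single-page scraper targets.
    """
    # Third-party import kept local so page_urls() works without it installed.
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    names = []
    for box in soup.find_all("ul", {"class": "one_column_bullet"}):
        for item in box.find_all("li"):
            names.append(item.get_text(strip=True))
    return names


def scrape_all():
    """Fetch all 26 pages and return {letter: [names found on that page]}."""
    import requests

    results = {}
    for letter, url in page_urls().items():
        response = requests.get(url, timeout=10)
        if not response.ok:
            # Skip pages that fail to load instead of stopping the whole run.
            continue
        results[letter] = agency_names(response.text)
    return results
```

Calling `scrape_all()` would return a dict keyed by letter, so the results for each page stay separate and a letter whose page fails to load is simply absent from the dict.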