Looping through multiple pages with changing URL
#1
Hi, I am new to web scraping and have just managed to write my first working script. However, it can only extract data from the first page, and I have not been able to apply the solutions offered online successfully. I would be very glad if someone could help me write a complete script that extracts data from all pages. Below is my current working script:


from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'https://www.merchantcircle.com/search?q=self-storage&qn='

#opens connection, grabbing the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

#page parser
page_soup = soup(page_html, "html.parser")

businesses = page_soup.findAll("div",{"class":"hInfo vcard"})


filename = "storage.csv"
f = open(filename, "w")

#headers = "brand, product_name, price, shipping\n"
headers = "biz_name, biz_address, biz_phone_num\n"

f.write(headers)

for business in businesses:
	#grabs business name
	biz_name = business.h2.a.text.strip()

	#grabs business address
	address = business.find("a",{"class":"directions"})
	biz_address = address.text.strip()	


	#grabs phone number
	phone_num = business.find("a",{"class":"phone digits tel"})
	biz_phone_num = phone_num.text.strip()



	print("biz_name: " + biz_name)
	print("biz_address: " + biz_address)
	print("biz_phone_num: " + biz_phone_num)

	f.write(biz_name + "," + biz_address.replace(",", "|") + "," + biz_phone_num + "\n")

f.close()
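One way to extend the script above to more than the first page is to loop over a page parameter in the URL. Note that the parameter name `start` and the page size of 20 below are assumptions — the thread doesn't show how merchantcircle actually paginates, so check the site's own "next" link before relying on this sketch:

```python
from urllib.parse import urlencode

BASE = "https://www.merchantcircle.com/search"

def page_url(page, per_page=20, q="self-storage"):
    # Build the URL for a zero-based page number.
    # ASSUMPTION: the site paginates via a "start" offset parameter.
    return BASE + "?" + urlencode({"q": q, "start": page * per_page})

def scrape_all_pages(max_pages=10):
    # Fetch page after page until one comes back with no results.
    from urllib.request import urlopen
    from bs4 import BeautifulSoup
    for page in range(max_pages):
        html = urlopen(page_url(page)).read()
        page_soup = BeautifulSoup(html, "html.parser")
        businesses = page_soup.findAll("div", {"class": "hInfo vcard"})
        if not businesses:   # past the last page of results
            break
        for business in businesses:
            yield business
```

Each yielded business can then be written to the CSV exactly as in the loop above.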

#2
I'm not sure which links you're interested in, but using your first result, 'businesses', on line 14, for example,
you can pull the link to kabbage by adding after line 14:
next_url = businesses[0].h2.a.get('href')
(note the [0] — findAll returns a list, so you index into it to reach a single business)
and the review link for the same using review_url = businesses[0].div.a.get('href')
I'd also use requests rather than urllib.
See examples in the following two threads:
web scraping part1
web scraping part2
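For reference, the fetch-and-parse part of the original script translated to requests might look like the sketch below; splitting fetching from parsing also makes the parser easy to try out on saved HTML:

```python
import requests
from bs4 import BeautifulSoup

def fetch_html(url):
    # requests manages the connection and text decoding for you.
    response = requests.get(url, timeout=10)
    response.raise_for_status()   # fail loudly on 4xx/5xx responses
    return response.text

def parse_businesses(html):
    # Same selector as the original script.
    page_soup = BeautifulSoup(html, "html.parser")
    return page_soup.findAll("div", {"class": "hInfo vcard"})
```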
#3
Oh, sorry for the omission. I'm interested in the page links so that I can extract all 65,000+ records. Currently I'm only able to extract the 21 records on page one.
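Since the goal is all 65,000+ records, a sturdier approach than guessing page-number parameters is to follow the pager's own "next" link until it disappears. The a.next selector below is an assumption — inspect the pager markup on the actual site and adjust it:

```python
from urllib.parse import urljoin
from urllib.request import urlopen
from bs4 import BeautifulSoup

def next_page_url(page_soup, current_url):
    # Find the pager's "next" link, if any, and resolve it against the
    # current URL.  ASSUMPTION: the link carries the class "next".
    link = page_soup.find("a", {"class": "next"})
    return urljoin(current_url, link.get("href")) if link else None

def follow_pages(start_url, max_pages=5000):
    # Yield one parsed page at a time until there is no "next" link.
    url = start_url
    for _ in range(max_pages):
        page_soup = BeautifulSoup(urlopen(url).read(), "html.parser")
        yield page_soup
        url = next_page_url(page_soup, url)
        if url is None:
            break
```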


