I want to be able to extract data from multiple pages. The pages are in the following format:
https://www.trademe.co.nz/browse/categoryattributesearchresults.aspx?cid=5748&search=1&134=9&135=2&rptpath=350-5748-&rsqid=d4360a620e944164b321dc2498f327b9-002&nofilters=1&originalsidebar=1&key=1227701521&page=1&sort_order=price_asc
https://www.trademe.co.nz/browse/categoryattributesearchresults.aspx?cid=5748&search=1&134=9&135=2&rptpath=350-5748-&rsqid=d4360a620e944164b321dc2498f327b9-002&nofilters=1&originalsidebar=1&key=1227701521&page=2&sort_order=price_asc
https://www.trademe.co.nz/browse/categoryattributesearchresults.aspx?cid=5748&search=1&134=9&135=2&rptpath=350-5748-&rsqid=d4360a620e944164b321dc2498f327b9-002&nofilters=1&originalsidebar=1&key=1227701521&page=3&sort_order=price_asc
In these links the only thing that changes in the URL is the number following page=. I have created code so far that exports the results into a CSV file; however, it only works for one URL:
```python
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'https://www.trademe.co.nz/browse/categoryattributesearchresults.aspx?cid=5748&search=1&134=9&135=2&rptpath=350-5748-&rsqid=d4360a620e944164b321dc2498f327b9-002&nofilters=1&originalsidebar=1&key=1227701521&page=1&sort_order=price_asc'

# opening up connection, grabbing the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

# html parser
page_soup = soup(page_html, "html.parser")

# grabs each property
listings = page_soup.findAll("div", {"class": "tmp-search-card-list-view__card-content"})

filename = "trademe.csv"
f = open(filename, "w")
headers = "title, price, area\n"
f.write(headers)

for listing in listings:
    title_listing = listing.findAll("div", {"class": "tmp-search-card-list-view__title"})
    price_listing = listing.findAll("div", {"class": "tmp-search-card-list-view__price"})
    area_listing = listing.findAll("div", {"class": "tmp-search-card-list-view__subtitle"})

    title = title_listing[0].text.strip()
    price = price_listing[0].text.strip()
    area = area_listing[0].text.strip()

    print("title: " + title)
    print("price: " + price)
    print("area: " + area)

    f.write(title.replace(",", "^") + "," + price.replace(",", "") + "," + area.replace(",", "^") + "\n")

f.close()
```

How would I get this working so that it keeps going through all of the page URLs?
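Since only the page= value changes between the URLs, one approach is to generate each URL in a loop rather than listing them all out. A minimal sketch, assuming the results span pages 1 to 3 (adjust the range to however many pages actually exist):

```python
# Search-results URL with a {} placeholder where the page number goes.
base_url = ('https://www.trademe.co.nz/browse/categoryattributesearchresults.aspx'
            '?cid=5748&search=1&134=9&135=2&rptpath=350-5748-'
            '&rsqid=d4360a620e944164b321dc2498f327b9-002&nofilters=1'
            '&originalsidebar=1&key=1227701521&page={}&sort_order=price_asc')

# Build one URL per page; the page count (3) is an assumption.
urls = [base_url.format(page) for page in range(1, 4)]

for url in urls:
    print(url)
```

With the URLs in a list, the existing scraping code can be wrapped in a `for url in urls:` loop: open the CSV and write the header once before the loop, fetch and parse each url inside it, and close the file after the loop so every page's rows end up in the same file.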
I could create a text file with the possible links, but I'm still not sure how to get this to work.
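If the text-file route is preferred, a small helper can read one URL per line. This is a sketch; the file name links.txt and the function read_links are assumed names, not part of the original code:

```python
def read_links(path):
    # Return the non-empty, whitespace-stripped lines of a text file,
    # one URL per line (e.g. a file named "links.txt").
    with open(path) as link_file:
        return [line.strip() for line in link_file if line.strip()]
```

The returned list can then drive a loop that fetches and parses each URL in turn, writing all rows to the same CSV file.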
I'm new to Python.