Apr-19-2018, 06:49 PM
(Apr-19-2018, 05:43 PM)gentoobob Wrote: Let me ask you, will I have to have a separate "soup" variable for each URL or can I cram all the pages into one variable?

You pass each URL to Requests and BeautifulSoup inside the same loop, so a single soup variable is reused on every iteration.
Example:
import requests
from bs4 import BeautifulSoup

start = 1
stop = 5
for page in range(start, stop):
    url = 'https://10.10.10.0/vmrest/users?rowsPerPage=2000&pageNumber={}'.format(page)
    url_get = requests.get(url)
    soup = BeautifulSoup(url_get.content, 'lxml')
    foo = soup.find('do scraping')  # placeholder: do the actual scraping here
    # save foo
Quote: or can I cram all the pages into one variable?

Not a single variable, but a data structure such as a list; you can collect the URLs first.
from pprint import pprint

urls = []
start = 1
stop = 5
for page in range(start, stop):
    url = 'https://10.10.10.0/vmrest/users?rowsPerPage=2000&pageNumber={}'.format(page)
    urls.append(url)
pprint(urls)
Output:['https://10.10.10.0/vmrest/users?rowsPerPage=2000&pageNumber=1',
'https://10.10.10.0/vmrest/users?rowsPerPage=2000&pageNumber=2',
'https://10.10.10.0/vmrest/users?rowsPerPage=2000&pageNumber=3',
'https://10.10.10.0/vmrest/users?rowsPerPage=2000&pageNumber=4']
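Once the URLs are collected, the same loop pattern can parse every page and accumulate the results in one list. A minimal sketch below, using canned XML strings in place of `requests.get(url).content` so it runs without the server; the `<User>`/`<Alias>` tag names are assumptions about the vmrest response and should be adjusted to the real structure:

```python
from bs4 import BeautifulSoup

# Simulated page contents standing in for requests.get(url).content;
# the <User> and <Alias> tag names are assumed, not confirmed.
pages = [
    b'<Users><User><Alias>alice</Alias></User></Users>',
    b'<Users><User><Alias>bob</Alias></User></Users>',
]

all_users = []  # one list accumulating results from every page
for content in pages:
    soup = BeautifulSoup(content, 'lxml')
    # lxml's HTML parser lowercases tag names, so search for 'user'
    all_users.extend(user.alias.text for user in soup.find_all('user'))

print(all_users)
```

Against a live server you would loop over the collected `urls` list, fetch each with `requests.get`, and feed `response.content` to BeautifulSoup exactly as above.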