Python Forum

Full Version: HTTP request inside a for loop.
I am writing a web scraping script where I have a list of URLs to scrape, and I iterate through them with a for loop. The URLs all return the same page structure with different data, so I use the same code to scrape each of them. My problem is that when I run the script, every URL in the list returns an HTTP 404 error. I think it's because the for loop is running faster than 'session.get()' can return the pages. I may be wrong, so please advise what you think the problem is. Please see the code below.

import os
from datetime import datetime

import requests

session = requests.session()
url_list = list(current_urls_set)
if len(url_list) > 0:
   payload = {
       'UserName': 'myemail',
       'Password': 'mypassword'
   }
   session.post('website url here', data=payload)
   rfq_dir = 'C:/Projects/TenderBot_Python/Tenders_and_RFQs/cpt/RFQs/{}'.format(datetime.today().strftime('%d-%m-%Y'))
   if not os.path.exists(rfq_dir):
       os.mkdir(rfq_dir)
   for url in url_list:
       data_list = url.split(',')
       closing_date = datetime.strptime(data_list[1], '%m/%d/%Y %I:%M:%S %p')
       if closing_date > datetime.now():
           rfqHtml = session.get('website url here{}'.format(data_list[3].strip()))
           print(rfqHtml.text)
Never mind this, I had the urls wrong.
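
For anyone who lands here with the same symptom: requests calls are blocking, so session.get() only returns once the response has arrived and the for loop cannot run ahead of it. A 404 on every URL usually points at the URLs themselves. Below is a minimal sketch of how that could be spotted, assuming a hypothetical helper name (fetch_or_report) that is not part of the original script. It reports the status code together with the address that was actually requested.

```python
def fetch_or_report(session, url):
    """Fetch url with a blocking GET; return the page text, or None on an HTTP error.

    `session` is expected to be a requests.Session (hypothetical helper,
    not part of the original script).
    """
    response = session.get(url)  # returns only after the response has arrived
    if response.status_code != 200:
        # response.url is the address that was actually requested,
        # which makes a malformed or wrong URL easy to see
        print('HTTP {} for {}'.format(response.status_code, response.url))
        return None
    return response.text
```

Inside the loop this would replace the bare session.get() call, for example rfq_text = fetch_or_report(session, full_url), where full_url is the address built from data_list[3]. Printing the requested address next to the status code is the design choice that matters here, since it shows at a glance whether the URL was built correctly.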