Running A Loop Until You See A Particular Result - Printable Version
Python Forum (https://python-forum.io), Python Coding, General Coding Help. Thread: https://python-forum.io/thread-34789.html

Running A Loop Until You See A Particular Result - knight2000 - Sep-01-2021

Hi guys,

I've been learning about rotating proxies and have found myself a little stuck, and after many hours I thought it was time to reach out for some assistance.

In a nutshell, I've got a list of proxies and I want to pick a random one from the list for each request. If the random choice picks a proxy that works, the code below works correctly. If it tries to use a proxy that doesn't work, I get an error.

I understand that the error is caused by the proxy not working (I tested with several working proxies to verify the problem), so what I'm trying to do is run a loop that picks a random proxy from my list each time a request is made. So I've got:

    from bs4 import BeautifulSoup
    import requests
    import random

    url = 'testurl.com'
    proxy_list = ['173.68.59.131:3128','64.124.38.139:8080','69.197.181.202:3128']
    proxies = random.choice(proxy_list)
    response = requests.get(url, headers=headers, proxies={'https': proxies}, timeout=3)
    if response.status_code == 200:
        print(response.status_code)
    elif response.status_code != 200:
        proxies = random.choice(proxy_list)
        response = requests.get(url, headers=headers, proxies={'https': proxies}, timeout=3)

(At the moment, the code simply prints a response code of 200 if it's successful, but I'll be changing that later to get HTML information.)

But anyway, the goal of the above code is to grab a random proxy from the list, test it to check whether it works and, if it does, do the request. Alternatively, if it doesn't, keep randomly looping through the proxy list until it finds a working proxy, and then go ahead and complete the request.

Can anyone please enlighten me as to how this can be done? Thanks a lot.

RE: Running A Loop Until You See A Particular Result - menator01 - Sep-01-2021

You might be able to use a try/except clause. Something like this (code not tested):

    #!/usr/bin/env python3
    import requests as rq
    import random as rnd
    import copy

    url = 'testurl.com'
    proxy_list = ['173.68.59.131:3128','64.124.38.139:8080','69.197.181.202:3128']
    proxy_copy = copy.deepcopy(proxy_list)

    while proxy_copy:
        rnd.shuffle(proxy_copy)
        proxy = proxy_copy.pop()
        try:
            response = rq.get(url, headers=headers, proxies={'https':proxy}, timeout=3)
            print(response.status_code)
        except NameError as error:
            print(error)
            continue

RE: Running A Loop Until You See A Particular Result - knight2000 - Sep-01-2021

Hi Menator01,

Thank you for taking the time to give me that detailed solution. I've never heard of the copy module, so that was interesting to see. I did try various other attempts with try and if statements, but I couldn't get them to work!

I tried your potential solution. It definitely continues on to the next proxy if the current one doesn't work, but the problem is that when it does find a working proxy (response = 200), it still continues to check every other proxy anyway.

So if my url was https://google.com and, let's say, I have 30 proxies that work, once the code finds the first successful proxy it will continue to hit google.com another 30 times, even though it found a proxy that worked earlier!

Essentially, the code needs to pick one random proxy and, if it's successful, go through with the request and stop. If the proxy it randomly picks is dead, it should keep looping until it finds a working proxy, run the request once and stop.

I'm wondering if it requires something like an if statement somewhere, or whether something else needs to change?
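
The behaviour described above (try proxies in random order, stop at the first one that works) can be sketched as a small helper that returns as soon as one attempt succeeds. This is only a minimal sketch, not anyone's posted solution: `first_success` and the caller-supplied `attempt` callable are hypothetical names introduced for illustration, and the wiring to `requests` is shown in comments and assumes `url`, `headers` and `proxy_list` are already defined.

```python
import random


def first_success(proxy_list, attempt, errors=(Exception,)):
    """Call attempt(proxy) for proxies in random order; stop at the first success.

    `attempt` is a hypothetical callable supplied by the caller. It should
    return a result for a working proxy, return None for an unusable
    response, or raise one of `errors` for a dead proxy.
    """
    candidates = list(proxy_list)  # copy, so the caller's list is not reordered
    random.shuffle(candidates)     # random order, each proxy tried at most once
    for proxy in candidates:
        try:
            result = attempt(proxy)
        except errors as error:
            print(f"{proxy} failed: {error!r}")
            continue               # dead proxy: try the next one
        if result is not None:
            return proxy, result   # stop at the first working proxy
    return None                    # every proxy failed


# Possible wiring with requests (assumes url, headers and proxy_list exist):
#
# def attempt(proxy):
#     response = requests.get(url, headers=headers,
#                             proxies={"https": proxy}, timeout=3)
#     return response if response.status_code == 200 else None
#
# found = first_success(proxy_list, attempt,
#                       errors=(requests.RequestException,))
```

Returning from inside the loop plays the same role as a break: once one proxy has produced a result, no further requests are sent.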

RE: Running A Loop Until You See A Particular Result - ibreeden - Sep-01-2021

(Sep-01-2021, 10:18 AM)knight2000 Wrote: but the problem is that when it does find a working proxy (response = 200), it still continues to check every other proxy anyway

Then add a break statement to exit the while loop after a successful connection.

And by the way, why do you want a random proxy? A dead proxy might be tried more than once. It seems better to try the proxies in sequence. You could even change the order so that the unsuccessful proxies are moved to the end.

RE: Running A Loop Until You See A Particular Result - DeaD_EyE - Sep-01-2021

Some improvements, error corrections, and info about URLs:

    import random
    import sys
    import time

    import requests

    # Take the right protocol
    # "testurl.com" is not a valid URL
    # "http://testurl.com" is valid
    url = "https://python-forum.io/thread-34789.html"

    # set headers
    # this was missing in the code example and this was causing the
    # NameError
    headers = {}

    # Proxies must also start with http:// or https://
    proxies = [
        "http://173.68.59.131:3128",
        "http://64.124.38.139:8080",
        "http://69.197.181.202:3128",
    ]

    random.shuffle(proxies)
    result = None

    for proxy in proxies:
        try:
            response = requests.get(
                url, headers=headers, proxies={"https": proxy}, timeout=3
            )
        except (requests.ReadTimeout, requests.ConnectionError):
            print("Got timeout", file=sys.stderr)
            continue
        except Exception as e:
            print("Contact the programmer", repr(e), file=sys.stderr)
        else:
            print(response.status_code, file=sys.stderr)
            # be a good shell citizen
            # don't print debugging data to stdout
            if response.status_code == 200:
                result = response.text
                # break out of loop if result was found
                break

    if result is None:
        print("No success", file=sys.stderr)
    else:
        time.sleep(2)
        print(result)

RE: Running A Loop Until You See A Particular Result - knight2000 - Sep-04-2021

(Sep-01-2021, 12:15 PM)ibreeden Wrote: Then add a break statement to exit the while loop after a successful connection.

Thanks, ibreeden. The reason for a random proxy is that I watched several videos and read several blog posts, and a few mentioned that it's better practice to randomize your list to avoid potential footprints: recommended, but not required. You're right about it trying a dead proxy more than once, and in my testing over the last few days I've noticed that proxies can die within minutes, so the pool is constantly changing.

RE: Running A Loop Until You See A Particular Result - knight2000 - Sep-04-2021

(Sep-01-2021, 02:20 PM)DeaD_EyE Wrote:

Hi DeaD_EyE,
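
ibreeden's suggestion earlier in the thread (try the proxies in sequence and move the ones that fail to the end) could look roughly like the sketch below. It is an illustration under assumptions, not code from the thread: `fetch_in_sequence` and the caller-supplied `attempt` callable are hypothetical names, and `attempt(proxy)` is assumed to return a result for a working proxy and raise for a dead one (for example by wrapping `requests.get` and catching `requests.RequestException`).

```python
from collections import deque


def fetch_in_sequence(pool, attempt, errors=(Exception,)):
    """Try proxies from the front of `pool`; move each one that fails to the back.

    `pool` is a collections.deque of proxy strings and is modified in place.
    `attempt` is a hypothetical caller-supplied callable: it returns a result
    for a working proxy and raises one of `errors` for a dead one.
    """
    for _ in range(len(pool)):         # at most one try per proxy per call
        proxy = pool[0]
        try:
            return attempt(proxy)      # a working proxy stays at the front
        except errors as error:
            print(f"{proxy} failed: {error!r}")
            pool.rotate(-1)            # move the failed proxy to the end
    return None                        # every proxy failed this round
```

Because the pool keeps its order between calls, a proxy that just worked is tried first next time and recently failed ones are tried last. That trades away the randomness knight2000 wanted for fewer wasted attempts; shuffling the deque once before the first call would keep some of both.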