Python Forum

Full Version: Creating a simple rate limiter.
I have been doing some stuff with Reddit for a while now. I haven't used an API key with it, since I want to make the program as simple to set up as possible. Of course, that means I'm getting seriously rate limited. This is probably caused by Reddit's API blocking user agents like 'Python/urllib'. I found Reddit's API rules, and obviously I must follow them. (Some things, like changing the user agent, I have already done.) The hardest part is adding a rate limiter to my program: it was never designed to have one, so I'm having to bodge one in around my already working program. It wasn't too hard to make, but it doesn't quite work. Here's a simplified version of my main script, with the rate limiter:
import time
from threading import Thread

def counter(func): #counts how many times function was called
	def inner1(*args, **kwargs):
		inner1.calls += 1
		return func(*args, **kwargs)
	inner1.calls = 0
	inner1.__name__= func.__name__
	return inner1

class Links():
	secs = 0
	isLimited = False
	reddit_calls = 0
	
	def create_timer():
		thread = Thread(target = Links.timer)
		thread.start()

	def timer():
		while Links.secs != 60:
			time.sleep(2)
			Links.secs += 1
			if (Links.secs >= 60 and not Links.isLimited):
				Links.secs = 0

	def rate_limit():
		Links.isLimited = True
		Links.reddit_calls = 0
		Links.timer() #waits 60 secs without resetting
		Links.isLimited = False
		thread = Thread(target = Links.timer)
		thread.start()

	def parse_link(link):
		if(link == "reddit" and Links.reddit_calls < 60 and Links.secs < 60):
			if(not Links.isLimited):
				print(Links.Reddit.reddit_url(link), Links.reddit_calls,)
		elif(Links.reddit_calls >= 60 and Links.secs <= 60):
			print("rate limited!!")
			Links.rate_limit()
		elif(link == "somethingelse"):
			print(link)

	class Reddit():
		@counter
		def reddit_url(parsed):
			#do some stuff with the link
			Links.reddit_calls = Links.Reddit.reddit_url.calls
			return parsed

Links.create_timer()

for i in range(65):
	Links.parse_link("reddit")
	Links.parse_link("somethingelse")
The output looks like this:
Output:
reddit 1
somethingelse
reddit 2
somethingelse
reddit 3
somethingelse
reddit 4
somethingelse
reddit 5
somethingelse
reddit 6
somethingelse
reddit 7
somethingelse
reddit 8
somethingelse
reddit 9
somethingelse
reddit 10
somethingelse
reddit 11
somethingelse
reddit 12
somethingelse
...... #all the way up until 60
rate limited!!!!
What it should do:
Output:
reddit 1
somethingelse
reddit 2
somethingelse
reddit 3
somethingelse
reddit 4
somethingelse
reddit 5
somethingelse
reddit 6
somethingelse
reddit 7
somethingelse
reddit 8
somethingelse
reddit 9
somethingelse
reddit 10
somethingelse
reddit 11
somethingelse
reddit 12
somethingelse
...... #all the way up until 60
rate limited!!!!
somethingelse
somethingelse #it should print 'somethingelse' 5 more times
somethingelse
somethingelse
somethingelse
The main difference is that the threading isn't working. I basically need to process the reddit links on their own thread. At the moment, if the reddit links get rate limited, then every link is rate limited. I need it to rate limit just the reddit links, which is why 'somethingelse' should appear 5 more times: it isn't a reddit link, so it shouldn't get rate limited.
What do I need to change to fix this?
(Threading may not be the best approach but really it's probably the easiest way to implement it without having to rewrite the whole 'Links' class.)
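For illustration, here is a minimal sketch of the "reddit links on their own thread" idea, written outside the 'Links' class. The names RedditWorker, submit and wait_until_done are hypothetical and not part of the script above. It assumes the reddit work can be handed to a queue, so that only the worker thread sleeps when the limit is hit. Note that it simply pauses after every max_calls calls without checking how much time has actually passed, which is stricter than strictly necessary.

```python
import queue
import threading
import time


class RedditWorker:
    """Processes reddit links on a dedicated thread (hypothetical helper)."""

    def __init__(self, max_calls=60, period=60.0, sleep=time.sleep):
        self.max_calls = max_calls  # calls allowed before pausing
        self.period = period        # length of the pause in seconds
        self.sleep = sleep          # injectable so the pause can be shortened
        self.results = []           # (link, call number) pairs, in order
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, link):
        # Returns immediately, so the caller is never blocked by the limit.
        self._queue.put(link)

    def _run(self):
        calls = 0
        while True:
            link = self._queue.get()
            if calls >= self.max_calls:
                # Only this worker thread waits; other links keep flowing.
                self.sleep(self.period)
                calls = 0
            calls += 1
            # Stand-in for "do some stuff with the link".
            self.results.append((link, calls))
            self._queue.task_done()

    def wait_until_done(self):
        # Blocks until every submitted link has been processed.
        self._queue.join()
```

With something like this, the main loop would call worker.submit("reddit") and then print 'somethingelse' straight away, so all 65 'somethingelse' lines appear even while the last five reddit links are waiting out the pause.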

EDIT: I should probably do my best to explain what the script does.
It starts by creating a 'timer' thread. This timer thread runs a while loop that counts for 60 seconds, then resets back to 0 if it hits 60.
Every time the 'reddit_url' function is called, it adds 1 to the call count.
The rate limiting occurs if the function has been called 60 times in 60 seconds or less. If it hits this limit, the function 'rate_limit' is called. This sets the 'isLimited' variable to True, to stop the 'reddit_url' function from running (since I set the call count back to 0). It then calls the 'timer' function again. This counts up for 60 seconds, but rather than resetting to 0, it just returns (since 'isLimited' is True, it won't pass the if statement).
Once the 60 seconds are up, 'isLimited' becomes False, and it creates the timer thread again to start the whole process over.
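
The "60 calls in 60 seconds" check described above could also be sketched without a timer thread, by recording a timestamp for each call and counting how many fall inside the last 60 seconds. This is a different technique (a sliding window) from the counter-plus-timer approach in the script. The class name SlidingWindowLimiter and the clock parameter are assumptions made for the example.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allows at most max_calls calls within any period-second window."""

    def __init__(self, max_calls=60, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock      # injectable so the logic can be checked quickly
        self._stamps = deque()  # timestamps of the calls that were allowed

    def allow(self):
        now = self.clock()
        # Forget calls that have fallen out of the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            self._stamps.append(now)
            return True
        # Limit reached: the caller decides whether to skip or retry later.
        return False
```

Because allow() never sleeps, a function like 'parse_link' could check it only for reddit links and print "rate limited!!" when it returns False, leaving every other link unaffected.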