Python Forum
Problem installing instaloader - Printable Version


RE: Problem installing instaloader - ledgreve - Nov-05-2019

@snippsat
I used the code you sent me and it worked, but as soon as I changed the hashtag or the datetime values, it stopped working. Let's say I want all posts between 1 Jan. 2007 and 31 Dec. 2017 with the hashtag "bookerprize":

from datetime import datetime
from itertools import dropwhile, takewhile
 
import instaloader
 
L = instaloader.Instaloader()
 
posts = L.get_hashtag_posts('bookerprize')
# or
# posts = instaloader.Profile.from_username(L.context, PROFILE).get_posts()
 
SINCE = datetime(2007, 1, 1)
UNTIL = datetime(2017, 12, 31)
 
for post in takewhile(lambda p: p.date > UNTIL, dropwhile(lambda p: p.date > SINCE, posts)):
    print(post.date)
    L.download_post(post, '#bookerprize')
but it doesn't do anything. It takes an eternity for the "In [*]:" to change to "In [2]:" and when it does I get the following error:

Error:
JSON Query to explore/tags/bookerprize/: 429 Too Many Requests [retrying; skip with ^C]
JSON Query to explore/tags/bookerprize/: 429 Too Many Requests [retrying; skip with ^C]
JSON Query to explore/tags/bookerprize/: HTTP error code 502. [retrying; skip with ^C]
---------------------------------------------------------------------------
TooManyRequestsException                  Traceback (most recent call last)
~\Anaconda\lib\site-packages\instaloader\instaloadercontext.py in get_json(self, path, params, host, session, _attempt)
    373             if resp.status_code == 429:
--> 374                 raise TooManyRequestsException("429 Too Many Requests")
    375             if resp.status_code != 200:

TooManyRequestsException: 429 Too Many Requests

During handling of the above exception, another exception occurred:

TooManyRequestsException                  Traceback (most recent call last)
~\Anaconda\lib\site-packages\instaloader\instaloadercontext.py in get_json(self, path, params, host, session, _attempt)
    373             if resp.status_code == 429:
--> 374                 raise TooManyRequestsException("429 Too Many Requests")
    375             if resp.status_code != 200:

TooManyRequestsException: 429 Too Many Requests

During handling of the above exception, another exception occurred:

TooManyRequestsException                  Traceback (most recent call last)
~\Anaconda\lib\site-packages\instaloader\instaloadercontext.py in get_json(self, path, params, host, session, _attempt)
    373             if resp.status_code == 429:
--> 374                 raise TooManyRequestsException("429 Too Many Requests")
    375             if resp.status_code != 200:

TooManyRequestsException: 429 Too Many Requests

The above exception was the direct cause of the following exception:

ConnectionException                       Traceback (most recent call last)
<ipython-input-11-afa213953ceb> in <module>
     13 UNTIL = datetime(2008, 12, 31)
     14
---> 15 for post in takewhile(lambda p: p.date > UNTIL, dropwhile(lambda p: p.date > SINCE, posts)):
     16     print(post.date)
     17     L.download_post(post, '#bookerprize')

~\Anaconda\lib\site-packages\instaloader\instaloader.py in get_hashtag_posts(self, hashtag)
    831         params = {'__a': 1}
    832         hashtag_data = self.context.get_json('explore/tags/{0}/'.format(hashtag),
--> 833                                              params)['graphql']['hashtag']['edge_hashtag_to_media']
    834         yield from (Post(self.context, edge['node']) for edge in hashtag_data['edges'])
    835         has_next_page = hashtag_data['page_info']['has_next_page']

~\Anaconda\lib\site-packages\instaloader\instaloadercontext.py in get_json(self, path, params, host, session, _attempt)
    411                 if is_iphone_query and isinstance(err, TooManyRequestsException):
    412                     self._ratecontrol_graphql_query('iphone', untracked_queries=True)
--> 413                 return self.get_json(path=path, params=params, host=host, session=sess, _attempt=_attempt + 1)
    414             except KeyboardInterrupt:
    415                 self.error("[skipped by user]", repeat_at_end=False)

~\Anaconda\lib\site-packages\instaloader\instaloadercontext.py in get_json(self, path, params, host, session, _attempt)
    411                 if is_iphone_query and isinstance(err, TooManyRequestsException):
    412                     self._ratecontrol_graphql_query('iphone', untracked_queries=True)
--> 413                 return self.get_json(path=path, params=params, host=host, session=sess, _attempt=_attempt + 1)
    414             except KeyboardInterrupt:
    415                 self.error("[skipped by user]", repeat_at_end=False)

~\Anaconda\lib\site-packages\instaloader\instaloadercontext.py in get_json(self, path, params, host, session, _attempt)
    404             error_string = "JSON Query to {}: {}".format(path, err)
    405             if _attempt == self.max_connection_attempts:
--> 406                 raise ConnectionException(error_string) from err
    407             self.error(error_string + " [retrying; skip with ^C]", repeat_at_end=False)
    408             try:

ConnectionException: JSON Query to explore/tags/bookerprize/: 429 Too Many Requests
and I get no actual output. I have never had this problem with the command-line version, but to be sure I tried a different hashtag that I know has far fewer posts. Either I got the same error message or I got nothing at all: no output, no error message. I tried to read their "troubleshooting" essay, but I don't understand it; I think my English may not be good enough for the specific terminology used. Would you please help and explain it to me? How can I solve this problem?
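
A note on the date filter in the code above, as a sketch rather than a confirmed diagnosis: it assumes the posts arrive newest first, which the thread does not confirm for hashtag feeds. Under that assumption, with SINCE earlier than UNTIL, dropwhile(lambda p: p.date > SINCE, ...) discards every post newer than 2007 before takewhile sees anything, so the whole feed has to be fetched and nothing is printed. FakePost and the sample dates are made up for illustration; no requests to Instagram are made.

```python
from collections import namedtuple
from datetime import datetime
from itertools import dropwhile, takewhile

# Hypothetical stand-in for instaloader's Post objects; only .date is used.
FakePost = namedtuple("FakePost", "date")

# Made-up sample feed, ordered newest first (an assumption).
posts = [FakePost(datetime(y, 6, 1)) for y in (2019, 2018, 2017, 2015, 2010, 2006)]

SINCE = datetime(2007, 1, 1)
UNTIL = datetime(2017, 12, 31)

# Filter as posted: skips everything newer than SINCE, then stops at once.
as_posted = list(takewhile(lambda p: p.date > UNTIL,
                           dropwhile(lambda p: p.date > SINCE, posts)))

# Swapped bounds: skip posts newer than UNTIL, keep going until one is older than SINCE.
swapped = list(takewhile(lambda p: p.date >= SINCE,
                         dropwhile(lambda p: p.date > UNTIL, posts)))

print(len(as_posted))                  # 0
print([p.date.year for p in swapped])  # [2017, 2015, 2010]
```

If the real feed is not strictly ordered by date, neither variant is reliable, and every post would have to be checked individually.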


RE: Problem installing instaloader - buran - Nov-05-2019

maybe check
https://instaloader.github.io/troubleshooting.html
https://github.com/instaloader/instaloader/issues/128
https://github.com/instaloader/instaloader/issues/392


RE: Problem installing instaloader - ledgreve - Nov-05-2019

@buran
Thank you for sending me this, but the problem is that I don't really understand what it means. I am very new to programming and I don't understand most of the terminology yet. I especially don't understand why I am having problems now, even when trying to extract the posts containing a hashtag which I know is used in only 17 posts, because I didn't have any problem extracting about 900 posts by using instaloader in the command line instead of as a Python module.


RE: Problem installing instaloader - buran - Nov-05-2019

In simple terms: you are about to exceed the rate limit set by Instagram. Pay attention to the red text in the quote:

Quote: 429 - Too Many Requests

Instaloader has a logic to keep track of its requests to Instagram and to obey their rate limits. Since they are nowhere documented, we try them out experimentally. We have a daily cron job running to confirm that Instaloader still stays within the rate limits. Nevertheless, the rate control logic assumes that

· at one time, Instaloader is the only application that consumes requests. I.e. neither the Instagram browser interface, nor a mobile app, nor another Instaloader instance is running in parallel,
· no requests had been consumed when Instaloader starts.

The latter one implies that restarting or reinstantiating Instaloader often within short time is prone to cause a 429. When a request is denied with a 429, Instaloader retries the request as soon as the temporary ban is assumed to be expired. In case the retry continuously fails for some reason, which should not happen in normal conditions, consider adjusting the --max-connection-attempts option.

“Too many queries in the last time” is not an error. It is a notice that the rate limit has almost been reached, according to Instaloader’s own rate accounting mechanism. We regularly adjust this mechanism to match Instagram’s current rate limiting.
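
The retry behaviour described in the quote, and visible in the traceback earlier in the thread (get_json calling itself with _attempt + 1 until max_connection_attempts is reached), can be pictured roughly like this. It is only a sketch of the general pattern, not Instaloader's code: TooManyRequests, fetch_with_retries and flaky_fetch are hypothetical names, and the wait time is a placeholder.

```python
import time


class TooManyRequests(Exception):
    """Hypothetical stand-in for a 429 response."""


def fetch_with_retries(fetch, max_attempts=3, wait_seconds=0.0, sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts calls have failed."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except TooManyRequests as err:
            if attempt == max_attempts:
                # Out of attempts: give up and report the last error.
                raise ConnectionError(f"gave up after {attempt} attempts") from err
            sleep(wait_seconds)  # wait before the next attempt


# Made-up fetch function that fails twice and then succeeds.
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TooManyRequests("429 Too Many Requests")
    return "ok"


result = fetch_with_retries(flaky_fetch, max_attempts=3)
print(result, calls["n"])  # ok 3
```

With max_attempts=2 the same flaky_fetch would end in a ConnectionError, which mirrors the ConnectionException at the bottom of the traceback.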



RE: Problem installing instaloader - ledgreve - Nov-05-2019

@buran

Thank you!
Then how do I know what their rate limit is, and why don't I ever have this problem when using the command line? And what do they mean by "restarting or reinstantiating Instaloader"? Is that every time I open my Jupyter Notebook in which the instaloader module was installed? Or does this mean re-running the cell used to install instaloader in the notebook?


RE: Problem installing instaloader - baquerik - Nov-05-2019

I have no clue about instaloader or how it works, but here are a couple of ideas:

· To find their limit, you could add a counter and see after how many iterations you get this error message.
· If they have a limit based on time (let's say 1000 per 10 seconds), you could use sleep() so that more time passes between iterations.
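
The sleep() idea could look something like the sketch below. throttled is a hypothetical helper, not part of instaloader, and the delay value is a guess; it only shows the pattern of pausing after each item before the next one is fetched.

```python
import time


def throttled(items, delay_seconds, sleep=time.sleep):
    """Yield items one by one, pausing after each before fetching the next."""
    for item in items:
        yield item
        sleep(delay_seconds)


# Demo with plain strings instead of posts, and a very short delay.
counter = 0
for item in throttled(["a", "b", "c"], delay_seconds=0.01):
    counter += 1

print(counter)  # 3
```

In the loop from this thread, the posts iterator would be wrapped the same way, for example throttled(posts, delay_seconds=...). Whether one iteration corresponds to one request to Instagram is not clear from the thread, so the right delay would have to be found by trial.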


RE: Problem installing instaloader - ledgreve - Nov-05-2019

@baquerik
Thank you for the tip! How can I set up a counter?
I just started using Python, so I am rather new to programming and using code.


RE: Problem installing instaloader - baquerik - Nov-05-2019

Well, there are many ways of doing it. You could simply update a variable in every iteration and print its value when you catch the error:

counter = 0  # number of posts downloaded so far

try:
    for post in takewhile(lambda p: p.date > UNTIL, dropwhile(lambda p: p.date > SINCE, posts)):
        print(post.date)
        L.download_post(post, '#bookerprize')
        counter += 1

except Exception as err:
    # Catch Exception instead of using a bare except, so Ctrl+C can still stop the loop
    print(f"Loop failed after {counter} iterations: {err}")

else:
    print("Loop finished with no errors!")



RE: Problem installing instaloader - buran - Nov-05-2019

(Nov-05-2019, 10:27 AM)ledgreve Wrote: Then how do I know what their rate limit is
Please read carefully:
Quote: Instaloader has a logic to keep track of its requests to Instagram and to obey their rate limits. Since they are nowhere documented, we try them out experimentally.

(Nov-05-2019, 10:27 AM)ledgreve Wrote: Is that every time I open my Jupyter Notebook in which the instaloader module was installed? Or does this mean re-running the cell used to install instaloader in the notebook?
I don't use Jupyter, so I don't know what happens when you open the notebook. It definitely happens when you rerun the cell. You also use another instance via cmd, which violates the assumptions they make when calculating the rate limit:
Quote: at one time, Instaloader is the only application that consumes requests. I.e. neither the Instagram browser interface, nor a mobile app, nor another Instaloader instance is running in parallel,

no requests had been consumed when Instaloader starts.
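
Why "reinstantiating" matters can be illustrated with a toy request counter. This is not Instaloader's actual rate control, just a sketch of the idea that a freshly created object starts counting from zero; RequestTracker and the numbers (5 requests per 60 seconds) are made up.

```python
from collections import deque


class RequestTracker:
    """Toy sliding-window counter; not Instaloader's actual rate control."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self.timestamps = deque()  # times of the requests this instance knows about

    def record(self, now):
        self.timestamps.append(now)

    def used(self, now):
        # Forget requests that have left the window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        return len(self.timestamps)

    def may_request(self, now):
        return self.used(now) < self.limit


# Made-up numbers: 5 requests allowed per 60 seconds.
first = RequestTracker(limit=5, window_seconds=60)
for t in range(5):
    first.record(now=t)
print(first.may_request(now=10))   # False: this instance knows the budget is used up

# Re-running the cell creates a fresh tracker that starts counting from zero,
# although the server still remembers the five earlier requests.
second = RequestTracker(limit=5, window_seconds=60)
print(second.may_request(now=10))  # True
```

The second tracker would happily send more requests right away, which is how a 429 can appear even for a hashtag with very few posts.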



RE: Problem installing instaloader - buran - Nov-05-2019

If you google it, there are various guesses about the rate limit. For example, as of Oct 2018 they suggest 200 per hour: https://support.sproutsocial.com/hc/en-us/articles/360018137651-Instagram-Rate-Limit-Change-Oct-2018-
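
To get a feeling for what 200 per hour would mean in practice, here is a bit of arithmetic. The figure is only the guess from the linked article, and treating each of the roughly 900 posts mentioned earlier in the thread as one request is an assumption; the function names are made up.

```python
def seconds_between_requests(requests_per_hour):
    """Average spacing needed to stay at or below the given hourly budget."""
    return 3600 / requests_per_hour


def hours_needed(total_requests, requests_per_hour):
    """Time to send total_requests without exceeding the hourly budget."""
    return total_requests / requests_per_hour


print(seconds_between_requests(200))  # 18.0
print(hours_needed(900, 200))         # 4.5
```

So under these assumptions one request every 18 seconds would stay within the limit, and 900 requests would take about four and a half hours.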