Python Forum
Problem installing instaloader
#11
@snippsat
I used the code you sent me and it worked, but as soon as I changed the hashtag or the dates, it stopped working. Let's say I want all posts between 1 Jan 2007 and 31 Dec 2017 with the hashtag "bookerprize":

from datetime import datetime
from itertools import dropwhile, takewhile
 
import instaloader
 
L = instaloader.Instaloader()
 
posts = L.get_hashtag_posts('bookerprize')
# or
# posts = instaloader.Profile.from_username(L.context, PROFILE).get_posts()
 
SINCE = datetime(2007, 1, 1)
UNTIL = datetime(2017, 12, 31)
 
for post in takewhile(lambda p: p.date > UNTIL, dropwhile(lambda p: p.date > SINCE, posts)):
    print(post.date)
    L.download_post(post, '#bookerprize')
but it doesn't do anything. It takes an eternity for the "In [*]:" to change to "In [2]:" and when it does I get the following error:

Error:
JSON Query to explore/tags/bookerprize/: 429 Too Many Requests [retrying; skip with ^C]
JSON Query to explore/tags/bookerprize/: 429 Too Many Requests [retrying; skip with ^C]
JSON Query to explore/tags/bookerprize/: HTTP error code 502. [retrying; skip with ^C]
---------------------------------------------------------------------------
TooManyRequestsException                  Traceback (most recent call last)
~\Anaconda\lib\site-packages\instaloader\instaloadercontext.py in get_json(self, path, params, host, session, _attempt)
    373             if resp.status_code == 429:
--> 374                 raise TooManyRequestsException("429 Too Many Requests")
    375             if resp.status_code != 200:

TooManyRequestsException: 429 Too Many Requests

During handling of the above exception, another exception occurred:

TooManyRequestsException                  Traceback (most recent call last)
~\Anaconda\lib\site-packages\instaloader\instaloadercontext.py in get_json(self, path, params, host, session, _attempt)
    373             if resp.status_code == 429:
--> 374                 raise TooManyRequestsException("429 Too Many Requests")
    375             if resp.status_code != 200:

TooManyRequestsException: 429 Too Many Requests

During handling of the above exception, another exception occurred:

TooManyRequestsException                  Traceback (most recent call last)
~\Anaconda\lib\site-packages\instaloader\instaloadercontext.py in get_json(self, path, params, host, session, _attempt)
    373             if resp.status_code == 429:
--> 374                 raise TooManyRequestsException("429 Too Many Requests")
    375             if resp.status_code != 200:

TooManyRequestsException: 429 Too Many Requests

The above exception was the direct cause of the following exception:

ConnectionException                       Traceback (most recent call last)
<ipython-input-11-afa213953ceb> in <module>
     13 UNTIL = datetime(2008, 12, 31)
     14
---> 15 for post in takewhile(lambda p: p.date > UNTIL, dropwhile(lambda p: p.date > SINCE, posts)):
     16     print(post.date)
     17     L.download_post(post, '#bookerprize')

~\Anaconda\lib\site-packages\instaloader\instaloader.py in get_hashtag_posts(self, hashtag)
    831         params = {'__a': 1}
    832         hashtag_data = self.context.get_json('explore/tags/{0}/'.format(hashtag),
--> 833                                              params)['graphql']['hashtag']['edge_hashtag_to_media']
    834         yield from (Post(self.context, edge['node']) for edge in hashtag_data['edges'])
    835         has_next_page = hashtag_data['page_info']['has_next_page']

~\Anaconda\lib\site-packages\instaloader\instaloadercontext.py in get_json(self, path, params, host, session, _attempt)
    411             if is_iphone_query and isinstance(err, TooManyRequestsException):
    412                 self._ratecontrol_graphql_query('iphone', untracked_queries=True)
--> 413             return self.get_json(path=path, params=params, host=host, session=sess, _attempt=_attempt + 1)
    414         except KeyboardInterrupt:
    415             self.error("[skipped by user]", repeat_at_end=False)

~\Anaconda\lib\site-packages\instaloader\instaloadercontext.py in get_json(self, path, params, host, session, _attempt)
    411             if is_iphone_query and isinstance(err, TooManyRequestsException):
    412                 self._ratecontrol_graphql_query('iphone', untracked_queries=True)
--> 413             return self.get_json(path=path, params=params, host=host, session=sess, _attempt=_attempt + 1)
    414         except KeyboardInterrupt:
    415             self.error("[skipped by user]", repeat_at_end=False)

~\Anaconda\lib\site-packages\instaloader\instaloadercontext.py in get_json(self, path, params, host, session, _attempt)
    404             error_string = "JSON Query to {}: {}".format(path, err)
    405             if _attempt == self.max_connection_attempts:
--> 406                 raise ConnectionException(error_string) from err
    407             self.error(error_string + " [retrying; skip with ^C]", repeat_at_end=False)
    408             try:

ConnectionException: JSON Query to explore/tags/bookerprize/: 429 Too Many Requests
and I get no actual output. I have never had this problem with the command-line version. To be sure, I tried a different hashtag that I know has far fewer posts, but either I got the same error message or I got nothing at all: no output, no error message. I tried to read their "troubleshooting" page, but I don't understand it; I think my English may not be good enough for the specific terminology used. Would you please help and explain it to me? How can I solve this problem?
Reply
#12
Maybe check these:
https://instaloader.github.io/troubleshooting.html
https://github.com/instaloader/instaloader/issues/128
https://github.com/instaloader/instaloader/issues/392
If you can't explain it to a six year old, you don't understand it yourself, Albert Einstein
How to Ask Questions The Smart Way: link and another link
Create MCV example
Debug small programs

Reply
#13
@buran
Thank you for sending me this, but the problem is that I don't really understand what it means. I am very new to programming and I don't understand most of the terminology yet. I especially don't understand why I am having problems now, even with a hashtag that I know is used in only 17 posts, when I had no problem extracting about 900 posts with instaloader on the command line instead of as a Python module.
Reply
#14
In simple terms: you are running into the rate limit set by Instagram. Pay attention to the red text in the quote:

Quote: 429 - Too Many Requests

Instaloader has a logic to keep track of its requests to Instagram and to obey their rate limits. Since they are nowhere documented, we try them out experimentally. We have a daily cron job running to confirm that Instaloader still stays within the rate limits. Nevertheless, the rate control logic assumes that

· at one time, Instaloader is the only application that consumes requests. I.e. neither the Instagram browser interface, nor a mobile app, nor another Instaloader instance is running in parallel,
· no requests had been consumed when Instaloader starts.

The latter one implies that restarting or reinstantiating Instaloader often within short time is prone to cause a 429. When a request is denied with a 429, Instaloader retries the request as soon as the temporary ban is assumed to be expired. In case the retry continuously fails for some reason, which should not happen in normal conditions, consider adjusting the --max-connection-attempts option.

“Too many queries in the last time” is not an error. It is a notice that the rate limit has almost been reached, according to Instaloader’s own rate accounting mechanism. We regularly adjust this mechanism to match Instagram’s current rate limiting.
Reply
#15
@buran

Thank you!
Then how do I know what their rate limit is, and why don't I ever have this problem when using the command line? And what do they mean by "restarting or reinstantiating Instaloader"? Is that every time I open my Jupyter Notebook in which the instaloader module was installed? Or does this mean re-running the cell used to install instaloader in the notebook?
Reply
#16
I have no clue about instaloader or how it works, but here are a couple of ideas:

· To see their limit, you could add a counter and check after how many iterations you get this error message.
· If they have a limitation based on time (let's say 1000 per 10 seconds), you could use sleep() so that more time passes between the iterations.
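
A minimal sketch of both ideas together, not tied to instaloader: the generator below numbers the items it hands out and pauses between them. The names throttled and delay_seconds are made up for illustration, the list of letters stands in for the posts, and the right delay is a guess because Instagram does not publish its limits.

```python
import time


def throttled(items, delay_seconds):
    """Yield (count, item) pairs, sleeping delay_seconds between consecutive items."""
    for count, item in enumerate(items, start=1):
        if count > 1:
            time.sleep(delay_seconds)  # pause before every item except the first
        yield count, item


# Stand-in data; in the real loop this would be the iterable of posts.
for count, item in throttled(["a", "b", "c"], delay_seconds=0.01):
    print(count, item)
```

The count printed last tells you how far the loop got before any error. Keep in mind that a pause between iterations only slows things down roughly, since a library may make more than one request per iteration.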
Reply
#17
@baguerik
Thank you for the tip! How can I set up a counter?
I just started using Python, so I am rather new to programming and using code.
Reply
#18
Well, there are many ways of doing it. You could simply update a variable in every iteration and print its value when the error is caught:

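counter = 0

try:
    # posts, L, SINCE and UNTIL are the ones defined in your earlier code
    for post in takewhile(lambda p: p.date > UNTIL, dropwhile(lambda p: p.date > SINCE, posts)):
        print(post.date)
        L.download_post(post, '#bookerprize')
        counter += 1  # one more post handled without an error

except Exception as err:
    # Exception rather than a bare except, so that Ctrl+C still interrupts the loop
    print(f"Loop failed after {counter} iterations: {err}")

else:
    print("Loop finished with no errors!")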
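
A side note on the date filter in the loop above, shown with made-up data rather than real Instagram posts. It assumes the posts arrive newest first, which is what the takewhile/dropwhile pattern appears to rely on (an assumption, not verified here for hashtags). Under that assumption, dropwhile has to skip everything newer than the later date and takewhile has to stop at the earlier date. With SINCE = 2007 and UNTIL = 2017 the two bounds are the wrong way round, so dropwhile would skip every post newer than 2007 and nothing would be left to download. FakePost and posts_in_window are hypothetical names.

```python
from collections import namedtuple
from datetime import datetime
from itertools import dropwhile, takewhile

# Stand-in for a post object: only the date attribute matters here.
FakePost = namedtuple("FakePost", "date")


def posts_in_window(posts, newest, oldest):
    """Return posts with oldest < date <= newest from a newest-first iterable."""
    from_newest_bound = dropwhile(lambda p: p.date > newest, posts)  # skip too-new posts
    return takewhile(lambda p: p.date > oldest, from_newest_bound)   # stop at too-old posts


# Made-up feed, ordered newest first.
feed = [FakePost(datetime(year, 6, 1)) for year in (2019, 2018, 2017, 2015, 2012, 2006)]

window = list(posts_in_window(feed, newest=datetime(2017, 12, 31), oldest=datetime(2007, 1, 1)))
swapped = list(posts_in_window(feed, newest=datetime(2007, 1, 1), oldest=datetime(2017, 12, 31)))

print([p.date.year for p in window])   # [2017, 2015, 2012]
print([p.date.year for p in swapped])  # []
```

If the same holds for the real posts, the swapped bounds would also make the loop walk through the whole hashtag before giving up, which might explain both the long wait and the number of requests.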
Reply
#19
(Nov-05-2019, 10:27 AM)ledgreve Wrote: Then how do I know what their rate limit is
Please read this carefully:
Quote: Instaloader has a logic to keep track of its requests to Instagram and to obey their rate limits. Since they are nowhere documented, we try them out experimentally.

(Nov-05-2019, 10:27 AM)ledgreve Wrote: Is that every time I open my Jupyter Notebook in which the instaloader module was installed? Or does this mean re-running the cell used to install instaloader in the notebook?
I don't use Jupyter, so I don't know what happens when you open the notebook. It definitely applies when you rerun the cell. You also use another instance via cmd, which violates the assumptions they make when calculating the rate limit:
Quote: the rate control logic assumes that

· at one time, Instaloader is the only application that consumes requests. I.e. neither the Instagram browser interface, nor a mobile app, nor another Instaloader instance is running in parallel,
· no requests had been consumed when Instaloader starts.
Reply
#20
If you google it, there are various guesses about the rate limit. For example, as of Oct 2018 they suggest 200 per hour: https://support.sproutsocial.com/hc/en-u...e-Oct-2018-
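
Taking that 200-per-hour guess at face value (it is only a guess, and it may not apply to what instaloader does), the arithmetic for an even spacing between requests looks like this. The function name is made up for illustration.

```python
def seconds_between_requests(requests_per_hour):
    """Even spacing, in seconds, that uses up exactly the given hourly budget."""
    return 3600 / requests_per_hour


print(seconds_between_requests(200))  # 18.0
```

So a pause of roughly 18 seconds or more per request would stay under that guessed budget.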
Reply

