Python Forum
Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
itertools and amazing speed
#1
From the thread about getting random sets of image names we had 24 image names:

images = ['im1', 'im2', 'im3', 'im4', 'im5', 'im6', 'im7', 'im8',
'im9', 'im10', 'im11', 'im12', 'im13', 'im14', 'im15',
'im16', 'im17', 'im18','im19','im20','im21','im22','im23','im24']

Something funny is going on with itertools:

If I do this:

perm_set = itertools.permutations(images,9)
This completes in the time it takes to press and release the enter key.

There are n!/(n-r)! permutations for any given n and r.

Set n = 24, r = 9, we have 24!/15! = 474 467 051 520 permutations.

Has my computer really created nearly 500 trillion 9-string tuples in milliseconds? Seems unlikely!

I don't know what goes on behind the scenes in Python, I believe a lot of modules are written in C, but that still seems extremely fast!

My laptop is fast with 2 Ryzen R7 processors, but that beggars belief!

Has it really made 500 trillion tuples? Has itertools created a generator and nothing else??
Reply
#2
Yes, per the documentation for the module, the functions return iterators.
Pedroski55 likes this post
Reply
#3
perm_set is a permutations object, a generator like object that creates results on demand. To see how long this takes you could ask for a list. I doubt you have enough memory.
perm_set = list(itertools.permutations(images,9))
That an itertools.permutations object is a generator is mentioned in the documentation when they say permutations is "roughly equivalent to"
def permutations(iterable, r=None):
    # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    # permutations(range(3)) --> 012 021 102 120 201 210
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return
    indices = list(range(n))
    cycles = list(range(n, n-r, -1))
    yield tuple(pool[i] for i in indices[:r])  # <------ The first tuple.
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
            else:
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                yield tuple(pool[i] for i in indices[:r])   # <------ More tuples
                break
        else:
            return
Pedroski55 likes this post
Reply
#4
Thanks!

I didn't try to make a list of itertools.permutations(images,9), but I did run:

for i in perm_set:
    print(i)
After I'd had breakfast it still hadn't finished, so I did ctrl + c. I hadn't looked at how many permutations 9 from 24 gives!

So how much memory will an iterator or generator object like that occupy, can you give an indication?

@dean Thanks for the function. It will take me a while to digest!
Reply
#5
(Nov-07-2022, 11:07 PM)Pedroski55 Wrote: So how much memory will an iterator or generator object like that occupy, can you give an indication?

The generator is code. Each time it is called, it runs code to create the next item. So it only needs enough memory to maintain the state where it needs to pick up the next time.

Unless you're storing all the output somewhere, this doesn't create any memory pressure. Previously calculated items are discarded.

Attempting to store all the output items in a list or similar (like myperms = list(perm_set)) is going to take a huge amount of memory.
Pedroski55 and Gribouillis like this post
Reply
#6
The number of permutations of 9 from 24 is enormous, but the number of combinations is much less. As a list, it is < 12MB.

com_set = itertools.combinations(images, 9)
# creates a list of tuples
mylist = list(com_set)
len(mylist) # 1307504
print(sys.getsizeof(mylist))
Output:
>>> type(com_set) <class 'itertools.combinations'> >>> print(sys.getsizeof(com_set)) 144 >>>
What I'm wondering is however:

Is there some way to interfere with the value of the next pointer in the class com_set?

Some way to pass it a random number as the stored value?
Reply
#7
(Nov-10-2022, 12:49 AM)Pedroski55 Wrote: Is there some way to interfere with the value of the next pointer in the class com_set?
This is a slightly different problem, but I implemented a Combinations class with random access, which allows to access directly the n-th element of the sequence, even with very large numbers. I wonder why this feature is not included in itertools.combinations()
Pedroski55 likes this post
Reply
#8
Impressive!

Thanks a lot, hope you don't mind me copying it.

I can't say I understand it yet, but I'll look at it until I do!
Reply
#9
(Nov-10-2022, 11:15 PM)Pedroski55 Wrote: Thanks a lot, hope you don't mind me copying it.
I added the MIT License, so that you can freely copy it and modify it.
Reply


Possibly Related Threads…
Thread Author Replies Views Last Post
  What happens to a <itertools.permutations object at 0x7fe3cc66af68> after it is read? Pedroski55 3 2,417 Nov-29-2020, 08:35 AM
Last Post: DeaD_EyE
  Making lists using itertools and eliminating duplicates. mike3891 2 2,253 Oct-26-2020, 05:39 PM
Last Post: bowlofred
  Python3 itertools.groupby printing the key august 1 2,097 Aug-17-2020, 05:46 AM
Last Post: bowlofred
  Generate Cartesian Products with Itertools Incrementally CoderMan 2 1,856 Jun-04-2020, 04:51 PM
Last Post: CoderMan
  Python module speed or python speed in general Enrique6 1 1,852 May-04-2020, 06:21 PM
Last Post: micseydel
  itertools.zip_shortest() fo unequal iterators Skaperen 10 6,757 Dec-27-2019, 12:17 AM
Last Post: Skaperen
  can itertools compact a list removing all of some value? Skaperen 6 3,176 Sep-02-2019, 03:19 AM
Last Post: Skaperen
  Help with itertools jarrod0987 1 1,820 Jun-10-2019, 02:41 AM
Last Post: Larz60+
  ImportError: No module named jaraco.itertools in Python manhnt 0 2,991 Nov-08-2018, 11:41 AM
Last Post: manhnt
  Need help with itertools.islice() Charles1 2 2,843 Sep-19-2018, 10:32 AM
Last Post: Charles1

Forum Jump:

User Panel Messages

Announcements
Announcement #1 8/1/2020
Announcement #2 8/2/2020
Announcement #3 8/6/2020