Python Forum

I'm trying to see the runtime of s.intersection(t) that takes two sets s and t and returns a new set with all the elements that occur in both s and t.

So, for example, I want to try it with |s| = 1000 and |t| = 1000.

I tried

import timeit
for x in range(1000):
    t = timeit.Timer("s.intersection(t)", "s = set(1000)", "t = set(1000)")

t.timeit()

I keep getting errors no matter what variation I try of Timer(). Any ideas?

s = set(1000)

This code is not runnable: set() expects an iterable, so set(1000) raises a TypeError.

(Jan-28-2018, 02:16 AM)egslava Wrote: [ -> ]
 s = set(1000) 
This code is not runnable.

How do I fix this? I really can't find any good examples of the timeit module used with intersection.
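A minimal sketch of one way to repair the snippet above, not necessarily the only fix: set() needs an iterable, so set(range(1000)) stands in for set(1000), and all setup code goes into a single setup string, because the third positional argument of timeit.Timer is the timer function rather than a second setup line. The set contents and number=10000 are illustrative assumptions.

```python
import timeit

# Build both sets once in the setup string; only the statement is timed.
setup = "s = set(range(1000)); t = set(range(1000))"
timer = timeit.Timer("s.intersection(t)", setup=setup)

# number=10000 is an arbitrary illustrative choice; timeit() returns the
# total time in seconds for all 10000 calls.
total = timer.timeit(number=10000)
print(f"{total / 10000 * 1e6:.2f} microseconds per call")
```

The loop over range(1000) in the original is not needed, since timeit repeats the statement itself.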

%%timeit

import numpy as np
NUM_ELEMS = 10
SCALE = 1000
s1 = set( (np.random.rand(NUM_ELEMS) * SCALE).astype('int') )
s2 = set( (np.random.rand(NUM_ELEMS) * SCALE).astype('int') )
s1.intersection(s2)

Quote: The slowest run took 4.28 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 20.8 µs per loop

This code creates two arrays of 10 random items, each drawn from the range [0, 1000), then converts them to sets (so a set can end up with fewer than 10 elements, since duplicates are removed), and finally calculates their intersection.

It's all under %%timeit cell magic, so timeit will give you the "benchmark" result :)
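Because the cell above also imports NumPy and builds both sets inside the timed code, the reported figure covers more than the intersection. A hedged sketch of timing only the intersection in plain Python follows; it swaps the NumPy calls for the standard-library random module (an assumption made to keep the example self-contained), and number=100000 is an arbitrary choice.

```python
import random
import timeit

NUM_ELEMS = 10
SCALE = 1000

# Stand-in for the NumPy calls: draw NUM_ELEMS random ints in [0, SCALE).
# Duplicates collapse, so each set holds at most NUM_ELEMS elements.
s1 = {random.randrange(SCALE) for _ in range(NUM_ELEMS)}
s2 = {random.randrange(SCALE) for _ in range(NUM_ELEMS)}

# globals= lets the timed statement see s1 and s2, so building the sets
# stays outside the measurement.
elapsed = timeit.timeit(
    "s1.intersection(s2)", globals={"s1": s1, "s2": s2}, number=100000
)
print(f"{elapsed / 100000 * 1e9:.0f} ns per intersection")
```

In IPython, building the sets in one cell and running %timeit s1.intersection(s2) in the next should have a similar effect.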

(Jan-28-2018, 02:25 AM)egslava Wrote: [ -> ]
%%timeit
import numpy as np
NUM_ELEMS = 10
SCALE = 1000
s1 = set( (np.random.rand(NUM_ELEMS) * SCALE).astype('int') )
s2 = set( (np.random.rand(NUM_ELEMS) * SCALE).astype('int') )
s1.intersection(s2)
The slowest run took 4.28 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 20.8 µs per loop
This code creates two arrays of 1000 random elements, then make them sets (so the set size can be less than 1000, since duplicated elements are removed), then, it calculates their intersection.

What if I want |s| = 1000 and |t| = 10,000?

One of the simplest ways, besides the %%timeit cell magic, is to put everything in a string.

import timeit

np_test = '''\
import numpy as np

NUM_ELEMS = 10
SCALE = 1000
s1 = set( (np.random.rand(NUM_ELEMS) * SCALE).astype('int') )
s2 = set( (np.random.rand(NUM_ELEMS) * SCALE).astype('int') )
s1.intersection(s2)
'''

print(timeit.Timer(stmt=np_test).timeit(number=100000))

Output:
3.0972112079055183

How does that take the intersection into consideration?

To be honest, I can't really understand your question, what is 't' and what is 's'?

You mean that one set should have 1000 random elements and the other 10,000? Then like this:

In [1]: %%timeit
   ...:
   ...: import numpy as np
   ...: SCALE = 1000
   ...: s1 = set( (np.random.rand(1000) * SCALE).astype('int') )
   ...: s2 = set( (np.random.rand(10000) * SCALE).astype('int') )
   ...: s1.intersection(s2)
   ...:
The slowest run took 748.40 times longer than the fastest. This could mean that
an intermediate result is being cached.
1 loop, best of 3: 1.28 ms per loop
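One caveat with the session above: with SCALE = 1000 there are only 1000 possible values, so neither set can exceed 1000 elements and duplicates shrink them further. If exact sizes of 1000 and 10,000 are wanted, a sketch along these lines may help. UNIVERSE is a hypothetical parameter, and the repeat and number values are arbitrary.

```python
import random
import timeit

# Hypothetical universe size; it only needs to exceed the larger set.
UNIVERSE = 100_000

# random.sample draws without replacement, so the set sizes are exact.
s = set(random.sample(range(UNIVERSE), 1000))
t = set(random.sample(range(UNIVERSE), 10_000))

# repeat() returns one total per run; the minimum is usually the least
# noisy figure to report.
runs = timeit.repeat(
    "s.intersection(t)", globals={"s": s, "t": t}, repeat=5, number=1000
)
best = min(runs) / 1000
print(f"|s|={len(s)}, |t|={len(t)}, best: {best * 1e6:.2f} microseconds per call")
```

Only the intersection is inside the timed statement here, so the figure is not directly comparable with the 1.28 ms above, which also includes the import and the set construction.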

(Jan-28-2018, 03:43 AM)Miraclefruit Wrote: [ -> ]How does that take the intersection into consideration?

What the code does doesn't matter for what I showed; the point was to show an easy way to run timeit.
Creating a tuple literal is faster than creating a list literal, and here the number of executions is boosted to 100000000 ;)

import timeit

tuple_test = '''\
t = (1,2,3,4,5,6,7,8,9,10)
'''

list_test = '''\
t = [1,2,3,4,5,6,7,8,9,10]
'''

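# Time both snippets so the tuple and list results can be compared.
print(timeit.Timer(stmt=tuple_test).timeit(number=100000000))
print(timeit.Timer(stmt=list_test).timeit(number=100000000))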

Got it! Thanks.
Yeah, they're just two different sets.

Posted by (in thread order): Miraclefruit, egslava, Miraclefruit, egslava, Miraclefruit, snippsat, Miraclefruit, egslava, snippsat, Miraclefruit