I'm trying to measure the runtime of s.intersection(t), which takes two sets s and t and returns a new set with all the elements that occur in both s and t.

So, for example, I want to try running it with |s| = 1000 and |t| = 1000.

I tried

import timeit
for x in range(1000):
    t = timeit.Timer("s.intersection(t)", "s = set(1000)", "t = set(1000)")
    t.timeit()

I keep getting errors no matter what variation I try of Timer(). Any ideas?

s = set(1000)

This code is not runnable: set() expects an iterable, so set(1000) raises a TypeError.

(Jan-28-2018, 02:16 AM) egslava wrote: s = set(1000)

This code is not runnable.

How do I fix this? I really can't find any good examples of the timeit module being used with intersection.
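One possible fix, as a minimal sketch. Two things go wrong in the original call: set(1000) fails because set() expects an iterable, and timeit.Timer accepts a single setup string (its third positional argument is the timer function, not a second setup). The sketch assumes that consecutive integers built with range() are acceptable set contents; the overlap between the two ranges and the number=10000 repetition count are illustrative choices, not taken from the post above.

```python
import timeit

# Build both sets in one setup string; statements are separated by ";".
# Assumption: consecutive integers are acceptable contents for the sets.
setup = "s = set(range(1000)); t = set(range(500, 1500))"

timer = timeit.Timer(stmt="s.intersection(t)", setup=setup)

# Total time in seconds for 10000 calls (the default is 1,000,000 calls).
elapsed = timer.timeit(number=10000)
print(elapsed)
```

Because the sets are created in setup, only the s.intersection(t) call itself is timed.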

%%timeit
import numpy as np
NUM_ELEMS = 10
SCALE = 1000
s1 = set( (np.random.rand(NUM_ELEMS) * SCALE).astype('int') )
s2 = set( (np.random.rand(NUM_ELEMS) * SCALE).astype('int') )
s1.intersection(s2)

The slowest run took 4.28 times longer than the fastest. This could mean that an intermediate result is being cached.

10000 loops, best of 3: 20.8 µs per loop

This code creates two arrays of 10 random integers, each drawn from the range [0, 1000), then converts them to sets (so a set can end up with fewer than 10 elements, since duplicates are removed), and finally calculates their intersection.

It all runs under the %%timeit cell magic (IPython/Jupyter), so timeit will give you the "benchmark" result :)
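The effect of duplicate removal mentioned above can be checked directly. This is a small sketch that uses the standard random module rather than NumPy (a substitution made here to keep the example dependency-free); the seed value and the number of draws are arbitrary choices.

```python
import random

random.seed(0)  # arbitrary seed, only to make the run repeatable

SCALE = 1000
NUM_ELEMS = 2000  # more draws than distinct values, so duplicates are certain

values = [random.randrange(SCALE) for _ in range(NUM_ELEMS)]
unique = set(values)

# The set can never hold more than SCALE distinct integers.
print(len(values), len(unique))
```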

(Jan-28-2018, 02:25 AM) egslava wrote:

%%timeit
import numpy as np
NUM_ELEMS = 10
SCALE = 1000
s1 = set( (np.random.rand(NUM_ELEMS) * SCALE).astype('int') )
s2 = set( (np.random.rand(NUM_ELEMS) * SCALE).astype('int') )
s1.intersection(s2)

The slowest run took 4.28 times longer than the fastest. This could mean that an intermediate result is being cached.

10000 loops, best of 3: 20.8 µs per loop

This code creates two arrays of 10 random integers, then converts them to sets (so a set can end up with fewer than 10 elements, since duplicates are removed), and finally calculates their intersection.

What if I want |s| = 1000 and |t| = 10,000?

One of the simplest ways, besides the %%timeit cell magic, is to put everything in a string.

import timeit
np_test = '''\
import numpy as np
NUM_ELEMS = 10
SCALE = 1000
s1 = set( (np.random.rand(NUM_ELEMS) * SCALE).astype('int') )
s2 = set( (np.random.rand(NUM_ELEMS) * SCALE).astype('int') )
s1.intersection(s2)
'''
print(timeit.Timer(stmt=np_test).timeit(number=100000))

Output:

3.0972112079055183

How does that take the intersection into consideration?

To be honest, I can't really understand your question. What are 't' and 's'?

You mean that one set should be built from 1000 random elements and the other from 10,000? Then something like this:

In [1]: %%timeit
...:
...: import numpy as np
...: SCALE = 1000
...: s1 = set( (np.random.rand(1000) * SCALE).astype('int') )
...: s2 = set( (np.random.rand(10000) * SCALE).astype('int') )
...: s1.intersection(s2)
...:
The slowest run took 748.40 times longer than the fastest. This could mean that
an intermediate result is being cached.
1 loop, best of 3: 1.28 ms per loop
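Two caveats about the cell above: it times the import and the set construction together with the intersection, and with SCALE = 1000 each set can hold at most 1000 distinct values, so the second set cannot actually reach 10,000 elements. One way to isolate the intersection is to build the sets first and time only the call. This is a sketch under stated assumptions: it uses the standard random module instead of NumPy, and the helper name time_intersection, the value range, and the repetition counts are hypothetical choices made for illustration.

```python
import random
import timeit


def time_intersection(size_s, size_t, value_range=100_000, number=1_000, repeat=5):
    """Return the best per-call time (in seconds) of s.intersection(t).

    The sets are built once, outside the timed statement.
    """
    # random.sample draws without replacement, so the sets have exactly
    # size_s and size_t elements (each size must be <= value_range).
    s = set(random.sample(range(value_range), size_s))
    t = set(random.sample(range(value_range), size_t))

    # Pass the prebuilt sets in through globals so only the call is timed.
    timer = timeit.Timer("s.intersection(t)", globals={"s": s, "t": t})

    # Keep the fastest of several runs, which is least affected by noise.
    best_total = min(timer.repeat(repeat=repeat, number=number))
    return best_total / number


print(time_intersection(1000, 1000))
print(time_intersection(1000, 10000))
```

Calling the helper with different sizes makes it easy to see how the runtime changes as |s| and |t| grow.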

(Jan-28-2018, 03:43 AM) Miraclefruit wrote: How does that take the intersection into consideration?

What the code does doesn't matter at all for what I showed; the point was to show an easy way to run timeit.

Tuples are faster than lists here, and the number of executions is boosted to 100000000.

import timeit
tuple_test = '''\
t = (1,2,3,4,5,6,7,8,9,10)
'''
list_test = '''\
t = [1,2,3,4,5,6,7,8,9,10]
'''
print(timeit.Timer(stmt=tuple_test).timeit(number=100000000))
print(timeit.Timer(stmt=list_test).timeit(number=100000000))
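To compare the two statements side by side, here is a sketch using timeit.repeat and keeping the fastest of several runs. The repetition counts are chosen for a quick run and are not from the post above. In CPython a tuple of constants is typically stored as a single constant while the list is rebuilt on every execution, but the exact ratio depends on the interpreter and version.

```python
import timeit

tuple_test = "t = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)"
list_test = "t = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]"

results = {}
for name, stmt in (("tuple", tuple_test), ("list", list_test)):
    # Five runs of one million executions each; keep the fastest run.
    results[name] = min(timeit.repeat(stmt=stmt, repeat=5, number=1_000_000))
    print(name, results[name])
```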

Got it! Thanks.

Yeah they're just 2 different sets