Python Forum

Full Version: random.seed and random.randn
import numpy as np 
np.random.seed(42)
x = np.random.randn(100)
I'm having trouble understanding this simple code. First, why is 42 in the parentheses? I thought zero there would mean that the whole array is remembered as it is. What difference does another number make?
Regarding randn, it gives samples from the standard normal distribution, but I don't understand why it gives more than 100 numbers. What does 100 represent?

Output:
[ 0.49671415 -0.1382643 0.64768854 1.52302986 -0.23415337 -0.23413696 1.57921282 0.76743473 -0.46947439 0.54256004 -0.46341769 -0.46572975 0.24196227 -1.91328024 -1.72491783 -0.56228753 -1.01283112 0.31424733 -0.90802408 -1.4123037 1.46564877 -0.2257763 0.0675282 -1.42474819 -0.54438272 0.11092259 -1.15099358 0.37569802 -0.60063869 -0.29169375 -0.60170661 1.85227818 -0.01349722 -1.05771093 0.82254491 -1.22084365 0.2088636 -1.95967012 -1.32818605 0.19686124 0.73846658 0.17136828 -0.11564828 -0.3011037 -1.47852199 -0.71984421 -0.46063877 1.05712223 0.34361829 -1.76304016 0.32408397 -0.38508228 -0.676922 0.61167629 1.03099952 0.93128012 -0.83921752 -0.30921238 0.33126343 0.97554513 -0.47917424 -0.18565898 -1.10633497 -1.19620662 0.81252582 1.35624003 -0.07201012 1.0035329 0.36163603 -0.64511975 0.36139561 1.53803657 -0.03582604 1.56464366 -2.6197451 0.8219025 0.08704707 -0.29900735 0.09176078 -1.98756891 -0.21967189 0.35711257 1.47789404 -0.51827022 -0.8084936 -0.50175704 0.91540212 0.32875111 -0.5297602 0.51326743 0.09707755 0.96864499 -0.70205309 -0.32766215 -0.39210815 -1.46351495 0.29612028 0.26105527 0.00511346 -0.23458713]
Concerning randn(), your output has length 100, so there is no issue: the argument is the number of samples to draw.
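Rather than counting by eye, the length can be checked directly. A minimal sketch, using only the calls already in the question plus len() and the array's shape attribute:

```python
import numpy as np

np.random.seed(42)
x = np.random.randn(100)     # 100 draws from the standard normal distribution

print(len(x))                # number of elements in x
print(x.shape)               # (100,) means a 1-D array with 100 entries

# Passing several arguments gives a multi-dimensional array instead
y = np.random.randn(2, 3)
print(y.shape)               # (2, 3)
```

Printing a long array wraps it over many lines, which can make it look longer than it is.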

Concerning seed(), the purpose of such functions is to generate repeatable pseudorandom sequences, for example:
>>> np.random.seed(42)
>>> np.random.randn(5)
array([ 0.49671415, -0.1382643 ,  0.64768854,  1.52302986, -0.23415337])
>>> np.random.seed(42) # restart the generator with the same seed
>>> np.random.randn(5)
array([ 0.49671415, -0.1382643 ,  0.64768854,  1.52302986, -0.23415337]) # same random sequence
>>> np.random.seed(3748867) # a different seed
>>> np.random.randn(5)
array([-1.87666266,  2.5862967 , -0.49532264, -0.01653639,  0.01915029]) # a different random sequence
>>> 
That said, NumPy's documentation describes seed() as a legacy convenience function. This suggests that one shouldn't call it directly but should probably use RandomState instances instead, which also have seed() and randn() methods. Be cautious, however: RandomState is itself described as a legacy class in other parts of the documentation, so seeding seems to be a tricky subject. As an average user, one can perhaps avoid the details.
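As a rough sketch of what using an instance looks like (the variable names here are only illustrative), a RandomState object carries its own state and leaves the global generator alone. The last lines assume NumPy 1.17 or later, where np.random.default_rng() is the newer interface; it uses a different algorithm, so the same seed gives different numbers there.

```python
import numpy as np

# Two private generators with the same seed; neither touches the global state
rng_a = np.random.RandomState(42)
rng_b = np.random.RandomState(42)

a = rng_a.randn(5)
b = rng_b.randn(5)
print(np.array_equal(a, b))      # same seed, same sequence

# The module-level functions use a hidden global RandomState
np.random.seed(42)
c = np.random.randn(5)
print(np.array_equal(a, c))      # matches the private generator seeded with 42

# Newer interface (assumes NumPy >= 1.17): reproducible, but a different stream
gen = np.random.default_rng(42)
d = gen.standard_normal(5)
print(np.array_equal(a, d))
```

Passing a generator object around, instead of reseeding the global one, keeps one part of a program from silently changing the random numbers seen by another part.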
(May-29-2020, 05:51 AM)Gribouillis Wrote: Concerning randn(), your output has length 100, so that there is no issue.
Yes, now I see that you're right. I must've been drunk while counting last night.

(May-29-2020, 05:51 AM)Gribouillis Wrote: Concerning seed(), the purpose of such functions is to generate repeatable pseudo random sequences [...]
I understood what the seed() method does from the beginning, but I don't see why 42 in particular is in the parentheses. Could it be any other number? In the documentation, we have 0.
Output:
Help on method_descriptor:

seed(...)
    seed(seed=None)

    Seed the generator.

    This method is called when `RandomState` is initialized. It can be
    called again to re-seed the generator. For details, see `RandomState`.

    Parameters
    ----------
    seed : int or array_like, optional
        Seed for `RandomState`.
        Must be convertible to 32 bit unsigned integers.

    See Also
    --------
    RandomState
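On the question of whether 42 could be any other number: as far as the help text above goes, any integer that fits in 32 unsigned bits is accepted, and neither 0 nor 42 has a special meaning. A small sketch to try it out (first_five is a hypothetical helper written for this post, not a NumPy function):

```python
import numpy as np

def first_five(seed):
    """Seed the global generator, then return the first five normal draws."""
    np.random.seed(seed)
    return np.random.randn(5)

print(first_five(0))     # one particular sequence
print(first_five(42))    # a different sequence
print(first_five(42))    # the same sequence as the line above
```

The seed only selects which repeatable sequence you get; it does not store or "remember" an array.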
By the way, here is the full code:
import numpy as np

np.random.seed(42)
x = np.random.randn(100)

# compute a histogram by hand
bins = np.linspace(-5, 5, 20)
counts = np.zeros_like(bins)

# find the appropriate bin for each x
i = np.searchsorted(bins, x)

# add 1 to each of these bins
np.add.at(counts, i, 1)
The comment "find the appropriate bin for each x" is also not clear to me.
OK, I get it. For each x, it finds the position where that value would be inserted into the sorted bins array.
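For anyone else reading, here is a tiny version of the same idea with made-up bins and values that are small enough to follow by hand. It also shows why the code uses np.add.at rather than a plain `+=`:

```python
import numpy as np

bins = np.array([0.0, 1.0, 2.0, 3.0])   # sorted bin edges (illustrative values)
x = np.array([0.5, 2.5, 1.2, 2.9])      # values to place

# For each value, the index at which it would be inserted to keep bins sorted
i = np.searchsorted(bins, x)
print(i)                                 # [1 3 2 3]

# np.add.at adds 1 once per occurrence, so index 3 is counted twice
counts = np.zeros_like(bins)
np.add.at(counts, i, 1)
print(counts)                            # [0. 1. 1. 2.]

# Plain fancy-index assignment counts a repeated index only once
naive = np.zeros_like(bins)
naive[i] += 1
print(naive)                             # [0. 1. 1. 1.]
```

So counts[k] ends up holding the number of values that fall between bins[k-1] and bins[k].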