Python Forum
Using Dictionary to Test Evenness of Distribution Generated by Randint Function
#1
Hi,

This is my second post on any sort of coding website and I'm an absolute beginner at programming. I'm trying to learn Python and am currently about halfway through an introductory book in which the author presents a block of code showing how to test the "evenness" of the randint function in Python using a dictionary (taking the range of numbers from 1 to 10 as an example). I'm having trouble figuring out exactly what is going on in this code. Here it is:

frequency = {}
for i in range(1000):
     num = randint(1, 10)
     if frequency.has_key(num):
          frequency[num] = frequency[num] + 1
          else:
               frequency[num] = 1

print frequency
The output from this code is given in the book as follows:

Output:
{1: 99, 2: 105, 3: 93, 4: 114, 5: 81, 6: 95, 7: 82, 8: 116, 9: 118, 10: 97}
The author writes that "It's not perfectly even, but most statisticians would be happy with our result."

Here is what I think the code is doing:

The first line defines the "frequency" variable as an empty dictionary.
The second and third lines generate a random number between 1 and 10 1,000 times (I think).
The lines after this with the "if ... else" statement have me a bit stumped. It seems to me that the "if frequency.has_key(num)" criterion will always be met, so the resulting frequency of each number will always have 1 added to it. Why? Does this have something to do with the fact that the range function in Python increments by 1 by default? I'm having trouble understanding how the "if ... else" statement helps generate a frequency distribution for numbers 1 to 10.

Also, if I'm correct that the "if frequency.has_key(num)" criterion will always be met, is the "else" statement at the end of the block of code even necessary?

Thanks in advance for any help.
#2
(Feb-21-2021, 04:09 PM)new_coder_231013 Wrote: Hi,

This is my second post on any sort of coding website and I'm an absolute beginner at programming. I'm trying to learn Python and am currently about halfway through an introductory book in which the author presents a block of code showing how to test the "evenness" of the randint function in Python using a dictionary (taking the range of numbers from 1 to 10 as an example). I'm having trouble figuring out exactly what is going on in this code. Here it is:

frequency = {}
for i in range(1000):
     num = randint(1, 10)
     if frequency.has_key(num):
          frequency[num] = frequency[num] + 1
     else:
          frequency[num] = 1

print frequency
The output from this code is given in the book as follows:

Output:
{1: 99, 2: 105, 3: 93, 4: 114, 5: 81, 6: 95, 7: 82, 8: 116, 9: 118, 10: 97}
The author writes that "It's not perfectly even, but most statisticians would be happy with our result."

Here is what I think the code is doing:

The first line defines the "frequency" variable as an empty dictionary.
The second and third lines generate a random number between 1 and 10 1,000 times (I think).
The lines after this with the "if ... else" statement have me a bit stumped. It seems to me that the "if frequency.has_key(num)" criterion will always be met, so the resulting frequency of each number will always have 1 added to it. Why? Does this have something to do with the fact that the range function in Python increments by 1 by default? I'm having trouble understanding how the "if ... else" statement helps generate a frequency distribution for numbers 1 to 10.

Also, if I'm correct that the "if frequency.has_key(num)" criterion will always be met, is the "else" statement at the end of the block of code even necessary?

Thanks in advance for any help.

I corrected lines 6 and 7. They were indented one step too far.

Most of your assumptions are correct, but when the dictionary "frequency" is created it is empty, so if you just use line 5:
frequency[num] = frequency[num] + 1
you will get an error as the key, so far, does not exist. Line 7:
frequency[num] = 1
on the other hand, introduces the key. You could, of course, create the dictionary with all the keys and a zero value for each of them, but that is a bit tedious, so most people do as in the given code, maybe with line 5 written as
frequency[num] += 1
If you want to simplify the loop, then this would do the trick:
from random import randint

frequency = {el: 0 for el in range(1, 11)}
for i in range(1000):
    frequency[randint(1, 10)] += 1

print(frequency)
where the line after the import creates a dictionary with ten keys, 1..10, and sets the value to zero for each of them.
#3
Every time the number num is produced by the random generator, the program adds 1 to the count frequency[num], which counts the number of times this particular number was generated. The test is necessary because if the key num is not already in the dictionary, the line frequency[num] = frequency[num] + 1 would raise an exception. Initially the dictionary is empty and contains no keys.
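A minimal sketch of that point, assuming Python 3 (so `num in frequency` stands in for the book's `has_key`). The values 7, 7 and 3 are made up for illustration and simply play the role of numbers coming out of randint:

```python
frequency = {}          # starts empty: no keys at all

# Incrementing a key that is not there yet fails, because the
# right-hand side has to read frequency[7] before it can add 1.
try:
    frequency[7] = frequency[7] + 1
    raised = False
except KeyError:
    raised = True

# The if/else from the book handles both cases.
for num in [7, 7, 3]:   # pretend these came from randint
    if num in frequency:
        frequency[num] = frequency[num] + 1   # seen before: count up
    else:
        frequency[num] = 1                    # first sighting: create the key

print(raised)      # True
print(frequency)   # {7: 2, 3: 1}
```

So the else branch runs once per distinct number (the first time it shows up), and the if branch runs on every later occurrence.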
#4
(Feb-21-2021, 04:09 PM)new_coder_231013 Wrote: I'm trying to learn Python and am currently about halfway through an introductory book in
Also, as a new user you should not use Python 2.7, which as of Jan 2020 is officially dead💀
The code will give errors on print and has_key (removed) in Python 3.
So if you fix the indentation and write the code so it works in Python 3.9, it looks like this.
from random import randint

frequency = {}
for i in range(1000):
    num = randint(1, 10)
    if num in frequency:
        frequency[num] = frequency[num] + 1
    else:
        frequency[num] = 1

print(frequency)
An alternative is to use collections.defaultdict, as it's made for this kind of task.
import collections
from random import randint

frequency = collections.defaultdict(int)
for i in range(1000):
    frequency[randint(1, 10)] += 1

print(frequency)
Output:
defaultdict(<class 'int'>, {7: 99, 3: 98, 2: 98, 1: 105, 8: 92, 9: 103, 5: 111, 10: 101, 4: 106, 6: 87})
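A small sketch of why defaultdict(int) removes the need for the if/else, using made-up keys 4 and 9 purely for illustration: a missing key is created on first access with the value that int() returns, which is 0.

```python
from collections import defaultdict

counts = defaultdict(int)   # int() returns 0, so missing keys start at 0

counts[4] += 1              # key 4 did not exist; it is created as 0, then incremented
counts[4] += 1
looked_up = counts[9]       # merely reading a missing key also inserts it with value 0

print(looked_up)            # 0
print(dict(counts))         # {4: 2, 9: 0}
```

Note the side effect on the last lookup: reading a key that was never counted adds it to the dictionary, which may or may not be what you want.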
You can also throw in Counter to get the most common values.
>>> from collections import Counter
>>> 
>>> d = Counter(frequency)
>>> d.most_common(3)
[(5, 111), (4, 106), (1, 105)]
>>> d.most_common(1)
[(5, 111)]
#5
In addition to snippsat's excellent answer: as always in programming, Counter can be applied in different ways as well. It can be used in combination with randint or choices, for example:

>>> import random
>>> from collections import Counter
>>> random.seed(42)              # for reproducibility
>>> Counter(random.randint(1, 10) for i in range(1000))
Counter({4: 116, 5: 113, 2: 108, 8: 105, 9: 103, 7: 102, 10: 95, 1: 91, 3: 85, 6: 82})
>>> Counter(random.choices(range(1, 11), k=1000))
Counter({6: 112, 8: 105, 10: 104, 1: 103, 5: 103, 4: 102, 2: 99, 3: 92, 7: 91, 9: 89})
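To put a rough number on the "evenness" the book talks about, one common choice is a chi-square goodness-of-fit statistic. Neither the book excerpt nor this thread uses it, so treat the following as a sketch under that assumption. The counts are the ones quoted from the book in the first post, and the 5% critical value for 9 degrees of freedom is taken from a standard chi-square table:

```python
# Counts copied from the book's output quoted in the first post.
observed = {1: 99, 2: 105, 3: 93, 4: 114, 5: 81,
            6: 95, 7: 82, 8: 116, 9: 118, 10: 97}

total = sum(observed.values())          # 1000 draws
expected = total / len(observed)        # 100 per number if perfectly even

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected
chi_square = sum((count - expected) ** 2 / expected
                 for count in observed.values())

# Tabulated 5% critical value for 9 degrees of freedom (10 categories - 1).
CRITICAL_5_PERCENT_DF9 = 16.919

print(round(chi_square, 2))                  # 15.7
print(chi_square < CRITICAL_5_PERCENT_DF9)   # True
```

The statistic stays below the critical value, so at the 5% level these counts are consistent with an even distribution, though not by a wide margin.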
#6
One of the very few uses of Counter that make it almost worth the disk space it occupies. It was brand new in Python 2.7.

I would use dict.get(key, default), which is a safe way of retrieving a value from a dictionary. If the key is not found, .get() returns the default value instead of raising a KeyError. Sorry, but I cannot make myself write this in Python 2.7.
from random import randint

frequency = {}
for i in range(1000):
    num = randint(1, 10)
    # .get() returns 0 for a number not seen yet, so no KeyError is raised
    frequency[num] = frequency.get(num, 0) + 1

print(frequency)
#7
Thanks everyone; this makes sense now!