Random data generation sum to 1 by rounding

Random data generation sum to 1 by rounding - juniorcoder - Oct-01-2021

Hello everyone,
I need a little help. I need to generate random values between 0 and 1 that occupy consecutive positions in an array. For example, with a starting index of 3 and an ending index of 5 in an array of 10 elements, the result should look like this:

A=[0 0 0.2 0.3 0.5 0  0 0 0 0]

In my code, I currently create an array of zeros, delete the part between the starting and ending index, and insert a list of randomly generated values there (whose sum is guaranteed to be 1). I generate these random values with the following code:

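    import numpy as np

    def sum_to_x(n, x):
        # n-1 random cut points in [0, x], plus the two endpoints
        values = [0.0, x] + list(np.random.uniform(low=0.0, high=x, size=n-1))
        values.sort()
        # the n gaps between consecutive cut points sum to x
        return [values[i+1] - values[i] for i in range(n)]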

The problem is that my values are not rounded at all. I get something like this:

Output:
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.29047860679220283 0.002342106757806439 0.5462946648931827 0.16088462155680805]

How can I round them (to two decimals, for example) while still keeping their sum equal to 1?

Thank you so much in advance

RE: Random data generation sum to 1 by rounding - bowlofred - Oct-01-2021

Then it sounds like you don't really want independent random numbers (since they have to sum to 1). If you only want 2 decimal places, then I would suggest:
* pick n-1 integers between 0 and 100
* sort them
* take the differences between consecutive values (with 0 and 100 added as endpoints)
* divide by 100.

I don't believe the results would be uniformly distributed any longer, but I'm not sure you can have that and also satisfy the condition of summing to a particular value.

import random

def random_divisions(n, total_size=100):
    # n-1 random cut points plus both endpoints give n gaps
    divisions = [0] + sorted(random.randrange(total_size) for _ in range(n - 1)) + [total_size]
    # gaps between consecutive cut points, scaled so they sum to 1
    return [(y-x)/total_size for x,y in zip(divisions, divisions[1:])]

r = random_divisions(5)
print(f"{r}  {sum(r)}")

Output:
[0.31, 0.05, 0.04, 0.29, 0.31]  1.0
[0.14, 0.14, 0.33, 0.2, 0.19]  1.0
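
If the values already exist as floats (for instance the output of sum_to_x from the first post), another option, not proposed in the thread, is largest-remainder rounding: round everything down, then hand the missing hundredths to the entries with the largest fractional parts. The sketch below is only an illustration; the name round_keep_sum and its decimals parameter are hypothetical.

```python
import math

def round_keep_sum(values, decimals=2):
    """Round values to `decimals` places while preserving their (rounded) sum."""
    scale = 10 ** decimals
    # Target total expressed in whole units (hundredths when decimals=2)
    total_units = round(sum(values) * scale)
    scaled = [v * scale for v in values]
    # Round every entry down first
    units = [math.floor(s) for s in scaled]
    # Units still missing after rounding down
    leftover = total_units - sum(units)
    # Give one extra unit to the entries with the largest fractional parts
    order = sorted(range(len(values)), key=lambda i: scaled[i] - units[i], reverse=True)
    for i in order[:leftover]:
        units[i] += 1
    return [u / scale for u in units]

raw = [0.29047860679220283, 0.002342106757806439,
       0.5462946648931827, 0.16088462155680805]
rounded = round_keep_sum(raw)
print(rounded)  # [0.29, 0.0, 0.55, 0.16]
```

Note that a very small value can still round to 0.0 with this approach, so it does not by itself guarantee strictly positive results.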

RE: Random data generation sum to 1 by rounding - juniorcoder - Oct-03-2021

Millions of thanks @bowlofred

I have a small question about your solution. When I run your code, I sometimes get a list like this:

Output:
[0.05, 0.0, 0.11, 0.26, 0.02, 0.07, 0.03, 0.1, 0.07, 0.29]

So there is a 0.0 at the second position.
How can I get rid of it so that I always get strictly positive values, as in your example?
Thank you so much

RE: Random data generation sum to 1 by rounding - Yoriz - Oct-03-2021

You could add an if condition to keep only the values greater than 0.

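import random


def random_divisions(n, total_size=100):
    # n-1 random cut points plus both endpoints give at most n gaps
    divisions = (
        [0] + sorted(random.randrange(total_size) for _ in range(n - 1)) + [total_size]
    )
    # keep only strictly positive gaps; zero gaps are dropped,
    # so the result can be shorter than n
    return [
        result
        for x, y in zip(divisions, divisions[1:])
        if (result := (y - x) / total_size) > 0
    ]


r = random_divisions(4)
print(f"{r}  {sum(r)}")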

RE: Random data generation sum to 1 by rounding - juniorcoder - Oct-03-2021

Thank you so much @Yoriz. It works well now

RE: Random data generation sum to 1 by rounding - juniorcoder - Oct-03-2021

Thank you so much @Yoriz. It works well now. But it removes the 0.0 completely and reduces the number of elements. For example, my array has to have 10 elements. Since one of the values is 0, the code now removes it and the array length becomes 9. However, I need to keep the same length. How can I do that? Thank you

Output:
[0.11, 0.04, 0.2, 0.02, 0.11, 0.42, 0.02, 0.05, 0.03]

RE: Random data generation sum to 1 by rounding - deanhystad - Oct-03-2021

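To avoid 0 I would use random.sample(). It guarantees each sampled cut point is unique, so no gap can be zero as long as the endpoints 0 and total_size are excluded from the sampled range.

import random

def random_divisions(n, total_size=100):
    # sample from 1..total_size-1 so no cut point coincides with an endpoint
    values = sorted([0, total_size] + random.sample(range(1, total_size), n-1))
    # the gaps are returned in ascending order
    return sorted([(y-x)/total_size for x,y in zip(values, values[1:])])

r = random_divisions(5)
print(f"{r}  {sum(r)}")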

Output:
[0.03, 0.04, 0.24, 0.33, 0.36]  1.0
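
For the use case in the first post (values placed between a starting and ending index of an otherwise zero array), a minimal sketch could look like the following. It assumes the indices are 1-based and inclusive, as in the example A with indices 3 to 5, and that cut points are drawn from 1..total_size-1 so no gap is zero. The helper name padded_divisions is hypothetical, and the gaps are left unsorted here.

```python
import random

def random_divisions(n, total_size=100):
    # n-1 unique cut points strictly between the endpoints give n positive gaps
    cuts = random.sample(range(1, total_size), n - 1)
    values = [0] + sorted(cuts) + [total_size]
    return [(y - x) / total_size for x, y in zip(values, values[1:])]

def padded_divisions(length, start, end, total_size=100):
    """Zeros everywhere except positions start..end (1-based, inclusive)."""
    count = end - start + 1
    result = [0.0] * length
    # Overwrite the chosen slice; the list length stays the same
    result[start - 1:end] = random_divisions(count, total_size)
    return result

a = padded_divisions(10, 3, 5)
print(a)
```

Because the slice being replaced has exactly as many elements as the generated list, the array keeps its original length.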

RE: Random data generation sum to 1 by rounding - juniorcoder - Oct-04-2021

Thank you so much @deanhystad. I guess it has been solved now.

RE: Random data generation sum to 1 by rounding - juniorcoder - Oct-20-2021

Hello @deanhystad and @bowlofred and hello everyone,

This time, I need to generate random values between 0 and 1 that again occupy consecutive positions but follow a normal distribution. Their sum must again be 1. I need to do exactly the same thing as you proposed before, except that the values should be normally distributed. How can I modify the code you proposed? Thank you so much in advance.

RE: Random data generation sum to 1 by rounding - deanhystad - Oct-20-2021

https://numpy.org/doc/stable/reference/random/generated/numpy.random.normal.html
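
The thread does not settle what "normally distributed" should mean once the values are forced to sum to 1, so the following is only one possible reading: draw positive weights from a normal distribution, then convert them to hundredths by rounding the cumulative sums, which keeps the total exact. The sketch uses random.gauss from the standard library as a stand-in for numpy.random.normal; the function name normal_divisions and the defaults mu=1.0 and sigma=0.25 are assumptions for illustration.

```python
import random
from itertools import accumulate

def normal_divisions(n, total_size=100, mu=1.0, sigma=0.25):
    """Return n strictly positive multiples of 1/total_size that sum to 1."""
    if not 1 <= n <= total_size:
        raise ValueError("n must be between 1 and total_size")
    # Draw normal weights, redrawing any that are not positive
    weights = []
    while len(weights) < n:
        w = random.gauss(mu, sigma)
        if w > 0:
            weights.append(w)
    # Reserve one unit per value so that none can end up as 0
    spare = total_size - n
    total = sum(weights)
    # Round the cumulative shares; consecutive differences then sum to spare
    bounds = [0] + [round(spare * c / total) for c in accumulate(weights)]
    bounds[-1] = spare  # guard against floating-point drift
    return [(hi - lo + 1) / total_size for lo, hi in zip(bounds, bounds[1:])]

r = normal_divisions(5)
print(r)
```

After normalising and rounding, the results are only approximately proportional to the normal draws, so they should not be treated as exact samples from a normal distribution.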