Python Forum
Random data generation sum to 1 by rounding
#1
Hello everyone,
I need a little help. I need to generate random values between 0 and 1 that occupy consecutive positions in an array. Let me give you an example. With a starting index of 3 and an ending index of 5, in an array of 10 elements, they must be placed like this:

A = [0, 0, 0.2, 0.3, 0.5, 0, 0, 0, 0, 0]

I already do this in my code: I create an array of zeros, remove the part between the starting and ending index, and insert a list of randomly generated values (whose sum is guaranteed to be 1) in its place. I generate these random values with the following code:

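    import numpy as np

    def sum_to_x(n, x):
        # n - 1 random cut points in [0, x], plus the two endpoints 0 and x
        values = [0.0, x] + list(np.random.uniform(low=0.0, high=x, size=n-1))
        values.sort()
        # The gaps between neighbouring cut points are n values that sum to x
        return [values[i+1] - values[i] for i in range(n)]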
The problem is that my values are not rounded at all. I get something like this:

Output:
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0.29047860679220283 0.002342106757806439 0.5462946648931827 0.16088462155680805]
How can I round them (to two decimals, for example) while still keeping their sum equal to 1?

Thank you so much in advance
Yoriz wrote Oct-01-2021, 02:06 PM:
Please post all code, output and errors (in their entirety) between their respective tags. Refer to BBCode help topic on how to post. Use the "Preview Post" button to make sure the code is presented as you expect before hitting the "Post Reply/Thread" button.
Reply
#2
Then it sounds like you don't really want independent random numbers (since they have to sum to 1). If you only want 2 decimal places, then I would suggest:
* pick n-1 numbers between 0 and 100
* sort them
* return the differences between neighbouring values (with 0 and 100 as the endpoints)
* divide by 100.

I don't believe the results would be uniformly distributed any longer, but I'm not sure you can keep a uniform distribution and still satisfy the condition of summing to a particular value.

import random

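def random_divisions(n, total_size=100):
    # Note: this draws n cut points, which split the range into n + 1 parts
    # (random_divisions(4) returns 5 values); use range(n - 1) to get exactly n.
    divisions = [0] + sorted(random.randrange(total_size) for _ in range(n)) + [total_size]
    return [(y-x)/total_size for x,y in zip(divisions, divisions[1:])]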

r = random_divisions(4)
print(f"{r}  {sum(r)}")
Output:
[0.31, 0.05, 0.04, 0.29, 0.31] 1.0
[0.14, 0.14, 0.33, 0.2, 0.19] 1.0
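If the values have already been generated as floats (as with sum_to_x in the first post), another option is to round them afterwards with the largest-remainder method: floor every value to two decimals, then hand the missing hundredths to the entries with the largest fractional parts. This is not something proposed in the thread; the sketch below is only an illustration, and the helper name round_keep_sum is hypothetical.

```python
import math

def round_keep_sum(values, decimals=2):
    """Round each value to `decimals` places while preserving the rounded total."""
    scale = 10 ** decimals
    scaled = [v * scale for v in values]
    units = [math.floor(s) for s in scaled]
    # Number of units still missing after flooring every entry.
    shortfall = round(sum(scaled)) - sum(units)
    # Give one extra unit to the entries with the largest fractional parts.
    by_remainder = sorted(
        range(len(values)), key=lambda i: scaled[i] - units[i], reverse=True
    )
    for i in by_remainder[:shortfall]:
        units[i] += 1
    return [u / scale for u in units]

values = [0.29047860679220283, 0.002342106757806439,
          0.5462946648931827, 0.16088462155680805]
rounded = round_keep_sum(values)
print(rounded)
# Check the total in whole hundredths to avoid float noise in sum().
print(sum(round(r * 100) for r in rounded))
```

Very small inputs can still round down to 0.0 with this approach, so it does not by itself guarantee strictly positive values.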
Reply
#3
Many thanks, @bowlofred.
I have a small question about your solution. When I run your code, I sometimes get a list like this:
Output:
[0.05, 0.0, 0.11, 0.26, 0.02, 0.07, 0.03, 0.1, 0.07, 0.29]
So there is a 0.0 at the second position.
How can I get rid of it so that I always get strictly positive values, as in your example?
Thank you so much.
Reply
#4
You could add an if condition that keeps only the results greater than 0:
import random


def random_divisions(n, total_size=100):
    divisions = (
        [0] + sorted(random.randrange(total_size) for _ in range(n)) + [total_size]
    )
    return [
        result
        for x, y in zip(divisions, divisions[1:])
        if (result := (y - x) / total_size) > 0
    ]


r = random_divisions(4)
print(f"{r}  {sum(r)}")
Reply
#5
Thank you so much @Yoriz. It works well now
Reply
#6
Thank you so much @Yoriz. It works well now. However, it now removes the 0.0 completely, which shortens the list. For example, my array has to contain 10 elements. Since one of the values was 0, the code removes it and the array ends up with 9 elements. I need to keep the same length. How can I do that? Thank you.

Output:
[0.11, 0.04, 0.2, 0.02, 0.11, 0.42, 0.02, 0.05, 0.03]
Reply
#7
To avoid 0 I would use random.sample(). It guarantees each value is unique.
import random

def random_divisions(n, total_size=100):
    # Sample from 1..total_size-1 so no cut point collides with the endpoints 0 and total_size
    values = sorted([0, total_size] + random.sample(range(1, total_size), n-1))
    return sorted([(y-x)/total_size for x,y in zip(values, values[1:])])

r = random_divisions(5)
print(f"{r}  {sum(r)}")
Output:
[0.03, 0.04, 0.24, 0.33, 0.36] 1.0
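To connect this back to the original question (values placed between a starting and an ending index inside a longer array of zeros), here is a minimal sketch. It assumes 1-based, inclusive indices as in the first post's example, uses a variant of random_divisions that samples cut points from 1..total_size-1 and leaves the results unsorted, and introduces a hypothetical helper named fill_segment.

```python
import random

def random_divisions(n, total_size=100):
    # n - 1 distinct interior cut points give n strictly positive parts.
    cuts = sorted(random.sample(range(1, total_size), n - 1))
    edges = [0] + cuts + [total_size]
    return [(hi - lo) / total_size for lo, hi in zip(edges, edges[1:])]

def fill_segment(length, start, end, total_size=100):
    """Return `length` zeros with positions start..end (1-based, inclusive) filled."""
    result = [0.0] * length
    # The slice and the replacement have the same size, so the length is unchanged.
    result[start - 1:end] = random_divisions(end - start + 1, total_size)
    return result

a = fill_segment(10, 3, 5)
print(a)
```

Because the number of generated values always equals end - start + 1, the array keeps its length and every value inside the segment is at least 1/total_size.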
Reply
#8
(Oct-03-2021, 01:02 PM)deanhystad Wrote: To avoid 0 I would use random.sample(). It guarantees each value is unique.
Thank you so much @deanhystad. I guess now, it has been solved
Reply
#9
Hello @deanhystad and @bowlofred and hello everyone,

This time, I need to generate random values between 0 and 1 that occupy consecutive positions as before, but follow a normal distribution. Their sum must be 1 again. I need to do exactly the same thing as you proposed before, except that the values should be normally distributed. How can I modify the code you proposed? Thank you so much in advance.


Reply
#10
https://numpy.org/doc/stable/reference/r...ormal.html
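What "normally distributed" should mean for values that must also sum to 1 is not settled in the thread, so the following is only one possible reading: draw positive weights from a normal distribution, then scale them into whole hundredths. The sketch uses the standard library's random.gauss to stay self-contained; the function name normal_divisions and the parameters mu and sigma are assumptions chosen for illustration. Note that after redrawing non-positive weights and normalising, the results are no longer exactly normal.

```python
import random
from itertools import accumulate

def normal_divisions(n, total_size=100, mu=1.0, sigma=0.3):
    """n positive multiples of 1/total_size that sum to 1, from normally drawn weights."""
    if n > total_size:
        raise ValueError("n must not exceed total_size")
    # Draw weights from a normal distribution, redrawing non-positive values.
    weights = []
    while len(weights) < n:
        w = random.gauss(mu, sigma)
        if w > 0:
            weights.append(w)
    # Reserve one unit per entry so that no value is zero.
    spare = total_size - n
    # Round the cumulative shares to integer cut points and take differences,
    # so the integer units always add up to exactly `spare`.
    cumulative = list(accumulate(weights))
    edges = [0] + [round(spare * c / cumulative[-1]) for c in cumulative]
    units = [hi - lo for lo, hi in zip(edges, edges[1:])]
    return [(u + 1) / total_size for u in units]

r = normal_divisions(5)
print(r)
# Check the total in whole hundredths to avoid float noise in sum().
print(sum(round(v * 100) for v in r))
```

The cut-point rounding mirrors the approach used earlier in the thread: sorted integer boundaries between 0 and the total, with the differences between neighbours as the result.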
Reply

