Python Forum
number accuracy problem? - Printable Version

+- Python Forum (https://python-forum.io)
+-- Forum: Python Coding (https://python-forum.io/forum-7.html)
+--- Forum: General Coding Help (https://python-forum.io/forum-8.html)
+--- Thread: number accuracy problem? (/thread-35855.html)



number accuracy problem? - roym - Dec-23-2021

Hi all,

I am trying to solve problem 78 from Project Euler (https://projecteuler.net/problem=78). The problem is related to integer partition. The question here is to find the first number, n, for which p(n) is divisible by 1e6. I have tested my algorithm for a small number of coins (<100) and it seems to work fine (it produces the correct results). My idea was to use the modulo operator to determine the number of coins for which the partition function = 0. However, I get the wrong results over and over again... I compared my results with other algorithms that produce the correct number, and it seems that it is related to the accuracy of the numbers that are determined by my algorithm. The answer should be 55374, but my algorithm finds the first n to be 4096. Any idea what is wrong here? Note that I have no experience with Python, so I haven't used any functions or whatever yet.

import numpy as np

# number of coins
N = 100000

# pre-allocate some arrays
# Pstore contains the partition values
# gstore contains the generalized pentagonal numbers
# signs contains alternating signs
Pstore = np.zeros((N,1))
gstore = np.zeros((N,1))
signs  = np.zeros((5*N,1))

# create array with alternating signs
for n in range(0,N):
    
    signs[4*n+1] = 1
    signs[4*n+2] = 1
    signs[4*n] = -1
    signs[4*n+3] = -1
signs = np.delete(signs, 0)

# start loop
for n in range (0,N):
    
    # partition function = 1 for n = 0 and n = 1
    if n == 0:
        
        Pstore[n] = 1
        
    elif n == 1:
        Pstore[n] = 1
        gstore[n] = 1
        
    # calculate generalized pentagonal number
    elif n > 1:
        
        if n % 2 == 0:
            
            gstore[n] = (3*(-int(n/2))**2-(-int(n/2)))/2
            
        else:
            
            gstore[n] = (3*(int(n/2)+1)**2-(int(n/2)+1))/2
            

        a = -1
        counter = 0
        # calculate partition number and store
        while a < 0:
            
            
            if n > gstore[counter+1]:
                
                Pstore[n] += signs[counter] * Pstore[int(n-gstore[counter+1])]
                
                counter += 1
                
            elif n == gstore[counter+1]:
                
                Pstore[n] += signs[counter] * Pstore[int(n-gstore[counter+1])]
                a = 0
                
            elif n < gstore[counter+1]:
            
                a = 0
    
    # print value if number is found
    if Pstore[n] % 1000000 == 0:
        print(Pstore[n])



RE: number accuracy problem? - deanhystad - Dec-23-2021

signs does not end up being an array containing alternating signs. It does start with a pattern like [-1, 1, 1, -1, -1, 1, 1...] but it ends with [...0, 0, 0, 0] where the number of zeroes == N.

When doing division the result is a float. I think you want to cast these to int or you can use integer division (// 2).


RE: number accuracy problem? - Gribouillis - Dec-23-2021

There are many issues in your code, before I talk about them, here is a code that I wrote to solve this problem, inspired by your code. It finds the solution 55374 in about 8 seconds in my terminal, but I made no attempt to optimize it.
import numpy as np
import itertools as itt

def ipentagonal():
    s = 0
    for c in itt.count(1):
        s += c
        if s % 3 == 0:
            yield s // 3

N = 1000000
#N = 100
gstore = np.fromiter(itt.takewhile(lambda x: x <= 2*N, ipentagonal()), int)
sign = np.fromiter(
    (1 if (i%4 <= 1) else -1 for i in range(gstore.size)), int)
pstore = np.zeros(N, int)
pstore[0] = 1
print(gstore)
print(sign)

for n in range(1, N):
    idx = itt.takewhile(lambda x: x>=0, (n-g for g in gstore))
    pstore[n] = sum(pstore[i] * sign[j] for j, i in enumerate(idx)) % 1000000
    if pstore[n] == 0:
        print(n)
        break
#######

The main problems that I see in your code are
  • You are using numpy arrays containing floating point values, that is to say 64 bits numbers. This will lose precision if you store large integers in them. This is an integer valued problem, so you need arrays that contain integer values, this is what I use in my code. Note that Python integers have no size limitation other than the machine's memory, but this is not the case of Numpy integers which have a fixed size, such as 32 or 64 bits.
  • The partition number increases very rapidly, according to Wikipedia, p(10000) has already 106 ordinary decimal digits. It means that you must not compute p(n), but for this problem you can simply compute p(n) % 1000000, which is what my code does.

#####

Note: By removing the final 'break' statement, the above code found the next solution after 55374, namely 488324.


RE: number accuracy problem? - Gribouillis - Dec-24-2021

Here is a better version, it finds the solution in 1 second, it also finds other solutions 488324 and 846699 in less than 15 seconds.
import numpy as np
import itertools as itt

def ipentagonal():
    s = 0
    for c in itt.count(1):
        s += c
        if s % 3 == 0:
            yield s // 3

N = 1000000
gstore = np.fromiter(itt.takewhile(lambda x: x <= 2*N, ipentagonal()), int)
sign = np.tile([1,1,-1,-1], (gstore.size + 4)//4)
pstore = np.zeros(N, int)
pstore[0] = 1

for n in range(1, N):
    a = np.searchsorted(gstore, (n,), 'right')[0]
    p = pstore[n - gstore[:a]]
    pstore[n] = np.sum(p * sign[:p.size]) % 1000000
    if pstore[n] == 0:
        print(n)
        break



RE: number accuracy problem? - roym - Dec-24-2021

(Dec-23-2021, 05:17 PM)deanhystad Wrote: signs does not end up being an array containing alternating signs. It does start with a pattern like [-1, 1, 1, -1, -1, 1, 1...] but it ends with [...0, 0, 0, 0] where the number of zeroes == N.

When doing division the result is a float. I think you want to cast these to int or you can use integer division (// 2).

Thanks for your reply. The "signs" array contains indeed 0's. However, I noticed that that was not the problem. The problem was in the Numpy arrays that uses floats rather than integers. I have removed all Numpy arrays and replaced them with "normal" arrays. Now the code produces the correct output.


RE: number accuracy problem? - roym - Dec-24-2021

(Dec-23-2021, 10:28 PM)Gribouillis Wrote: There are many issues in your code, before I talk about them, here is a code that I wrote to solve this problem, inspired by your code. It finds the solution 55374 in about 8 seconds in my terminal, but I made no attempt to optimize it.
import numpy as np
import itertools as itt

def ipentagonal():
    s = 0
    for c in itt.count(1):
        s += c
        if s % 3 == 0:
            yield s // 3

N = 1000000
#N = 100
gstore = np.fromiter(itt.takewhile(lambda x: x <= 2*N, ipentagonal()), int)
sign = np.fromiter(
    (1 if (i%4 <= 1) else -1 for i in range(gstore.size)), int)
pstore = np.zeros(N, int)
pstore[0] = 1
print(gstore)
print(sign)

for n in range(1, N):
    idx = itt.takewhile(lambda x: x>=0, (n-g for g in gstore))
    pstore[n] = sum(pstore[i] * sign[j] for j, i in enumerate(idx)) % 1000000
    if pstore[n] == 0:
        print(n)
        break
#######

The main problems that I see in your code are
  • You are using numpy arrays containing floating point values, that is to say 64 bits numbers. This will lose precision if you store large integers in them. This is an integer valued problem, so you need arrays that contain integer values, this is what I use in my code. Note that Python integers have no size limitation other than the machine's memory, but this is not the case of Numpy integers which have a fixed size, such as 32 or 64 bits.
  • The partition number increases very rapidly, according to Wikipedia, p(10000) has already 106 ordinary decimal digits. It means that you must not compute p(n), but for this problem you can simply compute p(n) % 1000000, which is what my code does.

#####

Note: By removing the final 'break' statement, the above code found the next solution after 55374, namely 488324.

Thanks for your reaction and explanation. I found that the problem is indeed in the Numpy arrays containing floating point values. I have removed all Numpy arrays and replaced them with "normal" arrays. Now everything works fine!