Problem with very large number for digital root

**scidam** · Apr-20-2018, 06:46 AM

Algorithm for finding digital root from its definition:

import numpy as np
import random
import time

start1 = time.time()
# Very large number
N = 20000000 # the number of digits

num = ''.join(np.random.choice(['0','1','2','3','4','5','6','7','8','9'], N))
#num is a string now...
print("Generating a very big number...#digits=%s, total seconds spent: %s " % (N, time.time() - start1))


# prepare the number
unum = np.fromiter(num, count=N, dtype=np.short)

# is_divisible_by_9 = lambda x: np.sum(x) % 9 == 0, this could help....


def digital_root(num):
    s = np.sum(num)
    while True:
        if 0 <= s <= 9:
            return s
        else:
            s = np.sum(np.fromiter(str(s), dtype=np.short))


start2 = time.time()         
print("Digital root is:", digital_root(unum))
print("Finding number's digital root, total seconds spent: %s " % (time.time() - start2,))
    


unum = map(int, num)
def digital_root_wo_numpy(num):
    s = sum(num)
    while True:
        if 0 <= s <= 9:
            return s
        else:
            s = sum(map(int, str(s)))


start3 = time.time()         
print("Digital root is:", digital_root_wo_numpy(unum))
print("Finding number's digital root without numpy, total seconds spent: %s " % (time.time() - start3,))

Output:Generating a very big number...#digits=20000000, total seconds spent: 8.759913921356201 
Digital root is: 7
Finding number's digital root, total seconds spent: 0.028986692428588867 
Digital root is: 7
Finding number's digital root without numpy, total seconds spent: 4.6343841552734375

Pleiades · (This post was last modified: Apr-20-2018, 06:58 AM by Pleiades.)

(Apr-19-2018, 08:54 PM)Gribouillis Wrote: You can also compute the result incrementally by reading the file by chunks:

 KB = 1024 MB = 1024 * 1024 def file_digital_root(file): iszero = True res = 0 for s in file: n = int(s) if iszero and n: iszero = False res = (res + n % 9) % 9 if iszero: return 0 else: return res if res else 9 if __name__ == '__main__': with open('big_num.txt', buffering=1*MB) as bigggg: dr = file_digital_root(bigggg) print(dr)

Ok its working super fast now it calculates huge numbers in about 3 to 7 seconds. Thanks a bunch guys.:) I just had to re fire up the shell and its all smooth..

wavic · (This post was last modified: Apr-20-2018, 08:05 AM by wavic.)

You should check gmpy2 library. The MPZ class: https://gmpy2.readthedocs.io/en/latest/overview.html

Without Python basics... At least learn how to pass parameters to a function. What is global and local name space. Basic io.

**Gribouillis** · (This post was last modified: Apr-21-2018, 07:51 AM by Gribouillis.)

Hum, there is something wrong in my previous code, the for line is file will read the entire file. One needs to be more explicit in order to read the file by chunks. Try this version instead

KB = 1024
MB = 1024 * 1024
 
def file_digital_root(file, buffering):
    iszero = True
    res = 0
    while True:
        s = file.read(buffering)
        if not s:
            break
        n = int(s)
        if iszero and n:
            iszero = False
        res = (res + n % 9) % 9 
    if iszero:
        return 0
    else:
        return res if res else 9
 
if __name__ == '__main__':
    buffering = 100 * KB
    with open('bignum.txt', buffering=buffering) as bigggg:
        dr = file_digital_root(bigggg, buffering)
    print(dr)

**scidam** · Apr-22-2018, 11:32 PM

Note, that conversion strings to integer (n = int(s)) is quite expensive operation, especially for big numbers. The test below shows that:

import numpy as np
import timeit

common = '''
import numpy as np
N = 100000
num = ''.join(np.random.choice(['0','1','2','3','4','5','6','7','8','9'], N))
dct = {str(j): j for j in range(10)}

'''


setup1 = common

setup2 = common + '''
def digital_root_wo_numpy(num):
    num = map(int, num)
    s = sum(num)
    while True:
        if 0 <= s <= 9:
            return s
        else:
            s = sum(map(int, str(s)))
'''



setup3 = common + '''
def digital_root_wo_numpy(num):
    num = map(lambda x: dct[x], num)
    s = sum(num)
    while True:
        if 0 <= s <= 9:
            return s
        else:
            s = sum(map(lambda x: dct[x], str(s)))
'''





print("Int() is quite expensive:", min(timeit.Timer('int(num)', setup=setup1).repeat(3, 10)))

print("Finding digital root using int:", min(timeit.Timer('digital_root_wo_numpy(num)', setup=setup2).repeat(3, 10)))

print("Finding digital root with dict mapping:", min(timeit.Timer('digital_root_wo_numpy(num)', setup=setup3).repeat(3, 10)))

Output:Int() is quite expensive: 1.0060029089800082
Finding digital root using int: 0.22873508604243398
Finding digital root with dict mapping: 0.19048730900976807

So, it is faster to compute sums of digits directly (for large numbers/large buffering), than use div mod (%) operation.

**Gribouillis** · (This post was last modified: Apr-23-2018, 05:27 AM by Gribouillis.)

@scidam You can also try sum(ord(x) for x in num) - ord('0') * len(num) or sum(ord(x) - 48 for x in num)

Pleiades · Apr-27-2018, 06:29 AM

(Apr-19-2018, 07:35 PM)wavic Wrote:
with open('big_num.txt', 'r') as bigggg: # 'r' for read num = bigggg.read() # It's string so... next line big_n = int(num)
This is how you open a file and read from it.

Wavic,

Hi how can I divide 3 into this script for the big number?

wavic · Apr-27-2018, 08:44 AM

I posted some links to gmpy2 module.
Havê you look at it?

Here it is again:
http://gmpy2.readthedocs.io/en/latest/mpz.html#examples

Try it.

Problem with very large number for digital root

User Panel Messages

Announcements