有效地计算没有尾随零的阶乘？

Question

我正在尝试改进大数阶乘计算的运行时间。

第一个简单循环和相乘的代码。

def calculate_factorial_multi(number):
    '''
    This function takes one agruments and
    returns the factorials of that number


    This function uses the approach successive multiplication

    like 8! = 8 * 7 * 6 * 5 * 4 * 3 * 2 * 1
    '''

    '''
    If 0 or 1 retrun immediately
    ''' 
    if number == 1 or number == 0:
        return 1

    result = 1 # variable to hold the result

    for x in xrange(1, number + 1, 1):
        result *= x
    return result

此函数的分析结果：

For n = 1000 -- Total time: 0.001115 s

for n = 10000 -- Total time: 0.035327 s

for n = 100000 -- Total time: 3.77454 s.

从 n = 100000 的行分析器中，我可以看到大部分 %time 都花在了乘法步骤上，即“98.8”

31    100000      3728380     37.3     98.8         result *= x

所以试图减少阶乘的乘法减半，对于偶数，因此进行强度降低。

后半乘法代码：

def calculate_factorial_multi_half(number):

    if number == 1 or number == 0:
        return 1

    handle_odd = False
    upto_number = number

    if number & 1 == 1:
        upto_number -= 1
        print upto_number
        handle_odd = True

    next_sum = upto_number
    next_multi = upto_number
    factorial = 1

    while next_sum >= 2:
        factorial *= next_multi
        next_sum -= 2
        next_multi += next_sum

    if handle_odd:
        factorial *= number

    return factorial

此函数的分析结果：

For n = 1000 -- Total time: 0.00115 s

for n = 10000 -- Total time: 0.023636 s

for n = 100000 -- Total time: 3.65019 s

这表明在中等范围内有所改善，但在缩放方面没有太大改善。

在这个函数中，大部分 %time 都花在了乘法上。

61     50000      3571928     71.4     97.9         factorial *= next_multi.

所以我厌倦了删除尾随零然后相乘。

没有尾随零代码。

def calculate_factorial_multi_half_trailO(number):
    '''
    Removes the trailling zeros
    '''
    if number == 1 or number == 0:
        return 1

    handle_odd = False
    upto_number = number

    if number & 1 == 1:
        upto_number -= 1
        handle_odd = True

    next_sum = upto_number
    next_multi = upto_number
    factorial = 1
    total_shift = 0
    while next_sum >= 2:
        factorial *= next_multi
        shift = len(str(factorial)) - len(str(factorial).rstrip('0'))
        total_shift += shift
        factorial >>= shift
        next_sum -= 2
        next_multi += next_sum

    if handle_odd:
        factorial *= number

    factorial <<= total_shift
    return factorial

此函数的分析结果：

For n = 1000 -- Total time: 0.061524 s

for n = 10000 -- Total time: 113.824 s

因此，由于字符串转换，“96.2%”的时间花在了字符串转换上，而不是减少时间，而是增加了时间

 22       500        59173    118.3     96.2        shift = len(str(factorial)) - len(str(factorial).rstrip('0')).

所以我的问题是如何在不影响时间的情况下获取尾随零并有效地使用 shift。

所有分析都已完成。初级 OS(Linux)：64 位，内存：6GB

Answer 1

我不知道这是否能解决你的问题，但你可以试试这个方法：

我看到您的要求是 10^4（最大值）的阶乘。所以，

创建一个筛子并找出不超过 10000 的所有素数。
现在，对从 1 到 10000 的所有数字进行质因数分解，并将其存储在数组中。（这两步应该不会花太多时间）。
因此，您现在将正好拥有 1229 个素数及其次方。
获得所有素数的幂并将它们全部相乘。对于长数，这会将乘法运算的次数从 10000 次减少到 1229 次。（但是，同时求幂需要一些时间。）

PS：（我对python不是很熟悉，否则我自己做）

Answer 2

没有尾随零似乎效率不高。

首先，我建议使用prime decomposition来减少乘法总数，因为小于x的素数大约是x/lnx。

def calculate_factorial_multi(number):
    prime = [True]*(number + 1)
    result = 1
    for i in xrange (2, number+1):
        if prime[i]:
            #update prime table
            j = i+i
            while j <= number:
                prime[j] = False
                j += i
            #calculate number of i in n!
            sum = 0
            t = i
            while t <= number:
                sum += number/t
                t *= i
            result *= i**sum
    return result

for n = 10000, total time : 0.017s

for n = 100000, total time : 2.047s

for n = 500000, total time : 65.324s

(PS。在你的第一个程序中，n = 100000，在我的机器上总时间是 3.454s。)

现在让我们测试一下它是否有效而不带尾随零。尾随零的个数等于 n! 中素数 5 的个数。程序是这样的

def calculate_factorial_multi2(number):
    prime = [True]*(number + 1)
    result = 1
    factor2 = 0
    factor5 = 0
    for i in xrange (2, number+1):
        if prime[i]:
            #update prime table
            j = i+i
            while j <= number:
                prime[j] = False
                j += i
            #calculate the number of i in factors of n!
            sum = 0
            t = i
            while t <= number:
                sum += number/t
                t *= i
            if i == 2:
                factor2 = sum
            elif i == 5:
                factor5 = sum
            else:
                result *= i**sum

    return (result << (factor2 - factor5))*(10**factor5)

for n = 10000, total time : 0.015s

for n = 100000, total time : 1.896s

for n = 500000, total time : 57.101s

只是比以前快了一点。所以没有尾随零似乎不是很有用

有效地计算没有尾随零的阶乘？

Calculating the factorial without trailing zeros efficiently?

python

algorithm

optimization

profiling

factorial