改进我对埃拉托色尼筛法的实施

Improving my implementation of the sieve of Eratosthenes

我正在创建埃拉托色尼筛法,以便更有效地求和 1 到较大数 n 之间的素数。我想要做的是创建一个从 2 到 n 的列表,然后删除 2 的倍数,然后是 3 的倍数,然后是列表中下一个数字的倍数,依此类推。我创建的代码我认为它的时间性能非常慢,这几乎就像通过检查每个条目是否为质数来创建列表一样。我想我拥有的操作数量是有序的: n 的平方根(第一个 while 循环)乘以(略小于)n 的平方根(对于第二个 while 循环)。所以我不确定是删除方法还是其他东西减慢了速度。

我的代码是这个:

def sieve_of_Eratosthenes(n):
L= list(range(2,n+1))
# creates a list from 2 to n

i=2
while i*i <=n: # going to remove multiples of i, starting at i^2
    k=i        # if j <i then ij already checked
    while i*k <= max(L):
        try:
            L.remove(i*k)   # there is an error if the element is not in 
                            # the list so need to add these two lines
        except ValueError:  
            pass     # do nothing!
        k=k+1
    i=L[i-1]      # list index starts at 0, want i to be next element in the list
# print(L)
return L

假设

The question is on how to improve the run time of your software since it's very slow.`

执行以下两个代码更改以加速您的代码

  1. 与其保留素数列表,不如将数字检查为素数(真) 或非 Prime(假)
  2. 只检查大于 2 的奇数是否为素数

代码

def sieve_of_Eratosthenes2(n):
    if n < 2:
        return []
    if n < 3:
        return [2]

    L = [True] * (n+1)    # all numbers set as primes initially

    # modifies prime flag in list for odd numbers
    for i in range(3, n, 2): # Check odd numbers for prime (no need to check even numbers)
        if L[i]: # A prime
            L[i*i::i] = [False]*len(L[i*i::i]) # from i^2 in increments of i

    # Report prime 2 + odd primes
    return [2] + [i for i in range(3, n, 2) if L[i]]  # Get odd numbers whose flag is 
                                                      # still True

新代码

%timeit sieve_of_Eratosthenes2(1000)
188 µs ± 16.6 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit sieve_of_Eratosthenes2(100000)
16 ms ± 1.58 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In going from 1, 000 to 100, 000 time 
(i.e. 100X), time increased by ~85, 
so almost linear

旧代码

 %timeit sieve_of_Eratosthenes(1000)
 25.2 ms ± 1.59 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
 sieve_of_Eratosthenes2(100000)
 261.45 seconds  (using time module)

In going from 1, 000 to 100, 000 (100X)
time increased by factor of ~10, 000
So quadratic increase in time (i.e. 100^2).

说明

Sieve of Eratosthenes的复杂度是

O(N log (log N))

这几乎是线性的,因为将数组中的数字标记为 True(质数)和 False(非质数)的操作通常为 O(1)。

在原始算法中,非素数的数字被删除而不是被标记,需要:

O(N) per removal.

这为埃拉托色尼筛法的复杂度增加了一个额外的因子 N,导致原始算法的复杂度为:

O(N*N*log (log N))

因此 运行 次所证实的接近二次方。