素筛返回一些奇怪的数字

prime sieve returning some strange numbers

我正在尝试开发素数筛法,理论上我的算法在纸面上非常合理,但是 returns 仅在平方根以上的素数中选择非常短的合数。

例如,以 10,000(平方根为 100)的极限(找出所有达到极限的质数)为例,它与质数混合的合数为 115、119、121,和 125(都非常接近(甚至更高!)100)。

请告诉我我的代码有什么问题,哪些部分需要修复/如何修复。

澄清:我担心它的复合(非素数)returns,我在素性测试中哪里出错了,我该如何纠正它?

到目前为止,这是我的筛子:

def primes(limit):
    # Just make an empty list where the primes go
    prime = []
    # This next for loop just adds all the numbers of the form 6n+/-1 to the list, as all primes are of this form
    for i in range(1,limit / 6 + 1):
        prime.append(6*i - 1)
        prime.append(6*i + 1)
    # If the limit is divisible by 6, the last number on the list is sometimes over the limit
    if limit % 6 == 0 and prime[-1] > limit:
        prime.remove(prime[-1])
    # This next line just finds the place of the 'square root' of the limit, which is as high as it has to check for factors
    squareroot = min(range(len(prime)), key=lambda i: abs(prime[i]-(limit**0.5))) + 1
    # Removing composites BELOW the square root
    for p in prime[:squareroot][:]:
        for f in range(2, int(p ** 0.5) + 1):
            if p % f == 0:
                prime.remove(p)
                break
    # Removing composites ABOVE the square root
    for f in prime[:squareroot][:]:
        for p in prime[squareroot:]:
            if p % f == 0:
                prime.remove(p)
    return [2, 3] + prime

一旦删除平方根以下的素数,就不能再使用 squareroot 作为 primes 的索引,因为 primes 的长度将发生变化。

我根据 T. Silver 的回复修复了它 -- 我只是这样做了,以便在确保低于极限的平方根的所有东西都是素数之后,它再次找到平方根。这是固定代码:

# Just make an empty list where the primes go
prime = []
# This next for loop just adds all the numbers of the form 6n+/-1 to the list, as all primes are of this form
for i in range(1,limit / 6 + 1):
    prime.append(6*i - 1)
    prime.append(6*i + 1)
# If the limit is divisible by 6, the last number on the list is sometimes over the limit
if limit % 6 == 0 and prime[-1] > limit:
    prime.remove(prime[-1])
# This next line just finds the place of the 'square root' of the limit, which is as high as it has to check for factors
squareroot = min(range(len(prime)), key=lambda i: abs(prime[i]-(limit**0.5))) + 1
# Removing composites BELOW the square root
for p in prime[:squareroot][:]:
    for f in range(2, int(p ** 0.5) + 1):
        if p % f == 0:
            prime.remove(p)
            break
# Here's where i put the fix!
squareroot = min(range(len(prime)), key=lambda i: abs(prime[i]-(limit**0.5))) + 1
# Removing composites ABOVE the square root
for f in prime[:squareroot][:]:
    for p in prime[squareroot:]:
        if p % f == 0:
            prime.remove(p)
return [2, 3] + prime

您获得高于平方根的复合材料的一个原因是您的循环是如何构建的。当从列表中删除一个项目时,较高索引的所有项目都会向下移动一个。因此,当在第一个循环中删除一个项目时,平方根向下移动。当第二个循环开始时,squareroot 不再是平方根的索引。

# Removing composites BELOW the square root
for p in prime[:squareroot][:]:
    for f in range(2, int(p ** 0.5) + 1):
        if p % f == 0:
            prime.remove(p)  # <- you removed an item from `prime`, so the
            break            # square root is now at prime[squareroot - 1]

# Removing composites ABOVE the square root
for f in prime[:squareroot][:]:    #    now p[squareroot] is actually a number
    for p in prime[squareroot:]:   # <- above the real square root, so this 
        if p % f == 0:             #    loop starts too high
            prime.remove(p)

修复它的一种方法是在第一个循环中删除一个值时调整 squareroot 的值。另一种方法是在第二个循环之前重新计算 squareroot

在遍历列表时从列表中添加或删除项目通常不是一个好主意。例如,可以在一次传递中标记项目(例如将它们设置为零或 None),然后可以在第二次传递中复制未标记的项目。

编辑 添加了示例代码以标记复合材料:

# Removing composites BELOW the square root
for i,p in enumerate(prime[:squareroot]):
    for f in range(2, int(p ** 0.5) + 1):
        if p % f == 0:
            prime[i] = 0  # <- mark composites
            break         

# Removing composites ABOVE the square root
for f in prime[:squareroot]:    
    if f == 0: 
        continue                              # skip composites
    for i,p in enumerate(prime[squareroot:]): # <- this loop is okay now
        if p % f == 0:            
            prime[i] = 0                      # mark composite

# at this point, prime contains 0's where the compsites were found
# and non-zeros for the primes.  Just need to collect all the
# non-zero elements.
result = []         
for p in prime:     
    if p:                    
        result.append(p)

您的代码还有其他问题,但这应该可以回答您的直接问题。当你对 python 越来越熟练时,你会看到你可以做的进一步改进(一个素数筛可以写成大约 6 行 python)。