Eratosthenes 筛法的实施和比较
Sieve of Eratosthenes implementations and comparisons
除了时间复杂度为O(N log log N)的埃拉托色尼筛法的简单实现之外,我还尝试实现了一个时间复杂度为O(N)的修改。虽然,两者都产生了预期的结果,但与下一个相比,前一个花费的时间要少得多,我无法弄清楚为什么。我真的很感激一些关于这方面的建议。
实施 1:
def build_sieve_eratosthenes(num):
## Creates sieve of size (num+1) to correct for 0-indexing.
primes_sieve = [1] * (num+1)
primes_sieve[0] = primes_sieve[1] = 0
for p in range(2, num):
if primes_sieve[p] == 1:
for mul in range(2*p, num+1, p):
primes_sieve[mul] = 0
return primes_sieve
实施 2:
def build_sieve_eratosthenes_linear(num):
## Creates sieve of size (num+1) to correct for 0-indexing.
primes_sieve = [1] * (num+1)
primes_sieve[0] = primes_sieve[1] = 0
## Builds a list of size (num+1) recording the smallest prime factor of each number.
SPF = [1] * (num+1)
## Builds a list of all primes seen so far with pos indicator of position where to insert the next prime.
## Uses a large fixed memory allocation scheme to avoid the usage of append list operation.
primes = [0] * num
pos = 0
for p in range(2, num+1):
if primes_sieve[p] == 1:
primes[pos] = p
pos = pos + 1
## Smallest prime factor of a prime is a prime itself.
SPF[p] = p
for i in range(0, pos):
if p * primes[i] <= num and primes[i] <= SPF[p]:
primes_sieve[p*primes[i]] = 0
SPF[p * primes[i]] = primes[i]
else:
break
return primes_sieve
test_num = 2000000
测试方法
def find_sum_of_primes_upto_num_with_sieve(sieve, num):
primes_sieve = sieve(num)
primes_sum = 0
for n in range(len(primes_sieve)):
if primes_sieve[n] == 1:
primes_sum = primes_sum + n
return primes_sum
结果:
start2 = time.time()
sum_2 = find_sum_of_primes_upto_num_with_sieve(build_sieve_eratosthenes, test_num)
end2 = time.time()
print("Sum of primes obtained: ", sum_2)
print("Time taken by checking primality of each number is %f sec" % (end2 - start2))
获得的素数和:142913828922
检查每个数字的素数所花费的时间是 0.647822 秒
start3 = time.time()
sum_3 = find_sum_of_primes_upto_num_with_sieve(build_sieve_eratosthenes_linear, test_num)
end3 = time.time()
print("Sum of primes obtained: ", sum_3)
print("Time taken by checking primality of each number is %f sec" % (end3 - start3))
获得的素数和:142913828922
检查每个数字的素数所花费的时间是 1.561308 秒
我在每个例程中放置了一个简单的迭代计数器,运行 从 10^3 到 10^7 的 10 的幂
build_sieve_eratosthenes:
1000 has 1958 iterations in sieve
10000 has 23071 iterations in sieve
100000 has 256808 iterations in sieve
1000000 has 2775210 iterations in sieve
10000000 has 29465738 iterations in sieve
build_sieve_eratosthenes_线性:
1000 has 831 iterations in sieve_linear
10000 has 8770 iterations in sieve_linear
100000 has 90407 iterations in sieve_linear
1000000 has 921501 iterations in sieve_linear
10000000 has 9335420 iterations in sieve_linear
你的 linear
函数是 不是 线性的:注意内部循环,它运行 pos
次......而 pos
是一个计算找到的素数数量,不是常数。
linear
比 "normal" 函数增长得更慢,并且总体迭代次数明显减少。但是,每次迭代的成本都更高,这就是您看到 "inverted" 次的原因。在您的 linear
函数中找到的每个数字和每个 "cross-out" 都更昂贵;较慢的增长并没有赶上你的 2*10^6 的限制,而不是我的 10*7 的限制。如果这对您来说值得的话,您可以将其推断出大约一天,以便更好地了解适当的时间……但是中央 "problem" 是每个数字的处理速度较慢。
对于细节好奇,这里是完整的输出:
1000 has 1958 iterations in sieve
Sum of primes obtained: 76127
Time taken by checking primality of each number is 0.000904 sec
10000 has 23071 iterations in sieve
Sum of primes obtained: 5736396
Time taken by checking primality of each number is 0.008270 sec
100000 has 256808 iterations in sieve
Sum of primes obtained: 454396537
Time taken by checking primality of each number is 0.067962 sec
1000000 has 2775210 iterations in sieve
Sum of primes obtained: 37550402023
Time taken by checking primality of each number is 0.428727 sec
10000000 has 29465738 iterations in sieve
Sum of primes obtained: 3203324994356
Time taken by checking primality of each number is 5.761439 sec
1000 has 831 iterations in sieve_linear
Sum of primes obtained: 76127
Time taken by checking primality of each number is 0.001069 sec
10000 has 8770 iterations in sieve_linear
Sum of primes obtained: 5736396
Time taken by checking primality of each number is 0.010398 sec
100000 has 90407 iterations in sieve_linear
Sum of primes obtained: 454396537
Time taken by checking primality of each number is 0.107276 sec
1000000 has 921501 iterations in sieve_linear
Sum of primes obtained: 37550402023
Time taken by checking primality of each number is 1.087080 sec
10000000 has 9335420 iterations in sieve_linear
Sum of primes obtained: 3203324994356
Time taken by checking primality of each number is 11.008726 sec
除了时间复杂度为O(N log log N)的埃拉托色尼筛法的简单实现之外,我还尝试实现了一个时间复杂度为O(N)的修改。虽然,两者都产生了预期的结果,但与下一个相比,前一个花费的时间要少得多,我无法弄清楚为什么。我真的很感激一些关于这方面的建议。
实施 1:
def build_sieve_eratosthenes(num):
## Creates sieve of size (num+1) to correct for 0-indexing.
primes_sieve = [1] * (num+1)
primes_sieve[0] = primes_sieve[1] = 0
for p in range(2, num):
if primes_sieve[p] == 1:
for mul in range(2*p, num+1, p):
primes_sieve[mul] = 0
return primes_sieve
实施 2:
def build_sieve_eratosthenes_linear(num):
## Creates sieve of size (num+1) to correct for 0-indexing.
primes_sieve = [1] * (num+1)
primes_sieve[0] = primes_sieve[1] = 0
## Builds a list of size (num+1) recording the smallest prime factor of each number.
SPF = [1] * (num+1)
## Builds a list of all primes seen so far with pos indicator of position where to insert the next prime.
## Uses a large fixed memory allocation scheme to avoid the usage of append list operation.
primes = [0] * num
pos = 0
for p in range(2, num+1):
if primes_sieve[p] == 1:
primes[pos] = p
pos = pos + 1
## Smallest prime factor of a prime is a prime itself.
SPF[p] = p
for i in range(0, pos):
if p * primes[i] <= num and primes[i] <= SPF[p]:
primes_sieve[p*primes[i]] = 0
SPF[p * primes[i]] = primes[i]
else:
break
return primes_sieve
test_num = 2000000
测试方法
def find_sum_of_primes_upto_num_with_sieve(sieve, num):
primes_sieve = sieve(num)
primes_sum = 0
for n in range(len(primes_sieve)):
if primes_sieve[n] == 1:
primes_sum = primes_sum + n
return primes_sum
结果:
start2 = time.time()
sum_2 = find_sum_of_primes_upto_num_with_sieve(build_sieve_eratosthenes, test_num)
end2 = time.time()
print("Sum of primes obtained: ", sum_2)
print("Time taken by checking primality of each number is %f sec" % (end2 - start2))
获得的素数和:142913828922
检查每个数字的素数所花费的时间是 0.647822 秒
start3 = time.time()
sum_3 = find_sum_of_primes_upto_num_with_sieve(build_sieve_eratosthenes_linear, test_num)
end3 = time.time()
print("Sum of primes obtained: ", sum_3)
print("Time taken by checking primality of each number is %f sec" % (end3 - start3))
获得的素数和:142913828922
检查每个数字的素数所花费的时间是 1.561308 秒
我在每个例程中放置了一个简单的迭代计数器,运行 从 10^3 到 10^7 的 10 的幂
build_sieve_eratosthenes:
1000 has 1958 iterations in sieve
10000 has 23071 iterations in sieve
100000 has 256808 iterations in sieve
1000000 has 2775210 iterations in sieve
10000000 has 29465738 iterations in sieve
build_sieve_eratosthenes_线性:
1000 has 831 iterations in sieve_linear
10000 has 8770 iterations in sieve_linear
100000 has 90407 iterations in sieve_linear
1000000 has 921501 iterations in sieve_linear
10000000 has 9335420 iterations in sieve_linear
你的 linear
函数是 不是 线性的:注意内部循环,它运行 pos
次......而 pos
是一个计算找到的素数数量,不是常数。
linear
比 "normal" 函数增长得更慢,并且总体迭代次数明显减少。但是,每次迭代的成本都更高,这就是您看到 "inverted" 次的原因。在您的 linear
函数中找到的每个数字和每个 "cross-out" 都更昂贵;较慢的增长并没有赶上你的 2*10^6 的限制,而不是我的 10*7 的限制。如果这对您来说值得的话,您可以将其推断出大约一天,以便更好地了解适当的时间……但是中央 "problem" 是每个数字的处理速度较慢。
对于细节好奇,这里是完整的输出:
1000 has 1958 iterations in sieve
Sum of primes obtained: 76127
Time taken by checking primality of each number is 0.000904 sec
10000 has 23071 iterations in sieve
Sum of primes obtained: 5736396
Time taken by checking primality of each number is 0.008270 sec
100000 has 256808 iterations in sieve
Sum of primes obtained: 454396537
Time taken by checking primality of each number is 0.067962 sec
1000000 has 2775210 iterations in sieve
Sum of primes obtained: 37550402023
Time taken by checking primality of each number is 0.428727 sec
10000000 has 29465738 iterations in sieve
Sum of primes obtained: 3203324994356
Time taken by checking primality of each number is 5.761439 sec
1000 has 831 iterations in sieve_linear
Sum of primes obtained: 76127
Time taken by checking primality of each number is 0.001069 sec
10000 has 8770 iterations in sieve_linear
Sum of primes obtained: 5736396
Time taken by checking primality of each number is 0.010398 sec
100000 has 90407 iterations in sieve_linear
Sum of primes obtained: 454396537
Time taken by checking primality of each number is 0.107276 sec
1000000 has 921501 iterations in sieve_linear
Sum of primes obtained: 37550402023
Time taken by checking primality of each number is 1.087080 sec
10000000 has 9335420 iterations in sieve_linear
Sum of primes obtained: 3203324994356
Time taken by checking primality of each number is 11.008726 sec