如何优化我的代码,计算小于 200 万的所有总和
How to Optimise my code that computes the sum of all from less than 2 million
我从 Project Euler 开始尝试这个问题,我需要计算所有素数的总和,直到两百万。
这是我想出的解决方案 -
#include <stdio.h>
int main() {
long sum = 5; // Already counting 2 and 3 in my sum.
int i = 5; // Checking from 5
int count = 0;
while (i <= 2000000) {
count = 0;
for (int j = 3; j <= i / 2; j += 2) {
// Checking if i (starting from 5) is divisible from 3
if (i % j == 0) { // to i/2 and only checking for odd values of j
count = 1;
}
}
if (count == 0) {
sum += i;
}
i += 2;
}
printf("%ld ", sum);
}
运行 大约需要 480 秒,我想知道是否有更好的解决方案或提示来改进我的程序。
________________________________________________________
Executed in 480.95 secs fish external
usr time 478.54 secs 0.23 millis 478.54 secs
sys time 1.28 secs 6.78 millis 1.28 secs
通过两个小的修改,您的代码变得更快:
#include <stdio.h>
#include <math.h>
int main() {
long long sum = 5; // we need long long, long might not be enough
// depending on your platform
int i = 5;
int count = 0;
while (i <= 2000000) {
count = 0;
int limit = sqrt(i); // determine upper limit once and for all
for (int j = 3; j <= limit; j += 2) { // use upper limit sqrt(i) instead if i/2
if (i % j == 0) {
count = 1;
break; // break out from loop as soon
// as number is not prime
}
}
if (count == 0) {
sum += i;
}
i += 2;
}
printf("%lld ", sum); // we need %lld for long long
}
所有的解释都在评论里
但肯定有更好甚至更快的方法来做到这一点。
我在我 10 岁的 MacPro 上 运行 这个 2000 万 第一次素数花了大约 30 秒。
这个程序近乎即时(即使在调试...)200万的总和,2000万只需要一秒钟(Windows10, 10 years-old i7 @ 3.4 GHz,MSVC 2019)。
注意:没有时间设置我的 C 编译器,这就是 malloc
.
上有强制转换的原因
“大”优化是存储平方值AND素数,所以绝对没有测试不可能的除数。由于在给定的整数区间内不超过 1/10 的素数(启发式,健壮的代码应该测试它并在需要时重新分配 primes
数组),时间大大缩短。
#include <stdio.h>
#include <malloc.h>
#define LIMIT 2000000ul // Computation limit.
typedef struct {
unsigned long int p ; // Store a prime number.
unsigned long int sq ; // and its square.
} prime ;
int main() {
prime* primes = (prime*)malloc((LIMIT/10)*sizeof(*primes)) ; // Store found primes. Can quite safely use 1/10th of the whole computation limit.
unsigned long int primes_count=1 ;
unsigned long int i = 3 ;
unsigned long long int sum = 0 ;
unsigned long int j = 0 ;
int is_prime = 1 ;
// Feed the first prime, 2.
primes[0].p = 2 ;
primes[0].sq = 4 ;
sum = 2 ;
// Parse all numbers up to LIMIT, ignoring even numbers.
// Also reset the "is_prime" flag at each loop.
for (i = 3 ; i <= LIMIT ; i+=2, is_prime = 1 ) {
// Parse all previously found primes.
for (j = 0; j < primes_count; j++) {
// Above sqrt(i)? Break, i is a prime.
if (i<primes[j].sq)
break ;
// Found a divisor? Not a prime (and break).
if ((i % primes[j].p == 0)) {
is_prime = 0 ;
break ;
}
}
// Add the prime and its square to the array "primes".
if (is_prime) {
primes[primes_count].p = i ;
primes[primes_count++].sq = i*i ;
// Compute the sum on-the-fly
sum += i ;
}
}
printf("Sum of all %lu primes: %llu\n", primes_count, sum);
free(primes) ;
}
您的程序可以通过提前停止内部循环来轻松改进:
- 当
i
超过 sqrt(j)
时。
- 找到除数时。
另请注意,类型 long
可能不足以满足所有架构的总和。 long long
推荐。
这是修改后的版本:
#include <stdio.h>
int main() {
long long sum = 5; // Already counting 2 and 3 in my sum.
long i = 5; // Checking from 5
while (i <= 2000000) {
int count = 0;
for (int j = 3; j * j <= i; j += 2) {
// Checking if i (starting from 5) is divisible from 3
if (i % j == 0) { // to i/2 and only checking for odd values of j
count = 1;
break;
}
}
if (count == 0) {
sum += i;
}
i += 2;
}
printf("%lld\n", sum);
}
这个简单的改变大大减少了运行时间!它比 1000 倍 快 2000000:
chqrlie> time ./primesum
142913828922
real 0m0.288s
user 0m0.264s
sys 0m0.004s
但是请注意,试验除法的效率远低于埃拉托色尼的经典筛法。
这是一个简单的版本:
#include <stdio.h>
#include <stdlib.h>
int main() {
long max = 2000000;
long long sum = 0;
// Allocate an array of indicators initialized to 0
unsigned char *composite = calloc(1, max + 1);
// For all numbers up to sqrt(max)
for (long i = 2; i * i <= max; i++) {
// It the number is a prime
if (composite[i] == 0) {
// Set all multiples as composite. Multiples below the
// square of i are skipped because they have already been
// set as multiples of a smaller prime.
for (long j = i * i; j <= max; j += i) {
composite[j] = 1;
}
}
}
for (long i = 2; i <= max; i++) {
if (composite[i] == 0)
sum += i;
}
printf("%lld\n", sum);
free(composite);
return 0;
}
此代码是 2000000 的另一个 20 倍:
chqrlie> time ./primesum-sieve
142913828922
real 0m0.014s
user 0m0.007s
sys 0m0.002s
对于更大的边界,可以通过多种方式进一步改进筛法。
我从 Project Euler 开始尝试这个问题,我需要计算所有素数的总和,直到两百万。
这是我想出的解决方案 -
#include <stdio.h>
int main() {
long sum = 5; // Already counting 2 and 3 in my sum.
int i = 5; // Checking from 5
int count = 0;
while (i <= 2000000) {
count = 0;
for (int j = 3; j <= i / 2; j += 2) {
// Checking if i (starting from 5) is divisible from 3
if (i % j == 0) { // to i/2 and only checking for odd values of j
count = 1;
}
}
if (count == 0) {
sum += i;
}
i += 2;
}
printf("%ld ", sum);
}
运行 大约需要 480 秒,我想知道是否有更好的解决方案或提示来改进我的程序。
________________________________________________________
Executed in 480.95 secs fish external
usr time 478.54 secs 0.23 millis 478.54 secs
sys time 1.28 secs 6.78 millis 1.28 secs
通过两个小的修改,您的代码变得更快:
#include <stdio.h>
#include <math.h>
int main() {
long long sum = 5; // we need long long, long might not be enough
// depending on your platform
int i = 5;
int count = 0;
while (i <= 2000000) {
count = 0;
int limit = sqrt(i); // determine upper limit once and for all
for (int j = 3; j <= limit; j += 2) { // use upper limit sqrt(i) instead if i/2
if (i % j == 0) {
count = 1;
break; // break out from loop as soon
// as number is not prime
}
}
if (count == 0) {
sum += i;
}
i += 2;
}
printf("%lld ", sum); // we need %lld for long long
}
所有的解释都在评论里
但肯定有更好甚至更快的方法来做到这一点。
我在我 10 岁的 MacPro 上 运行 这个 2000 万 第一次素数花了大约 30 秒。
这个程序近乎即时(即使在调试...)200万的总和,2000万只需要一秒钟(Windows10, 10 years-old i7 @ 3.4 GHz,MSVC 2019)。
注意:没有时间设置我的 C 编译器,这就是 malloc
.
“大”优化是存储平方值AND素数,所以绝对没有测试不可能的除数。由于在给定的整数区间内不超过 1/10 的素数(启发式,健壮的代码应该测试它并在需要时重新分配 primes
数组),时间大大缩短。
#include <stdio.h>
#include <malloc.h>
#define LIMIT 2000000ul // Computation limit.
typedef struct {
unsigned long int p ; // Store a prime number.
unsigned long int sq ; // and its square.
} prime ;
int main() {
prime* primes = (prime*)malloc((LIMIT/10)*sizeof(*primes)) ; // Store found primes. Can quite safely use 1/10th of the whole computation limit.
unsigned long int primes_count=1 ;
unsigned long int i = 3 ;
unsigned long long int sum = 0 ;
unsigned long int j = 0 ;
int is_prime = 1 ;
// Feed the first prime, 2.
primes[0].p = 2 ;
primes[0].sq = 4 ;
sum = 2 ;
// Parse all numbers up to LIMIT, ignoring even numbers.
// Also reset the "is_prime" flag at each loop.
for (i = 3 ; i <= LIMIT ; i+=2, is_prime = 1 ) {
// Parse all previously found primes.
for (j = 0; j < primes_count; j++) {
// Above sqrt(i)? Break, i is a prime.
if (i<primes[j].sq)
break ;
// Found a divisor? Not a prime (and break).
if ((i % primes[j].p == 0)) {
is_prime = 0 ;
break ;
}
}
// Add the prime and its square to the array "primes".
if (is_prime) {
primes[primes_count].p = i ;
primes[primes_count++].sq = i*i ;
// Compute the sum on-the-fly
sum += i ;
}
}
printf("Sum of all %lu primes: %llu\n", primes_count, sum);
free(primes) ;
}
您的程序可以通过提前停止内部循环来轻松改进:
- 当
i
超过sqrt(j)
时。 - 找到除数时。
另请注意,类型 long
可能不足以满足所有架构的总和。 long long
推荐。
这是修改后的版本:
#include <stdio.h>
int main() {
long long sum = 5; // Already counting 2 and 3 in my sum.
long i = 5; // Checking from 5
while (i <= 2000000) {
int count = 0;
for (int j = 3; j * j <= i; j += 2) {
// Checking if i (starting from 5) is divisible from 3
if (i % j == 0) { // to i/2 and only checking for odd values of j
count = 1;
break;
}
}
if (count == 0) {
sum += i;
}
i += 2;
}
printf("%lld\n", sum);
}
这个简单的改变大大减少了运行时间!它比 1000 倍 快 2000000:
chqrlie> time ./primesum
142913828922
real 0m0.288s
user 0m0.264s
sys 0m0.004s
但是请注意,试验除法的效率远低于埃拉托色尼的经典筛法。
这是一个简单的版本:
#include <stdio.h>
#include <stdlib.h>
int main() {
long max = 2000000;
long long sum = 0;
// Allocate an array of indicators initialized to 0
unsigned char *composite = calloc(1, max + 1);
// For all numbers up to sqrt(max)
for (long i = 2; i * i <= max; i++) {
// It the number is a prime
if (composite[i] == 0) {
// Set all multiples as composite. Multiples below the
// square of i are skipped because they have already been
// set as multiples of a smaller prime.
for (long j = i * i; j <= max; j += i) {
composite[j] = 1;
}
}
}
for (long i = 2; i <= max; i++) {
if (composite[i] == 0)
sum += i;
}
printf("%lld\n", sum);
free(composite);
return 0;
}
此代码是 2000000 的另一个 20 倍:
chqrlie> time ./primesum-sieve
142913828922
real 0m0.014s
user 0m0.007s
sys 0m0.002s
对于更大的边界,可以通过多种方式进一步改进筛法。