Why is the Euclidean distance function slower in C than in Java?
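I implemented and benchmarked the following function in C and in Java using the code below. For C I get about 1.688852 seconds, while Java takes only 0.355038 seconds. Even if I remove the sqrt function, manually inline the code, or change the function signature to accept 6 double coordinates (to avoid access through pointers), the C elapsed time does not improve much.
I am compiling the C program like cc -O2 main.c -lm. For Java, I run the application from IntelliJ IDEA with the default JVM options (Java 8, OpenJDK).
C: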
#include <math.h>
#include <stdio.h>
#include <time.h>

typedef struct point3d
{
    double x;
    double y;
    double z;
} point3d_t;

double distance(point3d_t *from, point3d_t *to);

int main(int argc, char const *argv[])
{
    point3d_t from = {.x = 2.3, .y = 3.45, .z = 4.56};
    point3d_t to = {.x = 5.678, .y = 3.45, .z = -9.0781};
    double time = 0.0;
    int count = 10000000;
    for (size_t i = 0; i < count; i++)
    {
        /* clock() is called twice per iteration, around a single call */
        clock_t tic = clock();
        double d = distance(&from, &to); /* result is never read */
        clock_t toc = clock();
        time += ((double) (toc - tic) / CLOCKS_PER_SEC);
    }
    printf("Elapsed: %f seconds\n", time);
    return 0;
}

double distance(point3d_t *from, point3d_t *to)
{
    double dx = to->x - from->x;
    double dy = to->y - from->y;
    double dz = to->z - from->z;
    /* (dz + dz) is the typo acknowledged in the later edit; it should be
       (dz * dz), and it makes d2 negative for these inputs */
    double d2 = (dx * dx) + (dy * dy) + (dz + dz);
    return sqrt(d2);
}
Java:
import java.util.Random;
import java.util.concurrent.TimeUnit;
// StopWatch and Vector3D are not JDK classes. Assumption: they come from
// Apache Commons (org.apache.commons.lang3.time.StopWatch and
// org.apache.commons.geometry.euclidean.threed.Vector3D).

public class App
{
    static Random rnd = new Random();

    public static void main( String[] args )
    {
        var sw = new StopWatch();
        var time = 0.0;
        var count = 10000000;
        for (int i = 0; i < count; i++) {
            var from = Vector3D.of(rnd.nextDouble(), rnd.nextDouble(), rnd.nextDouble());
            var to = Vector3D.of(rnd.nextDouble(), rnd.nextDouble(), rnd.nextDouble());
            sw.start();
            var dist = distance(from, to); // result is never read
            sw.stop();
            time += sw.getTime(TimeUnit.NANOSECONDS);
            sw.reset();
        }
        System.out.printf("Time: %f seconds\n", time / 1e09);
    }

    public static double distance(Vector3D from, Vector3D to) {
        var dx = to.getX() - from.getX();
        var dy = to.getY() - from.getY();
        var dz = to.getZ() - from.getZ();
        return Math.sqrt((dx * dx) + (dy * dy) + (dz * dz));
    }
}
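My objective is to understand why the C program is slower than the Java program, and to make it run faster.
EDIT: I am using random values in the Java program to try to make sure the JVM does not do anything clever, such as caching the result and skipping the computation altogether.
EDIT: Updated both snippets so that the C version uses clock_gettime(), the whole loop is timed rather than each individual method call, and the result of the method call is no longer discarded:
C: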
#include <math.h>
#include <stdio.h>
#include <time.h>

typedef struct point3d
{
    double x;
    double y;
    double z;
} point3d_t;

double distance(point3d_t *from, point3d_t *to);

int main(int argc, char const *argv[])
{
    point3d_t from = {.x = 2.3, .y = 3.45, .z = 4.56};
    point3d_t to = {.x = 5.678, .y = 3.45, .z = -9.0781};
    struct timespec fs;
    struct timespec ts;
    long time = 0;
    int count = 10000000;
    double dist = 0;
    clock_gettime(CLOCK_REALTIME, &fs);
    for (size_t i = 0; i < count; i++)
    {
        dist = distance(&from, &to);
    }
    clock_gettime(CLOCK_REALTIME, &ts);
    /* include tv_sec: tv_nsec alone wraps around at every second boundary */
    time = (ts.tv_sec - fs.tv_sec) * 1000000000L + (ts.tv_nsec - fs.tv_nsec);
    /* read dist so that the result is not simply discarded */
    if (dist == 0.001)
    {
        printf("hello\n");
    }
    printf("Elapsed: %f sec\n", (double) time / 1e9);
    return 0;
}

double distance(point3d_t *from, point3d_t *to)
{
    double dx = to->x - from->x;
    double dy = to->y - from->y;
    double dz = to->z - from->z;
    /* still the (dz + dz) typo at this point; it is fixed in the final code */
    double d2 = (dx * dx) + (dy * dy) + (dz + dz);
    return sqrt(d2);
}
Java:
import java.util.Random;
// Vector3D is not a JDK class. Assumption: it is
// org.apache.commons.geometry.euclidean.threed.Vector3D from Apache Commons Geometry.

public class App
{
    static Random rnd = new Random();

    public static void main( String[] args )
    {
        var from = Vector3D.of(rnd.nextDouble(), rnd.nextDouble(), rnd.nextDouble());
        var to = Vector3D.of(rnd.nextDouble(), rnd.nextDouble(), rnd.nextDouble());
        var time = 0.0;
        var count = 10000000;
        double dist = 0.0;
        var start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            dist = distance(from, to);
        }
        var end = System.nanoTime();
        time = end - start;
        // read dist so that the result is not simply discarded
        if (dist == rnd.nextDouble()) {
            System.out.printf("hello! %f\n", dist);
        }
        dist = dist + 1;
        System.out.printf("Time: %f sec\n", (double) time / 1e9);
        System.out.printf("Yohoo! %f\n", dist);
    }

    public static double distance(Vector3D from, Vector3D to) {
        var dx = to.getX() - from.getX();
        var dy = to.getY() - from.getY();
        var dz = to.getZ() - from.getZ();
        return Math.sqrt((dx * dx) + (dy * dy) + (dz * dz));
    }
}
The C code was compiled with gcc -Wall -std=gnu99 -O2 main.c -lm. The results are now 0.06323 seconds for the C code and 0.006325 seconds for Java.
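EDIT: As Jérôme Richard and Peter Cordes pointed out, my benchmark was flawed, not to mention that I was taking the square root of a negative number in the C version (the (dz + dz) typo). Once I compiled the C program with -fno-math-errno, it clocked in at 0 seconds. I compiled the C program like gcc -O2 -std=gnu99 main.c -lm. The C program now effectively takes zero seconds (271 ns), while Java takes 0.007217 seconds. Everything is in order :)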
The final code is as follows:
C:
#include <math.h>
#include <stdio.h>
#include <time.h>

typedef struct point3d
{
    double x;
    double y;
    double z;
} point3d_t;

double distance(point3d_t *from, point3d_t *to);

int main(int argc, char const *argv[])
{
    point3d_t from = {.x = 2.3, .y = 3.45, .z = 4.56};
    point3d_t to = {.x = 5.678, .y = 3.45, .z = -9.0781};
    struct timespec fs;
    struct timespec ts;
    long time = 0;
    int count = 10000000;
    double dist = 0;
    clock_gettime(CLOCK_REALTIME, &fs);
    for (size_t i = 0; i < count; i++)
    {
        dist = distance(&from, &to);
    }
    clock_gettime(CLOCK_REALTIME, &ts);
    /* include tv_sec: tv_nsec alone wraps around at every second boundary */
    time = (ts.tv_sec - fs.tv_sec) * 1000000000L + (ts.tv_nsec - fs.tv_nsec);
    printf("hello %f \n", dist);
    printf("Elapsed: %f ns\n", (double) time);
    printf("Elapsed: %f sec\n", (double) time / 1e9);
    return 0;
}

double distance(point3d_t *from, point3d_t *to)
{
    double dx = (to->x) - (from->x);
    double dy = (to->y) - (from->y);
    double dz = (to->z) - (from->z);
    double d2 = (dx * dx) + (dy * dy) + (dz * dz);
    return sqrt(d2);
}
Java:
import java.util.Random;
// Vector3D is not a JDK class. Assumption: it is
// org.apache.commons.geometry.euclidean.threed.Vector3D from Apache Commons Geometry.

public class App
{
    static Random rnd = new Random();

    public static void main( String[] args )
    {
        var from = Vector3D.of(2.3, 3.45, 4.56);
        var to = Vector3D.of(5.678, 3.45, -9.0781);
        var time = 0.0;
        var count = 10000000;
        double dist = 0.0;
        var start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            dist = distance(from, to);
        }
        var end = System.nanoTime();
        time = end - start;
        System.out.printf("Yohoo! %f\n", dist);
        // the value printed is in seconds (nanoseconds divided by 1e9)
        System.out.printf("Time: %f sec\n", (double) time / 1e9);
    }

    public static double distance(Vector3D from, Vector3D to) {
        var dx = to.getX() - from.getX();
        var dy = to.getY() - from.getY();
        var dz = to.getZ() - from.getZ();
        var d2 = (dx * dx) + (dy * dy) + (dz * dz);
        return Math.sqrt(d2);
    }
}
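First, the method used to measure time is very imprecise. The current approach introduces a large bias that is probably bigger than the quantity being measured. Indeed, clock is not very precise on many platforms (about 1 ms on my machine, and generally no better than 1 µs on nearly all platforms). Moreover, the imprecision is greatly amplified by the 10,000,000 iterations. If you want to measure the loop precisely, you need to move the clock calls outside the loop (and, if possible, use a more accurate measurement function).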
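As a minimal sketch of timing around the loop, assuming a POSIX system where clock_gettime and CLOCK_MONOTONIC are available: the helper names timespec_diff_ns and now_monotonic are made up for illustration. The difference uses both tv_sec and tv_nsec, because subtracting only tv_nsec gives a wrong (possibly negative) value whenever the interval crosses a second boundary.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* expose clock_gettime under strict -std=c99 */
#endif
#include <stdint.h>
#include <time.h>

/* end - start in nanoseconds, combining the seconds and nanoseconds fields. */
int64_t timespec_diff_ns(struct timespec start, struct timespec end)
{
    return (int64_t) (end.tv_sec - start.tv_sec) * INT64_C(1000000000)
         + (int64_t) (end.tv_nsec - start.tv_nsec);
}

/* Reads the monotonic clock, which is not affected by wall-clock adjustments. */
struct timespec now_monotonic(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts;
}
```

Call now_monotonic() once before the loop and once after it, then pass both values to timespec_diff_ns. clock_getres(CLOCK_MONOTONIC, ...) reports the resolution the platform claims for that clock.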
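Still, the main problem is that the function result is unused, which the Java JIT can see and partially optimize away. GCC cannot, because of the standard behavior of math functions: sqrt may set errno, a side effect that does not exist in the Java code. You can disable the use of errno with the command-line flag -fno-math-errno. With that flag, GCC can fully optimize the function (i.e. remove the function call), and the resulting time is much smaller. However, the benchmark is then flawed, and this is probably not what you want to measure. If you want to write a correct benchmark, you need to read the computed values. For example, you can compute a checksum, if only to check that the results are correct and equivalent.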