Java: 32-bit fp implementation of Math.sqrt()

Question

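The standard Math.sqrt() method already seems pretty fast in Java, but it has an inherent drawback: it always involves 64-bit operations, which does nothing but reduce speed when dealing with 32-bit float values. Is it possible to do better with a custom method that takes a float as its parameter, performs 32-bit operations only, and returns a float as its result?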

I saw:

Fast sqrt in Java at the expense of accuracy

and it did nothing more than reinforce the notion that Math.sqrt() is generally hard to beat. I also saw:

http://www.codeproject.com/Articles/69941/Best-Square-Root-Method-Algorithm-Function-Precisi

which showed me a bunch of interesting C++/ASM hacks that I am simply too ignorant to port directly to Java. Though sqrt14 might be interesting as part of a JNI call...

I also looked at Apache Commons FastMath, but it looks like that library defaults to the standard Math.sqrt(), so no help there. And then there's Yeppp!:

http://www.yeppp.info/

but I haven't bothered with that yet.
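For concreteness, here is a minimal sketch of the kind of float-only method being asked about. It is not taken from any of the links above: the class name ApproxSqrt and the choice of a bit-shift initial guess followed by two Newton-Raphson steps are illustrative assumptions. It trades accuracy for avoiding double arithmetic (relative error on the order of 1e-6 for normal positive inputs, worse for subnormals), and nothing here shows that it is faster than Math.sqrt().

```java
final class ApproxSqrt {
    private ApproxSqrt() {}

    /** Approximate square root using only int and float arithmetic on the main path. */
    static float sqrt(float x) {
        // Zero, negative values, NaN and infinity are delegated to Math.sqrt()
        // so that the special cases keep their standard results.
        if (!(x > 0f) || Float.isInfinite(x)) {
            return (float) Math.sqrt(x);
        }
        // Initial guess: shifting the bit pattern right by one roughly halves
        // the exponent; adding 127 << 22 restores half of the exponent bias.
        int bits = (Float.floatToRawIntBits(x) >> 1) + (127 << 22);
        float y = Float.intBitsToFloat(bits);
        // Two Newton-Raphson refinement steps: y = (y + x / y) / 2.
        y = 0.5f * (y + x / y);
        y = 0.5f * (y + x / y);
        return y;
    }
}
```

Calling ApproxSqrt.sqrt(2f) gives a value close to, but not necessarily identical to, (float) Math.sqrt(2f); any serious comparison should be benchmarked against the standard method.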

Answer 1

As you seem to know JNI:

Just write a minimal wrapper for double sqrt(double) and float sqrtf(float) from math.h in C's standard library and compare performance.

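Hint: you won't feel a difference unless you do a lot of square rooting, and at that point the performance advantage of using SIMD instructions to do multiple sqrts at once will most probably dominate the effect. You will need to get a memory-aligned array of the floating-point values from Java, which might be pretty hard if you're using the Java standard libraries.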
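Before going to JNI, a plain Java loop over a float[] is a reasonable baseline to measure against. The sketch below is an illustration only (BatchSqrt and sqrtAll are hypothetical names). HotSpot can sometimes auto-vectorize simple loops of this shape, but whether it does so for this one depends on the JVM version and the CPU, so treat that as something to verify rather than a guarantee.

```java
final class BatchSqrt {
    private BatchSqrt() {}

    /**
     * Writes the square root of every element of in into out.
     * The loop body is kept trivial so the JIT has a chance to vectorize it.
     */
    static void sqrtAll(float[] in, float[] out) {
        if (out.length < in.length) {
            throw new IllegalArgumentException("output array is too small");
        }
        for (int i = 0; i < in.length; i++) {
            out[i] = (float) Math.sqrt(in[i]);
        }
    }
}
```

If a JNI wrapper is still wanted afterwards, timing it against this loop on the same arrays shows whether the native call overhead pays for itself.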

Answer 2

You need nothing to speed up sqrt for 32-bit values. The HotSpot JVM does it automatically for you.

The JIT compiler is smart enough to recognize the f2d -> Math.sqrt() -> d2f pattern and replace it with the faster sqrtss CPU instruction instead of sqrtsd. The source.
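In source code, that pattern is simply a float argument passed to Math.sqrt() with the result cast back to float. A minimal helper, with the hypothetical name FloatSqrt.sqrtF, might look like this:

```java
final class FloatSqrt {
    private FloatSqrt() {}

    /**
     * Square root of a float, written so that the bytecode is
     * f2d (widen), a call to Math.sqrt(double), then d2f (narrow).
     */
    static float sqrtF(float x) {
        return (float) Math.sqrt(x);
    }

    public static void main(String[] args) {
        System.out.println(sqrtF(2f));
    }
}
```

Whether a given JVM actually emits sqrtss for this method depends on the JVM build and the CPU. One way to check is to inspect the generated code with -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly, which requires the hsdis disassembler plugin to be installed.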

Benchmark:

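import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

// JMH benchmark: double-precision sqrt vs. the float -> double -> float pattern.
@State(Scope.Benchmark)
public class Sqrt {
    // Non-final instance fields, so the JIT cannot constant-fold the inputs.
    double d = Math.random();
    float f = (float) d;

    @Benchmark
    public double sqrtD() {
        // Plain 64-bit square root.
        return Math.sqrt(d);
    }

    @Benchmark
    public float sqrtF() {
        // f2d -> Math.sqrt() -> d2f, the pattern the JIT can turn into sqrtss.
        return (float) Math.sqrt(f);
    }
}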

Results:

Benchmark    Mode  Cnt       Score      Error   Units
Sqrt.sqrtD  thrpt    5  145501,072 ± 2211,666  ops/ms
Sqrt.sqrtF  thrpt    5  223657,110 ± 2268,735  ops/ms

java

math

performance

32-bit