Strict aliasing, -ffast-math and SSE
Consider the following program:
#include <iostream>
#include <cmath>
#include <cstring>
#include <xmmintrin.h>

using namespace std;

int main()
{
    // 4 float32s.
    __m128 nans;
    // Set them all to 0xffffffff which should be NaN.
    memset(&nans, 0xff, 4*4);
    // cmpord should return a mask of 0xffffffff for any non-NaNs, and 0x00000000 for NaNs.
    __m128 mask = _mm_cmpord_ps(nans, nans);
    // AND the mask with nans to zero any of the nans. The result should be 0x00000000 for every component.
    __m128 z = _mm_and_ps(mask, nans);
    cout << z[0] << " " << z[1] << " " << z[2] << " " << z[3] << endl;
    return 0;
}
If I compile with Apple Clang 7.0.2, with or without -ffast-math, I get the expected output 0 0 0 0:
$ clang --version
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin14.5.0
Thread model: posix
$ clang test.cpp -o test
$ ./test
0 0 0 0
$ clang test.cpp -ffast-math -o test
$ ./test
0 0 0 0
But after updating to 8.1.0 (sorry, I don't know which actual version of Clang this corresponds to; Apple no longer publishes that information), -ffast-math seems to break this:
$ clang --version
Apple LLVM version 8.1.0 (clang-802.0.42)
Target: x86_64-apple-darwin16.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
$ clang test.cpp -o test
$ ./test
0 0 0 0
$ clang test.cpp -ffast-math -o test
$ ./test
nan nan nan nan
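I suspect this is because of strict aliasing rules or something similar. Can anyone explain this behaviour?
Edit: I forgot to mention that if you do nans = { std::nanf(nullptr), ... it works fine.
Also, looking at godbolt, the behaviour seems to have changed between Clang 3.8.1 and Clang 3.9: the latter removes the cmpordps instruction. GCC 7.1 seems to leave it in.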
This is not a strict aliasing problem. If you read the documentation of -ffast-math, you will see your issue:
Enable fast-math mode. This defines the __FAST_MATH__ preprocessor macro, and lets the compiler make aggressive, potentially-lossy assumptions about floating-point math. These include:
- [...]
- operands to floating-point operations are not equal to NaN and Inf, and
- [...]
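-ffast-math allows the compiler to assume that floating-point values are never NaN (because it sets the -ffinite-math-only option). Once NaN is assumed impossible, an ordered comparison of a value with itself is always true, so the optimizer is free to fold the cmpordps away and leave the input untouched. Since clang tries to match gcc's options, we can read a bit of GCC's option documentation to better understand what -ffinite-math-only does: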
Allow optimizations for floating-point arithmetic that assume that arguments and results are not NaNs or +-Infs.
This option should never be turned on by any -O option since it can result in incorrect output for programs which depend on an exact implementation of IEEE or ISO rules/specifications.
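One way to make this assumption visible in the source is to test the macros the compiler defines. The following is only a sketch, assuming GCC or Clang: __FAST_MATH__ is the macro named in the quote above, while __FINITE_MATH_ONLY__ (set to 1 under -ffinite-math-only on both compilers, as far as I know) and the helper name finite_math_assumed are my additions and not part of either quoted document.

```cpp
// Hypothetical helper: true when this translation unit was compiled with
// flags that let the optimizer assume NaN and Inf never occur.
constexpr bool finite_math_assumed() {
#if defined(__FAST_MATH__) || (defined(__FINITE_MATH_ONLY__) && __FINITE_MATH_ONLY__ != 0)
    return true;
#else
    return false;
#endif
}

// In a file whose logic depends on NaN, the build can be made to fail early:
// static_assert(!finite_math_assumed(),
//               "this file relies on NaN; build it without -ffast-math");
```

With the static_assert enabled, the program from the question would refuse to compile under -ffast-math rather than silently printing nan nan nan nan.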
So if your code needs to use NaN, you cannot use -ffast-math or -ffinite-math-only. Otherwise you run the risk of the optimizer breaking your code, as you are seeing here.
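If the fast-math flags have to stay on for other reasons, one possible workaround is to detect NaN from the bit pattern using integer instructions, which the floating-point assumptions should not touch. This is a minimal sketch, assuming SSE2 is available; zero_nan_lanes and lane_bits are hypothetical names introduced here, and I have not verified the result against every compiler version mentioned in the question.

```cpp
#include <emmintrin.h>  // SSE2 integer intrinsics (also provides the SSE ones)
#include <cstdint>
#include <cstring>

// Hypothetical helper: zero every lane of v whose bits encode a NaN.
// A float32 is NaN when its exponent is all ones and its mantissa is non-zero,
// i.e. when (bits & 0x7fffffff) > 0x7f800000.
inline __m128 zero_nan_lanes(__m128 v) {
    const __m128i bits = _mm_castps_si128(v);
    const __m128i magnitude = _mm_and_si128(bits, _mm_set1_epi32(0x7fffffff));
    // The sign bit is cleared, so a signed integer compare is safe here.
    const __m128i is_nan = _mm_cmpgt_epi32(magnitude, _mm_set1_epi32(0x7f800000));
    // andnot computes (~is_nan) & bits: NaN lanes become 0, others pass through.
    return _mm_castsi128_ps(_mm_andnot_si128(is_nan, bits));
}

// Hypothetical helper: copy the four raw 32-bit lanes out for inspection.
inline void lane_bits(__m128 v, std::uint32_t out[4]) {
    std::memcpy(out, &v, 4 * sizeof(std::uint32_t));
}
```

In the program from the question, __m128 z = zero_nan_lanes(nans); would take the place of the _mm_cmpord_ps / _mm_and_ps pair. Infinities are left alone, because their magnitude equals 0x7f800000 rather than exceeding it.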