_mm_fmadd_pd Program received signal SIGILL, Illegal instruction
I get a strange error from the following code:
#include <assert.h>
#include <stdio.h>
#include <immintrin.h>
inline static double myfma(double x, double y, double z) {
    // _mm_set_sd stores the value in the low lane and zeroes the high lane.
    __m128d xx = _mm_set_sd(x); // xx[0]=x, xx[1]=0
    __m128d yy = _mm_set_sd(y); // yy[0]=y, yy[1]=0
    __m128d zz = _mm_set_sd(z); // zz[0]=z, zz[1]=0
    // Fused multiply-add on both lanes; only the low lane (x*y + z) is returned.
    return _mm_cvtsd_f64(_mm_fmadd_pd(xx, yy, zz));
}
void testfma() {
    double x = 1.0, y = 2.0, z = 3.0;
    double res = myfma(x, y, z);
    printf("test: res = x*y + z \n");
    printf(" x: %g\n", x);
    printf(" y: %g\n", y);
    printf(" z: %g\n", z);
    printf(" res: %g\n", res);
    assert(res == 5.0);
}
int main() {
    testfma();
    return 0;
}
I compile the code with:
g++ test.cpp -o a.out -std=c++11 -mavx2 -mfma -march=native -g
When I run the executable, I get the message:
Illegal instruction (core dumped)
Running it under gdb gives more detail:
gdb ./a.out
(gdb) r
Starting program: ....
Program received signal SIGILL, Illegal instruction.
0x000000000040067d in _mm_fmadd_pd(double __vector(2), double __vector(2), double __vector(2)) (__C=..., __B=..., __A=...)
at /usr/lib/gcc/x86_64-linux-gnu/5/include/fmaintrin.h:42
42 (__v2df)__C);
However, when I run it under valgrind, I get the following:
valgrind ./a.out
==9825== Memcheck, a memory error detector
==9825== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==9825== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==9825== Command: ./helios.x
==9825==
test: res = x*y + z
x: 1
y: 2
z: 3
res: 5
==9825==
==9825== HEAP SUMMARY:
==9825== in use at exit: 0 bytes in 0 blocks
==9825== total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==9825==
==9825== All heap blocks were freed -- no leaks are possible
==9825==
==9825== For counts of detected and suppressed errors, rerun with: -v
==9825== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
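Under valgrind the program appears to work. What am I missing here? How can I use _mm_fmadd_pd reliably? Can the example be made to work whether it runs on an Intel or an AMD processor, and whether it is compiled with g++ or icpc?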
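My guess is that your CPU does not support FMA instructions. Because you pass -mfma explicitly, the compiler is allowed to emit FMA instructions whether or not the machine you build on supports them. The program probably does not fail under valgrind because valgrind can emulate some instructions.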
You may want to consider using std::fma if you only need a scalar (SISD) operation. With gcc it compiles to an inline FMA instruction, but if you compile for a non-FMA target it falls back to a non-FMA implementation.