SIMD code to check if two double values are sufficiently different
Suppose I have two double values, old and new.
I want to implement a vectorized function that
returns old if abs(x-y) < p, and new otherwise.
Here is the code (test.cpp):
#include <emmintrin.h>
#include <stdlib.h>   // posix_memalign, free
#include <iostream>
#define ARRAY_LENGTH 2
int main(void) {
    // x = old value, y = new value, res = result
    double *x, *y, *res;
    posix_memalign((void **)&x, 16, sizeof(double) * ARRAY_LENGTH);
    posix_memalign((void **)&y, 16, sizeof(double) * ARRAY_LENGTH);
    posix_memalign((void **)&res, 16, sizeof(double) * ARRAY_LENGTH);
    double p = 1e-4; // precision
    __m128d sp = _mm_set1_pd(p);
    x[0] = 1.5; y[0] = 1.50011;   // |x - y| >= p, so the new value y is kept
    x[1] = 2.;  y[1] = 2.0000001; // |x - y| <  p, so the old value x is kept
    __m128d sx = _mm_load_pd(x);
    __m128d sy = _mm_load_pd(y);
    // sign mask to compute fabs(): -0. has only the sign bit set
    __m128d sign_mask = _mm_set1_pd(-0.);
    // |x-y|
    __m128d absval = _mm_andnot_pd(sign_mask, _mm_sub_pd(sx, sy));
    // mask of |x-y| < p
    __m128d mask = _mm_cmplt_pd(absval, sp);
    // sres = |x-y| < p ? x : y;
    __m128d sres = _mm_or_pd(
        _mm_and_pd(mask, sx), _mm_andnot_pd(mask, sy));
    _mm_store_pd(res, sres);
    std::cerr << "res=" << res[0] << "," << res[1] << std::endl;
    free(x); free(y); free(res);
    return 0;
}
Build:
g++ -std=c++11 -msse4 test.cpp
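We first compute fabs(x-y), compare it with p, and then use
the resulting mask to select between x and y.
Does anyone see a more efficient way to code this? Thanks.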
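There is a way to make this algorithm slightly faster, at the cost of some accuracy. Instead of blending x and y with three bitwise operations, add the masked difference back to y. This saves one instruction, but y + (x - y) is not guaranteed to be bit-identical to x when x - y has to be rounded: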
// diff = x - y;
__m128d diff = _mm_sub_pd(sx, sy);
// mask of |x - y| < p
__m128d mask = _mm_cmplt_pd(_mm_andnot_pd(sign_mask, diff), sp);
// sres = y + ((|x - y| < p) ? (x - y) : 0);
__m128d sres = _mm_add_pd(sy, _mm_and_pd(mask, diff));
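To make the trade-off concrete, here is a minimal sketch that wraps both variants in functions so they can be compared on the same input. The names `keep_old_select` and `keep_old_add` are hypothetical, and the sketch assumes an even element count and uses unaligned loads so that any array can be passed in.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cassert>
#include <cstddef>

// Hypothetical helper: exact select, out[i] = |x[i]-y[i]| < p ? x[i] : y[i].
void keep_old_select(const double* x, const double* y, double* out,
                     std::size_t n, double p) {
    assert(n % 2 == 0);  // assumption: two doubles per step, no tail handling
    const __m128d sp = _mm_set1_pd(p);
    const __m128d sign_mask = _mm_set1_pd(-0.0);
    for (std::size_t i = 0; i < n; i += 2) {
        __m128d sx = _mm_loadu_pd(x + i);
        __m128d sy = _mm_loadu_pd(y + i);
        __m128d absval = _mm_andnot_pd(sign_mask, _mm_sub_pd(sx, sy));
        __m128d mask = _mm_cmplt_pd(absval, sp);
        // (mask & x) | (~mask & y): copies bits, so no rounding is involved
        __m128d sres = _mm_or_pd(_mm_and_pd(mask, sx), _mm_andnot_pd(mask, sy));
        _mm_storeu_pd(out + i, sres);
    }
}

// Hypothetical helper: arithmetic variant, out[i] = y[i] + (mask ? x[i]-y[i] : 0).
void keep_old_add(const double* x, const double* y, double* out,
                  std::size_t n, double p) {
    assert(n % 2 == 0);  // same assumption as above
    const __m128d sp = _mm_set1_pd(p);
    const __m128d sign_mask = _mm_set1_pd(-0.0);
    for (std::size_t i = 0; i < n; i += 2) {
        __m128d sx = _mm_loadu_pd(x + i);
        __m128d sy = _mm_loadu_pd(y + i);
        __m128d diff = _mm_sub_pd(sx, sy);
        __m128d mask = _mm_cmplt_pd(_mm_andnot_pd(sign_mask, diff), sp);
        // One instruction fewer, but y + (x - y) goes through two roundings
        _mm_storeu_pd(out + i, _mm_add_pd(sy, _mm_and_pd(mask, diff)));
    }
}
```

For the sample values from the question both functions return the same result: when x and y are within a factor of two of each other, x - y is computed exactly, so adding it back to y reproduces x. The two variants can disagree in the last bits only when x and y differ more in magnitude, which with a small p means values that are themselves close to zero.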
Another option is to use AVX and/or single precision, so that each instruction processes more elements.
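As a rough illustration of the AVX route, the sketch below processes four doubles per iteration and uses `_mm256_blendv_pd` for the select. The function names are hypothetical, and the GCC/Clang `target` attribute plus `__builtin_cpu_supports` are one possible way to compile and dispatch without `-mavx`; with other compilers the file is assumed to be built for an AVX target.

```cpp
#include <immintrin.h>  // AVX intrinsics
#include <cassert>
#include <cmath>
#include <cstddef>

// Scalar reference, also used for the tail and as a fallback without AVX.
static void keep_old_scalar(const double* x, const double* y, double* out,
                            std::size_t n, double p) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = std::fabs(x[i] - y[i]) < p ? x[i] : y[i];
}

#if defined(__GNUC__)
__attribute__((target("avx")))  // lets GCC/Clang emit AVX here without -mavx
#endif
static void keep_old_avx(const double* x, const double* y, double* out,
                         std::size_t n, double p) {
    const __m256d sp = _mm256_set1_pd(p);
    const __m256d sign_mask = _mm256_set1_pd(-0.0);
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m256d sx = _mm256_loadu_pd(x + i);
        __m256d sy = _mm256_loadu_pd(y + i);
        // |x - y|: clear the sign bit of the difference
        __m256d absval = _mm256_andnot_pd(sign_mask, _mm256_sub_pd(sx, sy));
        // all-ones lanes where |x - y| < p (ordered, non-signaling compare)
        __m256d mask = _mm256_cmp_pd(absval, sp, _CMP_LT_OQ);
        // take x where the mask is set, y elsewhere
        _mm256_storeu_pd(out + i, _mm256_blendv_pd(sy, sx, mask));
    }
    keep_old_scalar(x + i, y + i, out + i, n - i, p);  // 0 to 3 leftover elements
}

// Hypothetical entry point: picks the AVX path when the CPU supports it.
void keep_old(const double* x, const double* y, double* out,
              std::size_t n, double p) {
    assert(x != nullptr && y != nullptr && out != nullptr);
#if defined(__GNUC__)
    if (__builtin_cpu_supports("avx")) {
        keep_old_avx(x, y, out, n, p);
        return;
    }
    keep_old_scalar(x, y, out, n, p);
#else
    keep_old_avx(x, y, out, n, p);  // assumption: built for an AVX target
#endif
}
```

The blend copies bits rather than doing arithmetic, so this version keeps the exact results of the original SSE2 code. A single-precision variant would follow the same pattern with `__m256` and the `_ps` intrinsics, handling eight floats per iteration, provided float precision is acceptable for the data.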