info C5012: loop not parallelized due to reason '1007'
I am experimenting with the auto-vectorizer in Visual Studio 2013 on x86_64, and I am somewhat surprised by the following. Consider this naive code:
static void rescale( double * __restrict out, unsigned short * __restrict in, size_t n, const double intercept, const double slope )
{
  for( size_t i = 0; i < n; ++i )
    out[i] = slope * in[i] + intercept;
}
Visual Studio reports that it fails on this naive example:
--- Analyzing function: rescale
c:\users\malat\autovec\vec.c(18) : info C5012: loop not parallelized due to reason '1007'
The compile line is (I am only interested in SSE2 for now):
cl vec.c /O2 /Qpar /Qpar-report:2
The documentation for this reason code reads:
The loop induction variable or the loop bounds are not signed 32-bit
numbers (int or long). Resolve this by changing the type of the
induction variable.
Is there a way to rewrite this loop so that the auto-vectorizer triggers properly?
My naive attempt at rewriting the code failed:
static void rescale( double * __restrict out, unsigned short * __restrict in, size_t n, const double intercept, const double slope )
{
  const long first = (long)n;
  const long secnd = n > LONG_MAX ? n - LONG_MAX : 0;
  for( long i = 0; i < first; ++i )
    out[i] = slope * in[i] + intercept;
  for( long i = 0; i < secnd; ++i )
    out[LONG_MAX+i] = slope * in[LONG_MAX+i] + intercept;
}
For the code above, Visual Studio now reports:
--- Analyzing function: rescale
c:\users\malat\autovec\vec.c(21) : info C5012: loop not parallelized due to reason '1000'
c:\users\malat\autovec\vec.c(23) : info C5012: loop not parallelized due to reason '1000'
which means:
The compiler detected a data dependency in the loop body.
In my second case, I fail to see where there would be a data dependency.
How should I rewrite my initial code to please the Visual Studio 2013 auto-vectorizer?
Neither SSE2 nor its predecessor SSE has a proper instruction for converting uint16_t values to doubles.
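To illustrate why there is no single-instruction route, here is a minimal hand-written SSE2 sketch of the conversion. It zero-extends the 16-bit values to 32-bit integers and then uses the int32-to-double conversion that SSE2 does provide. The name rescale_sse2 is hypothetical, and this is not a claim about what the Visual Studio vectorizer emits; it only shows that the widening takes several steps.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical helper: out[i] = slope * in[i] + intercept,
   processing eight input values per iteration with SSE2 only. */
static void rescale_sse2(double *out, const unsigned short *in, size_t n,
                         double intercept, double slope)
{
    const __m128i zero = _mm_setzero_si128();
    const __m128d vslope = _mm_set1_pd(slope);
    const __m128d vicept = _mm_set1_pd(intercept);
    size_t i = 0;

    assert(n == 0 || (out != NULL && in != NULL));

    for (; i + 8 <= n; i += 8) {
        /* Load eight 16-bit values (no alignment requirement). */
        __m128i v16 = _mm_loadu_si128((const __m128i *)(in + i));

        /* Step 1: zero-extend to two vectors of four 32-bit integers. */
        __m128i lo32 = _mm_unpacklo_epi16(v16, zero);
        __m128i hi32 = _mm_unpackhi_epi16(v16, zero);

        /* Step 2: _mm_cvtepi32_pd converts only the low two int32 lanes,
           so the upper pair is shifted down by 8 bytes first. */
        __m128d d0 = _mm_cvtepi32_pd(lo32);
        __m128d d1 = _mm_cvtepi32_pd(_mm_srli_si128(lo32, 8));
        __m128d d2 = _mm_cvtepi32_pd(hi32);
        __m128d d3 = _mm_cvtepi32_pd(_mm_srli_si128(hi32, 8));

        /* Step 3: the actual arithmetic, two doubles at a time. */
        _mm_storeu_pd(out + i + 0, _mm_add_pd(_mm_mul_pd(vslope, d0), vicept));
        _mm_storeu_pd(out + i + 2, _mm_add_pd(_mm_mul_pd(vslope, d1), vicept));
        _mm_storeu_pd(out + i + 4, _mm_add_pd(_mm_mul_pd(vslope, d2), vicept));
        _mm_storeu_pd(out + i + 6, _mm_add_pd(_mm_mul_pd(vslope, d3), vicept));
    }

    /* Scalar tail for the remaining 0..7 elements. */
    for (; i < n; ++i)
        out[i] = slope * in[i] + intercept;
}
```

Because the inputs are unsigned 16-bit values, they are never negative once zero-extended, so the signed int32-to-double conversion gives the right result.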
Convert in to double*.
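A minimal sketch of what that could look like, assuming the caller can supply the samples as doubles. The loop also uses a signed 32-bit induction variable and bound, as the reason 1007 text asks. The names rescale_block and rescale_double are hypothetical, and I have not verified that Visual Studio 2013 accepts this loop, so check the compiler report before relying on it.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical inner loop: double input, and an int induction variable
   and bound as requested by the reason 1007 text. */
static void rescale_block(double * __restrict out, const double * __restrict in,
                          int n, double intercept, double slope)
{
    for (int i = 0; i < n; ++i)
        out[i] = slope * in[i] + intercept;
}

/* Hypothetical wrapper: keeps the size_t interface and feeds the inner
   loop blocks of at most INT_MAX elements, advancing the pointers
   instead of indexing with a value that could overflow. */
static void rescale_double(double *out, const double *in, size_t n,
                           double intercept, double slope)
{
    assert(n == 0 || (out != NULL && in != NULL));

    while (n > 0) {
        const int block = n > (size_t)INT_MAX ? INT_MAX : (int)n;
        rescale_block(out, in, block, intercept, slope);
        out += block;
        in += block;
        n -= (size_t)block;
    }
}
```

Note that C5012 is a message from the auto-parallelizer, which is what /Qpar and /Qpar-report control. The vectorizer reports separately through /Qvec-report:2, so that is the report to look at when checking whether a loop was vectorized.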