info C5012: loop not parallelized due to reason '1008'
I am trying out the Auto-Vectorizer mode of Visual Studio 2013 on x86_64, and I am a bit surprised by the following. Consider this naive code:
static void rescale( double * __restrict out, const int * __restrict in, long n, const double intercept, const double slope )
{
    for( long i = 0; i < n; ++i )
        out[i] = slope * in[i] + intercept;
}
Visual Studio reports that it fails on such a naive example:
--- Analyzing function: rescale
c:\users\malat\autovec\vec.c(13) : info C5012: loop not parallelized due to reason '1008'
where the compile line is (I am only interested in SSE2 for now):
cl vec.c /O2 /Qpar /Qpar-report:2
Looking at the documentation:
leads to:
which reads:
The compiler detected that this loop does not perform enough work to
warrant auto-parallelization.
Is there a way to rewrite this loop so that the Auto-Vectorizer mode is triggered properly?
My simple attempt at rewriting the code failed:
static void rescale( double * __restrict out, const double * __restrict in, long n, const double intercept, const double slope )
{
    for( long i = 0; i < n; ++i )
        out[i] = slope * in[i] + intercept;
}
In the above case, Visual Studio still reports:
--- Analyzing function: rescale
c:\users\malat\autovec\vec.c(13) : info C5012: loop not parallelized due to reason '1008'
How should I rewrite my initial code to please the Auto-Vectorizer mode of Visual Studio 2013? I would like to compute a * b + c on vectors of 64-bit doubles: SSE2
The second MSDN link you provided contains an example of how to force the compiler to vectorize the loop.
// You can resolve this by specifying the hint_parallel
// pragma. CAUTION -- if the loop does not perform
// enough work, parallelizing might cause a potentially
// large performance penalty.
// #pragma loop(hint_parallel(0)) // hint_parallel will force this through
for (int i=0; i<1000; ++i)
{
    A[i] = A[i] + 1;
}
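Applied to the loop from the question, the hint might look like the sketch below. This is an illustration rather than a tested fix: the `_MSC_VER` guard is my addition so that other compilers skip the MSVC-specific pragma, and as far as I know `hint_parallel` only has an effect when compiling with `/Qpar` and remains a hint, not a guarantee.

```c
#include <stddef.h>

/* Same loop as in the question, with the MSDN hint uncommented.
   hint_parallel(0) asks for the maximum number of threads at run time.
   The _MSC_VER guard is an assumption of this sketch, not part of the
   MSDN sample; it keeps the pragma away from non-MSVC compilers. */
static void rescale(double * __restrict out, const int * __restrict in, long n,
                    const double intercept, const double slope)
{
#ifdef _MSC_VER
#pragma loop(hint_parallel(0))
#endif
    for (long i = 0; i < n; ++i)
        out[i] = slope * in[i] + intercept;
}
```

The CAUTION in the MSDN comment still applies: for a loop this cheap, spreading it over threads may well be slower than running it serially.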
The example code near the bottom of the MSDN link you posted suggests using the hint_parallel pragma:
void code_1008()
{
    // Code 1008 is emitted when the compiler detects that
    // this loop does not perform enough work to warrant
    // auto-parallelization.
    // You can resolve this by specifying the hint_parallel
    // pragma. CAUTION -- if the loop does not perform
    // enough work, parallelizing might cause a potentially
    // large performance penalty.
    // #pragma loop(hint_parallel(0)) // hint_parallel will force this through
    for (int i=0; i<1000; ++i)
    {
        A[i] = A[i] + 1;
    }
}
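One thing worth keeping in mind: C5012 and `/Qpar-report` belong to the auto-parallelizer (multiple threads), whereas the auto-vectorizer reports through `/Qvec-report:2`, so reason 1008 says nothing about whether SSE2 code was generated. If the goal is simply a * b + c on pairs of doubles, a hand-written SSE2 version is one way to see what the vectorized loop would look like. The sketch below is a different technique (explicit intrinsics, not auto-vectorization); the name `rescale_sse2`, the `RESCALE_HAVE_SSE2` macro and the scalar fallback are made up for the illustration.

```c
#include <stddef.h>

/* Use the intrinsics only where SSE2 is known to be available;
   RESCALE_HAVE_SSE2 is a hypothetical macro local to this sketch. */
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define RESCALE_HAVE_SSE2 1
#endif

/* Hypothetical hand-vectorized variant of the double-input rescale(). */
static void rescale_sse2(double * __restrict out, const double * __restrict in, long n,
                         const double intercept, const double slope)
{
    long i = 0;
#ifdef RESCALE_HAVE_SSE2
    const __m128d vslope = _mm_set1_pd(slope);         /* {slope, slope} */
    const __m128d vintercept = _mm_set1_pd(intercept); /* {intercept, intercept} */
    /* Process two doubles per iteration; unaligned loads and stores,
       since nothing is known about the alignment of in and out. */
    for (; i + 2 <= n; i += 2) {
        __m128d v = _mm_loadu_pd(in + i);
        v = _mm_add_pd(_mm_mul_pd(vslope, v), vintercept);
        _mm_storeu_pd(out + i, v);
    }
#endif
    /* Scalar tail (or the whole loop when SSE2 is not available). */
    for (; i < n; ++i)
        out[i] = slope * in[i] + intercept;
}
```

For the `int` input of the original function, the two integers would first have to be converted to doubles (for example with `_mm_cvtepi32_pd`), which is left out here to keep the sketch short.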