Why does GCC optimization not work with valarrays?

Question

Here is a simple C++ program that uses valarrays:

#include <iostream>
#include <valarray>

int main() {
    using ratios_t = std::valarray<float>;

    ratios_t a{0.5, 1, 2};
    const auto& res ( ratios_t::value_type(256) / a );
    for(const auto& r : ratios_t{res})
        std::cout << r << " " << std::endl;
    return 0;  
}

If I compile and run it like this:

g++ -O0 main.cpp && ./a.out

the output is as expected:

512 256 128

However, if I compile and run it like this:

g++ -O3 main.cpp && ./a.out

the output is:

0 0 0

The same happens if I use the -O1 optimization flag.

The GCC version is (the latest in Arch Linux):

$ g++ --version
g++ (GCC) 6.1.1 20160707

However, if I try with clang, both

clang++ -std=gnu++14 -O0 main.cpp && ./a.out

and

clang++ -std=gnu++14 -O3 main.cpp && ./a.out

produce the same correct result:

512 256 128

The Clang version is:

$ clang++ --version
clang version 3.8.0 (tags/RELEASE_380/final)

I have also tried GCC 4.9.2 on Debian, where the executable produces the correct result.

Is this a bug in GCC, or am I doing something wrong? Can anyone reproduce this?

EDIT: I have also reproduced the issue with the Homebrew build of GCC 6 on Mac OS.

Answer 1

This is the result of a careless implementation of operator/ (const T& val, const std::valarray<T>& rhs) (and most probably of other operators over valarrays) using lazy evaluation:

#include <iostream>
#include <valarray>

int main() {
    using ratios_t = std::valarray<float>;

    ratios_t a{0.5, 1, 2};
    float x = 256;
    const auto& res ( x / a );
    // x = 512;  //  <-- uncommenting this line affects the output
    for(const auto& r : ratios_t{res})
        std::cout << r << " ";
    return 0;
}

With the "x = 512" line commented out, the output is

512 256 128

Uncomment that line and the output changes to

1024 512 256

Since in your example the left-hand argument of the division operator is a temporary, the result is undefined.

UPDATE

As Jonathan Wakely correctly pointed out, the lazy-evaluation-based implementation becomes a problem in this example because of the use of auto.

Answer 2

valarray and auto do not mix well.

This creates a temporary object, then applies operator/ to it:

const auto& res ( ratios_t::value_type(256) / a );

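The libstdc++ valarray uses expression templates, so operator/ returns a lightweight object that refers to the original arguments and evaluates them lazily. Your use of const auto& binds the expression template to a reference, but it does not extend the lifetime of the temporary that the expression template refers to. By the time the evaluation happens, that temporary has gone out of scope and its memory has been reused.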

It will work fine if you do this instead:

ratios_t res = ratios_t::value_type(256) / a;

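Update: As of today, GCC trunk gives the expected result for this example. I have reworked our valarray expression templates to be less error-prone, so it is now harder (but not impossible) to create dangling references. The new implementation should be included in GCC 9 next year.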


c++

gcc

g++

clang++