Compiler optimization breaks lazy iterator

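I wrote a custom container with a custom iterator. Because of the container's particular characteristics, the iterator has to be evaluated lazily. For the purposes of this question, the relevant part of the code is the iterator's dereference operator, which is implemented like this: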

template<typename T>
struct Container
{
  vector<T> m_Inner;

  // This should calculate the appropriate value.
  // In this example it is taken from a vector, but in
  // the real use case it is calculated on request.
  T Value(int N)
  { return m_Inner.at(N); }
};

template<typename T>
struct Lazy_Iterator
{
  mutable pair<int, T> m_Current;
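  int Index;
  Container<T>* C;

  // m_Current is declared first, so it is initialized from N rather than
  // from the not-yet-initialized Index member.
  Lazy_Iterator(Container<T>& Cont, int N):
    m_Current{N, T{}}, Index{N}, C{&Cont}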
  {      }

  pair<int, T>&
  operator*() const // __attribute__((noinline)) (this cures the symptom)
  {
      m_Current.first = Index; /// Optimized out
      m_Current.second = C->Value(Index); /// Optimized out
      return m_Current;
  }

};

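Because the iterator itself is a template, the compiler is free to inline its functions.

When I compile the code without optimization, the returned value is updated as expected. When I compile with release optimizations (-O2 in GCC 4.9), in some cases the optimizer removes the lines I marked as Optimized out, even though the m_Current member is marked as mutable. As a result, the returned value does not match the value the iterator should point to.

Is this expected behavior? Do you know of a portable way to specify that the contents of this function should be evaluated even though it is marked const?

I hope the question is detailed enough to be useful. Please let me know if more details would help.

Edit:

In response to a comment, here is a potential usage taken from a small test program: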

Container<double> myC;
Lazy_Iterator<double> it{myC, 0};
cout << "Creation: " << it->first << " , " << it->second << endl;

auto it2 = it;
cout << "Copy: "<<  it2->first << " , " << it2->second << endl;

cout << "Post-increment: " << (it++)->first << " , " << it->second << endl;
cout << "Pre-increment: " << (++it)->first << " , " << it->second << endl;
cout << "Post-decrement: " << (it--)->first << " , " << it->second << endl;
cout << "Pre-decrement: " << (--it)->first << " , " << it->second << endl;
cout << "Iterator addition: " << (it+2)->first << " , " << (it+2)->second << endl;
cout << "Iterator subtraction: "<< (it-2)->first << " , " << (it-2)->second << endl;

reverse_iterator<Lazy_Iterator<double>> rit{it};
cout << "Reverse Iterator: " << rit->first << " , " << rit->second << endl;

auto rit2 = rit;
cout << "Reverse Iterator copy: " << rit2->first << " , " << rit2->second << endl;

cout << "Rev Post-increment: " << (rit++)->first << " , " << rit->second << endl;
cout << "Rev Pre-increment: " << (++rit)->first << " , " << rit->second << endl;
cout << "Rev Post-decrement: " << (rit--)->first << " , " << rit->second << endl;
cout << "Rev Pre-decrement: " << (--rit)->first << " , " << rit->second << endl;
cout << "Rev Iterator addition: " << (rit+2)->first << " , " << (rit+2)->second << endl;
cout << "Rev Iterator subtraction: "<< (rit-2)->first << " , " << (rit-2)->second << endl;

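All tests except the last two lines produce the expected results.

The last two lines crash when optimization is enabled.

The system actually works well and is no more dangerous than any other iterator. Of course it fails if the container is deleted from under it, and it is probably safer to use the returned value by copy rather than just keeping the reference, but that is beside the point.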

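You should post a compilable snippet that reproduces the problem (I was actually unable to reproduce it with GCC 4.9). I think you have undefined behavior that is triggered by -O2 (-O2 enables optimizations that can break code relying on undefined behavior). You should have a pointer to Container<T> inside the iterator.

In any case, be aware that a lazy iterator breaks the contract of std iterators. I think a better alternative is to make a regular container of lazy values; that way you can skip creating a custom container and iterator altogether ;) (look at the proxy pattern).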
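A minimal sketch of that "regular container of lazy values" idea, assuming the value can be produced by a callable. The names LazyValue and make_lazy_squares are hypothetical and only for illustration; they are not from the question.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper: a value that is computed on first access and then cached.
template <typename T>
class LazyValue {
public:
    explicit LazyValue(std::function<T()> compute)
        : compute_(std::move(compute)) {}

    // Runs the computation on the first call only; later calls return the cache.
    const T& get() const {
        if (!ready_) {
            value_ = compute_();
            ready_ = true;
        }
        return value_;
    }

private:
    std::function<T()> compute_;
    mutable T value_{};          // requires T to be default-constructible
    mutable bool ready_ = false;
};

// Builds an ordinary std::vector whose elements are evaluated lazily.
// `calls` counts how many elements were actually computed.
inline std::vector<LazyValue<double>> make_lazy_squares(int n, int& calls) {
    std::vector<LazyValue<double>> v;
    v.reserve(n);
    for (int i = 0; i < n; ++i) {
        v.emplace_back([i, &calls] {
            ++calls;
            return static_cast<double>(i) * i;
        });
    }
    return v;
}
```

Because the container is a plain std::vector, its iterators and reverse iterators return references to elements that really live in the container. Only the computation inside each element is deferred.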

"Optimized out even though the m_Current member is marked as mutable"

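This tells me you assume that the optimizer cares about mutable. It doesn't. const and mutable have already been stripped away by an earlier compilation phase.

Why, then, does the optimizer remove the two statements once they are inlined? I suspect that after inlining, the optimizer can prove that the two writes are no-ops, either because the m_Current variable must already hold the correct value, or because a subsequent use of m_Current makes them moot. Trivially, the following case makes those writes no-ops: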

Lazy_Iterator LI = foo(); // Theoretically writes
*LI = bar(); // Overwrites the previous value.

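There is a difference between the physical iterator held by reverse_iterator (the one returned by .base()) and the logical value it points to: they are off by one. reverse_iterator might do return *(--internal_iterator); on a temporary copy when dereferenced, which leaves you with a dangling reference into the internals of a destroyed function-local temporary.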
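The off-by-one relationship can be checked with a plain std::vector, where references stay valid because they point into the container itself. This is only a sketch; the helper name points_one_before_base is made up for illustration.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Returns true if dereferencing `rit` yields the element just before rit.base().
inline bool points_one_before_base(std::vector<int>::reverse_iterator rit) {
    std::vector<int>::iterator tmp = rit.base();  // the physical iterator
    --tmp;                                        // step back by one
    return &*rit == &*tmp;                        // same element in the vector
}
```

For v.rbegin(), base() equals v.end(), while *v.rbegin() is the last element. With a lazy iterator that returns a reference to its own cached member, the same decrement-then-dereference on a temporary copy is what produces the dangling reference.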

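After reading the standard again, I found that it has additional requirements to avoid this scenario; see the note in the quote below.

I also found that the GCC 4.9 standard library is non-conforming. It uses a temporary. So I think this is a GCC bug.

Edit: standard quote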

24.5.1.3.4 operator*     [reverse.iter.op.star]

reference operator*() const;

1 Effects:

deref_tmp = current;  
--deref_tmp; 
return *deref_tmp;

2 [ Note: This operation must use an auxiliary member variable rather than a temporary variable to avoid returning a reference that persists beyond the lifetime of its associated iterator. (See 24.2.) —end note ]

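Further reading: Library Defect Report 198.

It seems that this has since been reverted to the old behaviour.

Later edit: P0031 was voted into the C++17 working draft. It states that reverse_iterator uses a temporary, not a member, to hold the intermediate value.

After a very fruitful round of discussion, Revolver_Ocelot's answer led me to look more closely at the implementation of reverse_iterator. According to his quote from the standard: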

24.5.1.3.4 operator* [reverse.iter.op.star]

reference operator*() const;

1 Effects:

deref_tmp = current;  
--deref_tmp;  
return *deref_tmp;

2 [ Note: This operation must use an auxiliary member variable rather than a temporary variable to avoid returning a reference that persists beyond the lifetime of its associated iterator. (See 24.2.) —end note ]

Looking inside the header stl_iterator.h of the standard library as implemented by GCC 4.9 in Debian 8:

  /**
   *  @return  A reference to the value at @c --current
   *
   *  This requires that @c --current is dereferenceable.
   *
   *  @warning This implementation requires that for an iterator of the
   *           underlying iterator type, @c x, a reference obtained by
   *           @c *x remains valid after @c x has been modified or
   *           destroyed. This is a bug: http://gcc.gnu.org/PR51823
  */
  reference
  operator*() const
  {
    _Iterator __tmp = current;
    return *--__tmp;
  }

Note the warning:

Warning: This implementation requires that for an iterator of the underlying iterator type, @c x, a reference obtained by @c *x remains valid after @c x has been modified or destroyed. This is a bug: http://gcc.gnu.org/PR51823
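One way to stay clear of that requirement is to return the computed pair by value instead of a reference to a cached member, in line with the question's remark that using the returned value by copy may be safer. Below is a rough sketch under that assumption. SquareSource, ValueIterator and collect_reversed are hypothetical names, and SquareSource merely stands in for the real container. Note that before C++20 the forward iterator requirements ask for a true reference type, so this is a proxy-style iterator (like that of vector<bool>) rather than a fully conforming one.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's container: computes values on request.
struct SquareSource {
    double Value(int n) const { return static_cast<double>(n) * n; }
};

// Lazy iterator whose operator* returns by value, so nothing can dangle
// when a temporary copy of the iterator is destroyed.
class ValueIterator {
public:
    using iterator_category = std::bidirectional_iterator_tag;
    using value_type = std::pair<int, double>;
    using difference_type = std::ptrdiff_t;
    using pointer = const value_type*;
    using reference = value_type;  // by value, not a true reference

    ValueIterator() = default;
    ValueIterator(const SquareSource& src, int index)
        : src_(&src), index_(index) {}

    // Computes the pair on every dereference.
    reference operator*() const { return {index_, src_->Value(index_)}; }

    ValueIterator& operator++() { ++index_; return *this; }
    ValueIterator operator++(int) { ValueIterator old = *this; ++index_; return old; }
    ValueIterator& operator--() { --index_; return *this; }
    ValueIterator operator--(int) { ValueIterator old = *this; --index_; return old; }

    friend bool operator==(const ValueIterator& a, const ValueIterator& b) {
        return a.src_ == b.src_ && a.index_ == b.index_;
    }
    friend bool operator!=(const ValueIterator& a, const ValueIterator& b) {
        return !(a == b);
    }

private:
    const SquareSource* src_ = nullptr;
    int index_ = 0;
};

// Walks [0, n) backwards through std::reverse_iterator and collects the values.
inline std::vector<std::pair<int, double>> collect_reversed(const SquareSource& src, int n) {
    std::reverse_iterator<ValueIterator> first{ValueIterator{src, n}};
    std::reverse_iterator<ValueIterator> last{ValueIterator{src, 0}};
    std::vector<std::pair<int, double>> out;
    for (; first != last; ++first) {
        out.push_back(*first);  // copy of the freshly computed pair
    }
    return out;
}
```

The trade-off is that operator-> cannot be offered without an extra proxy object, so the sketch only goes through operator*.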