Compiler optimization breaks lazy iterator
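I have written a custom container with a custom iterator. Because of specific features of the container, the iterator must be lazily evaluated. For the sake of the question, the relevant part of the code is the dereference operator of the iterator, which is implemented this way: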
#include <utility>
#include <vector>
using namespace std; // the snippets in this post use unqualified std names

template<typename T>
struct Container
{
    vector<T> m_Inner;
    // This should calculate the appropriate value.
    // In this example it is taken from a vector, but in
    // the real use case it is calculated on request.
    T Value(int N) const
    { return m_Inner.at(N); }
};

template<typename T>
struct Lazy_Iterator
{
    mutable pair<int, T> m_Current;
    int Index;
    const Container<T>* C;

    // m_Current is declared first, so it is initialized from N, not from Index.
    Lazy_Iterator(const Container<T>& Cont, int N):
        m_Current{N, T{}}, Index{N}, C{&Cont}
    { }

    pair<int, T>&
    operator*() const // __attribute__((noinline)) (this cures the symptom)
    {
        m_Current.first = Index; /// Optimized out
        m_Current.second = C->Value(Index); /// Optimized out
        return m_Current;
    }
};
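Since the iterator itself is a template, its functions can be freely inlined by the compiler.
When I compile the code without optimizations, the returned value is updated as expected. When I use release compiler optimizations (-O2 in GCC 4.9), in some cases the compiler optimizes out the lines I marked as Optimized out, even though the m_Current member is marked as mutable. As a consequence, the returned value does not match the value the iterator should be pointing at.
Is this expected behaviour? Do you know of any portable way to specify that the contents of that function should be evaluated even though it is marked const?
I hope the question is exhaustive enough to be useful. Please advise if more details would help in this case.
Edit:
To answer a comment, this is a potential usage extracted from a small test program: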
Container<double> myC;
Lazy_Iterator<double> it{myC, 0};
cout << "Creation: " << it->first << " , " << it->second << endl;
auto it2 = it;
cout << "Copy: "<< it2->first << " , " << it2->second << endl;
cout << "Pre-increment: " << (it++)->first << " , " << it->second << endl;
cout << "Post-increment: " << (++it)->first << " , " << it->second << endl;
cout << "Pre-decrement: " << (it--)->first << " , " << it->second << endl;
cout << "Post-decrement: " << (--it)->first << " , " << it->second << endl;
cout << "Iterator addition: " << (it+2)->first << " , " << (it+2)->second << endl;
cout << "Iterator subtraction: "<< (it-2)->first << " , " << (it-2)->second << endl;
reverse_iterator<Lazy_Iterator<double>> rit{it};
cout << "Reverse Iterator: " << rit->first << " , " << rit->second << endl;
auto rit2 = rit;
cout << "Reverse Iterator copy: " << rit2->first << " , " << rit2->second << endl;
cout << "Rev Pre-increment: " << (rit++)->first << " , " << rit->second << endl;
cout << "Rev Post-increment: " << (++rit)->first << " , " << rit->second << endl;
cout << "Rev Pre-decrement: " << (rit--)->first << " , " << rit->second << endl;
cout << "Rev Post-decrement: " << (--rit)->first << " , " << rit->second << endl;
cout << "Rev Iterator addition: " << (rit+2)->first << " , " << (rit+2)->second << endl;
cout << "Rev Iterator subtraction: "<< (rit-2)->first << " , " << (rit-2)->second << endl;
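The test results are as expected for all tests except the last two lines.
The last two lines of the test crash when optimizations are turned on.
The system actually works well and is no more dangerous than any other iterator. Of course it will fail if the container gets deleted under its nose, and it is probably safer to use the returned value by copy rather than just keeping the reference around, but this is off-topic.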
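You should post a compilable snippet that reproduces the issue (I was actually unable to reproduce it with GCC 4.9). I think you have undefined behaviour and that it is triggered by -O2 (-O2 enables optimizations that can break code with undefined behaviour). You should have a pointer to Container<T> inside the iterator.
Anyway, be aware that a lazy iterator breaks the contract of std iterators. I think a better alternative is to make a regular container of lazy values; that way you can skip creating a custom container and iterator altogether ;) (look at the proxy pattern).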
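The "regular container of lazy values" idea could look roughly like the sketch below. It is only an illustration under assumptions: LazyValue, make_lazy_squares and lazy_values_demo are hypothetical names that do not appear in the question, the squaring lambda stands in for whatever the real container computes on request, and T is assumed to be default-constructible.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical wrapper: computes its value on first access and caches it.
template<typename T>
class LazyValue
{
    std::function<T()> m_Compute; // deferred computation
    mutable T m_Value{};          // cache, filled on the first get()
    mutable bool m_Ready = false;
public:
    explicit LazyValue(std::function<T()> f) : m_Compute(std::move(f)) { }

    // The returned reference points into the LazyValue itself, which lives
    // in the container, not in any iterator.
    const T& get() const
    {
        if (!m_Ready) {
            m_Value = m_Compute();
            m_Ready = true;
        }
        return m_Value;
    }
};

// Hypothetical factory: element i lazily evaluates to i * i.
inline std::vector<LazyValue<double>> make_lazy_squares(std::size_t n)
{
    std::vector<LazyValue<double>> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        v.emplace_back([i] { return static_cast<double>(i) * static_cast<double>(i); });
    return v;
}

// Usage: the ordinary vector iterators, including reverse ones, just work.
inline void lazy_values_demo()
{
    auto v = make_lazy_squares(4);
    assert(v[2].get() == 4.0);
    assert(v.rbegin()->get() == 9.0);
}
```

Because each value is stored in the container element, a reference obtained through any iterator stays valid after that iterator is modified or destroyed.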
"Optimized out even though the m_Current
member is marked as mutable"
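This tells me that you are assuming the optimizer cares about mutable. It doesn't. const and mutable have been stripped by an earlier compilation phase.
Why would the optimizer then remove the two statements once they are inlined? I suspect that after inlining, the optimizer can prove that the two writes are a no-op, either because the m_Current variable must already hold the right value, or because the subsequent use of m_Current makes them moot. Trivially, the following case makes those writes a no-op:
Lazy_Iterator LI = foo(); // Theoretically writes
*LI = bar(); // Overwrites the previous value.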
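There is a difference between the physical iterator held by reverse_iterator (the one returned by .base()) and the logical value it points to: they are off by one. On dereference, reverse_iterator might do return *(--internal_iterator); which leaves you with a dangling reference to the internals of a destroyed function-local temporary.
After reading the standard again, I found that it has additional requirements to avoid this scenario; read the note.
I also found that the GCC 4.9 standard library is non-conforming: it uses a temporary. So I think this is a GCC bug.
Edit: standard quote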
24.5.1.3.4 operator* [reverse.iter.op.star]
reference operator*() const;
1 Effects:
deref_tmp = current;
--deref_tmp;
return *deref_tmp;
2 [ Note: This operation must use an auxiliary member variable rather than a temporary variable to avoid returning a reference that persists beyond the lifetime of its associated iterator. (See 24.2.) —end note ]
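The Effects wording quoted above can be sketched as a small adaptor. This is an illustration only, not the code of any real standard library: StashingIter is a simplified stand-in for the question's Lazy_Iterator, and MemberReverse and member_reverse_demo are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Simplified stand-in for the question's iterator: operator* returns a
// reference to a member of the iterator itself.
struct StashingIter
{
    const std::vector<double>* data;
    int index;
    mutable std::pair<int, double> current;

    StashingIter(const std::vector<double>& d, int i)
        : data(&d), index(i), current(i, 0.0) { }

    StashingIter& operator--() { --index; return *this; }

    std::pair<int, double>& operator*() const
    {
        current.first = index;
        current.second = data->at(static_cast<std::size_t>(index));
        return current;
    }
};

// Hypothetical adaptor following the quoted Effects: the decremented copy is
// kept in a member, so the reference returned by operator* stays valid for
// as long as the adaptor itself is alive.
template<typename Iter>
class MemberReverse
{
    Iter current;
    mutable Iter deref_tmp; // auxiliary member instead of a local temporary
public:
    explicit MemberReverse(Iter it) : current(it), deref_tmp(it) { }

    Iter base() const { return current; }

    auto operator*() const -> decltype(*deref_tmp)
    {
        deref_tmp = current;
        --deref_tmp;
        return *deref_tmp;
    }
};

inline void member_reverse_demo()
{
    std::vector<double> v{1.5, 2.5, 3.5};
    MemberReverse<StashingIter> r{StashingIter{v, 3}};
    std::pair<int, double>& ref = *r; // refers into r.deref_tmp, not a local
    assert(ref.first == 2);
    assert(ref.second == 3.5);
}
```

With a local temporary in place of deref_tmp, the same reference would point into an object that is destroyed when operator* returns, which is the dangling reference described above.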
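Follow-up reading: Library Defect Report 198.
And it seems that it has been returned to the old behaviour.
Late edit: P0031 was voted into the C++17 working draft. It states that reverse_iterator uses a temporary, not a member, to hold the intermediate value.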
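After a very fruitful round of discussion, Revolver_Ocelot's answer pointed me to look further into the implementation of reverse_iterator. According to his quote from the standard: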
24.5.1.3.4 operator* [reverse.iter.op.star]
reference operator*() const;
1 Effects:
deref_tmp = current;
--deref_tmp;
return *deref_tmp;
2 [ Note: This operation must use an auxiliary member variable rather than a temporary variable to avoid
returning a reference that persists beyond the lifetime of its
associated iterator. (See 24.2.) —end note ]
Looking inside the header stl_iterator.h of the standard library as implemented by GCC 4.9 in Debian 8:
/**
* @return A reference to the value at @c --current
*
* This requires that @c --current is dereferenceable.
*
* @warning This implementation requires that for an iterator of the
* underlying iterator type, @c x, a reference obtained by
* @c *x remains valid after @c x has been modified or
* destroyed. This is a bug: http://gcc.gnu.org/PR51823
*/
reference
operator*() const
{
_Iterator __tmp = current;
return *--__tmp;
}
Note the warning:
Warning:
This implementation requires that for an iterator of the
underlying iterator type, @c x, a reference obtained by
@c *x remains valid after @c x has been modified or
destroyed. This is a bug: http://gcc.gnu.org/PR51823
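One way to stay clear of that requirement, sketched here under assumptions rather than taken from the thread, is to have operator* return the pair by value, so that no reference into the iterator exists. ValueIter and last_before are hypothetical names, and the vector stands in for the real on-request computation. Note that before C++20 an iterator whose reference type is not a true reference only formally qualifies as an input iterator, so using it with std::reverse_iterator works with the implementation shown above but is not strictly guaranteed, and rit->first is not available on the adaptor.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical variant of the question's iterator: operator* computes the
// pair on request and returns it by value.
struct ValueIter
{
    using iterator_category = std::bidirectional_iterator_tag;
    using value_type        = std::pair<int, double>;
    using difference_type   = std::ptrdiff_t;
    using pointer           = const value_type*;
    using reference         = value_type; // by value, not a real reference

    const std::vector<double>* data;
    int index;

    ValueIter() : data(nullptr), index(0) { }
    ValueIter(const std::vector<double>& d, int i) : data(&d), index(i) { }

    reference operator*() const
    {
        return value_type(index, data->at(static_cast<std::size_t>(index)));
    }

    ValueIter& operator++() { ++index; return *this; }
    ValueIter operator++(int) { ValueIter t = *this; ++index; return t; }
    ValueIter& operator--() { --index; return *this; }
    ValueIter operator--(int) { ValueIter t = *this; --index; return t; }

    friend bool operator==(const ValueIter& a, const ValueIter& b)
    { return a.data == b.data && a.index == b.index; }
    friend bool operator!=(const ValueIter& a, const ValueIter& b)
    { return !(a == b); }
};

// Dereferences a std::reverse_iterator built from position pos, i.e. reads
// the element at pos - 1.
inline std::pair<int, double> last_before(const std::vector<double>& v, int pos)
{
    std::reverse_iterator<ValueIter> rit{ValueIter{v, pos}};
    return *rit; // the pair is copied out before the adaptor's temporary dies
}
```

For a vector holding 1.5, 2.5, 3.5, last_before(v, 3) yields the pair (2, 3.5) regardless of whether the adaptor uses a member or a temporary internally.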