Volatile specifier ignored in C++
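I'm new to C++, and I recently ran across some information about what it means for a variable to be volatile. As far as I understand, it means that a read or write to the variable can never be optimized out of existence.
However, a strange situation arises when I declare a volatile variable that isn't 1, 2, 4, or 8 bytes large: the compiler (GNU with C++11 enabled) seemingly ignores the volatile specifier.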
#define expand1 a, a, a, a, a, a, a, a, a, a
#define expand2 // ten expand1 here, expand3 to expand5 follows
// expand5 is the equivalent of 1e+005 a, a, ....
struct threeBytes { char x, y, z; };
struct fourBytes { char w, x, y, z; };
int main()
{
// requires ~1.5sec
foo<int>();
// doesn't take time
foo<threeBytes>();
// requires ~1.5sec
foo<fourBytes>();
}
template<typename T>
void foo()
{
volatile T a;
// With my setup, the loop does take time and isn't optimized out
clock_t start = clock();
for(int i = 0; i < 100000; i++);
clock_t end = clock();
int interval = end - start;
start = clock();
for(int i = 0; i < 100000; i++) expand5;
end = clock();
cout << end - start - interval << endl;
}
Their timings are
foo<int>(): ~1.5s
foo<threeBytes>(): 0
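I've tested it with different variables (user-defined or not) of 1 to 8 bytes, and only the 1-, 2-, 4-, and 8-byte ones take time to run. Is this a bug that exists only in my setup, or is volatile a request to the compiler rather than something absolute?
PS: The four-byte version always takes half the time of the others, which is also a source of confusion.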
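The struct version will probably be optimized out, because the compiler realizes that there are no side effects (no read from or write to the variable a), regardless of volatile. You basically have a no-op, a;, so the compiler can do whatever it likes; it is not forced to unroll the loop or to optimize it out, so volatile doesn't really matter here. In the case of ints there seem to be no optimizations, but this is consistent with the use case of volatile: you should expect non-optimization only when you have a possible "access to an object" (i.e. a read or a write) in the loop. However, what constitutes an "access to an object" is implementation-defined (although most of the time it follows common sense); see EDIT 3 at the bottom.
Toy example here: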
#include <iostream>
#include <chrono>
#include <cstddef>
int main()
{
volatile int a = 0;
const std::size_t N = 100000000;
// side effects, never optimized
auto start = std::chrono::steady_clock::now();
for (std::size_t i = 0 ; i < N; ++i)
++a; // side effect (write)
auto end = std::chrono::steady_clock::now();
std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count()
<< " ms" << std::endl;
// no side effects, may or may not be optimized out
start = std::chrono::steady_clock::now();
for (std::size_t i = 0 ; i < N; ++i)
a; // no side effect, this is a no-op
end = std::chrono::steady_clock::now();
std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count()
<< " ms" << std::endl;
}
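EDIT
It seems that the no-op is actually not optimized out for scalar types, as you can see in this minimal example. For structs, however, it is optimized out. In the example I linked, clang doesn't optimize the code when optimizations are off, but optimizes out both loops with -O3. gcc doesn't optimize out the loops without optimizations either, but with optimizations on it optimizes out only the first loop.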
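EDIT 2
clang emits a warning: warning: expression result unused; assign into a variable to force a volatile load [-Wunused-volatile-lvalue]. So my initial guess was correct: the compiler can optimize out no-ops, but it is not forced to. I don't understand why it does so for structs and not for scalar types, but that is the compiler's choice, and it is standard compliant. For some reason it gives this warning only when the no-op is a struct, and not when it is a scalar type.
Also note that you don't have a "read/write", you only have a no-op, so you shouldn't expect anything from volatile.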
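As a minimal sketch of what that warning suggests, assigning the volatile lvalue into a local variable turns each loop iteration into an explicit read instead of a bare a; statement. The helper name sum_volatile_reads is hypothetical and not from the original post, and the variable is assumed to be initialized by the caller:

```cpp
#include <cstddef>

// Hypothetical helper (not from the original post): reads a volatile int
// n times by assigning each read into a local, as the clang warning suggests.
int sum_volatile_reads(volatile int& v, std::size_t n)
{
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        int loaded = v;  // explicit read of the volatile object
        sum += loaded;
    }
    return sum;
}
```

Calling it as sum_volatile_reads(a, N) with volatile int a = 0; would replace the second loop of the toy example. How long it takes still depends on the compiler and flags.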
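EDIT 3
From the golden book (the C++ standard):
7.1.6.1/8 The cv-qualifiers [dcl.type.cv]
What constitutes an access to an object that has volatile-qualified
type is implementation-defined. ...
So it is up to the compiler to decide when to optimize out the loops. In most cases it follows common sense: an access happens when the object is read or written.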
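volatile doesn't do what you think it does.
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2016.html
If you rely on volatile outside of the three very specific uses Boehm mentions on the page I linked, you're going to get unexpected results.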
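For illustration, here is a sketch of one narrow use of volatile that the C and C++ standards themselves describe: a flag of type volatile std::sig_atomic_t set from a signal handler. Whether this matches the linked page's list exactly is an assumption on my part, and the names g_signal_seen, on_signal and raise_and_check are hypothetical:

```cpp
#include <csignal>

namespace {
// Hypothetical flag shared between the handler and normal code.
volatile std::sig_atomic_t g_signal_seen = 0;
}

// Signal handler: only writes the volatile sig_atomic_t flag.
extern "C" void on_signal(int)
{
    g_signal_seen = 1;
}

// Installs the handler, raises SIGINT within the process, restores the
// default disposition, and reports whether the flag was set.
bool raise_and_check()
{
    g_signal_seen = 0;
    std::signal(SIGINT, on_signal);
    std::raise(SIGINT);  // the handler runs before raise returns
    const bool seen = (g_signal_seen == 1);
    std::signal(SIGINT, SIG_DFL);
    return seen;
}
```

The flag here is written and read explicitly, which is different from the bare a; statements in the question.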
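This question is much more interesting than it first appears (for some definition of "interesting"). It looks like you've found a compiler bug (or intentional nonconformance), but it isn't quite the one you are expecting.
According to the standard, one of your foo calls has undefined behavior, and the other two are ill-formed. I'll first explain what should happen; the relevant standard quotes can be found after the break. For our purposes, we can just analyze the simple expression statement a, a, a; given the declaration volatile T a;.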
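a, a, a in this expression statement is a discarded-value expression ([stmt.expr]/p1). The type of the expression a, a, a is the type of the right operand, which is the id-expression a, i.e. volatile T; since a is an lvalue, so is the expression a, a, a ([expr.comma]/p1). Thus, this expression is an lvalue of a volatile-qualified type, and it is a "comma expression where the right operand is one of these expressions" - in particular, an id-expression - and therefore [expr]/p11 requires the lvalue-to-rvalue conversion to be applied to the expression a, a, a. Similarly, inside a, a, a, the left expression a, a is also a discarded-value expression, and inside that expression the left expression a is also a discarded-value expression; similar logic shows that [expr]/p11 requires the lvalue-to-rvalue conversion to be applied to both the result of the expression a, a and the result of the expression a (the leftmost one).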
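The type and value category claimed above can be checked at compile time. This is a small sketch assuming T is int; the variable is initialized here only to keep the snippet free of the indeterminate-value issue, and decltype does not evaluate its operand:

```cpp
#include <type_traits>

volatile int a = 0;  // initialized, unlike the variable in the question

// decltype of an lvalue expression of type volatile int is volatile int&,
// so this confirms that a, a, a is an lvalue of volatile-qualified type.
static_assert(std::is_same<decltype((a, a, a)), volatile int&>::value,
              "a, a, a is an lvalue of type volatile int");
```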
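If T is a class type (either threeBytes or fourBytes), applying the lvalue-to-rvalue conversion entails creating a temporary by copy-initialization from the volatile lvalue a ([conv.lval]/p2). However, the implicitly declared copy constructor always takes its argument by a non-volatile reference ([class.copy]/p8); such a reference cannot bind to a volatile object. Therefore, the program is ill-formed.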
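If T is int, then applying the lvalue-to-rvalue conversion yields the value contained in a. However, in your code, a is never initialized; this evaluation therefore produces an indeterminate value and, per [dcl.init]/p12, results in undefined behavior.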
Standard quotes follow. All are from C++14:
[expr]/p11:
In some contexts, an expression only appears for its side effects.
Such an expression is called a discarded-value expression. The
expression is evaluated and its value is discarded. The
array-to-pointer (4.2) and function-to-pointer (4.3) standard
conversions are not applied. The lvalue-to-rvalue conversion (4.1) is
applied if and only if the expression is a glvalue of
volatile-qualified type and it is one of the following:
- ( expression ), where expression is one of these expressions,
- id-expression (5.1.1),
- [several inapplicable bullets omitted], or
- comma expression (5.18) where the right operand is one of these expressions.
[ Note: Using an overloaded operator causes a function call; the
above covers only operators with built-in meaning. If the lvalue is of
class type, it must have a volatile copy constructor to initialize the
temporary that is the result of the lvalue-to-rvalue conversion. —end
note ]
[expr.comma]/p1:
A pair of expressions separated by a comma is evaluated left-to-right;
the left expression is a discarded-value expression (Clause 5) [...] The type
and value of the result are the type and value of the right operand;
the result is of the same value category as its right operand [...].
[stmt.expr]/p1:
Expression statements have the form
expression-statement:
expression_opt;
The expression is a discarded-value expression (Clause 5).
[conv.lval]/p1-2:
1 A glvalue (3.10) of a non-function, non-array type T can be
converted to a prvalue. If T is an incomplete type, a program that
necessitates this conversion is ill-formed. If T is a non-class
type, the type of the prvalue is the cv-unqualified version of T.
Otherwise, the type of the prvalue is T.
2 [some special rules not relevant here] In all other cases, the
result of the conversion is determined according to the following
rules:
- [inapplicable bullet omitted]
- Otherwise, if T has a class type, the conversion copy-initializes a temporary of type T from the glvalue and the result of the conversion is a prvalue for the temporary.
- [inapplicable bullet omitted]
- Otherwise, the value contained in the object indicated by the glvalue is the prvalue result.
[dcl.init]/p12:
If no initializer is specified for an object, the object is
default-initialized. When storage for an object with automatic or
dynamic storage duration is obtained, the object has an indeterminate
value, and if no initialization is performed for the object, that
object retains an indeterminate value until that value is replaced
(5.17). [...] If an indeterminate value is produced by an evaluation,
the behavior is undefined except in the following cases: [certain
inapplicable exceptions related to unsigned narrow character types]
[class.copy]/p8:
The implicitly-declared copy constructor for a class X will have the
form
X::X(const X&)
if each potentially constructed subobject of a class type M (or
array thereof) has a copy constructor whose first parameter is of type
const M& or const volatile M&. Otherwise, the implicitly-declared
copy constructor will have the form
X::X(X&)