How can I document whether variables passed by reference to a function need to be cached again when the function returns?
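The following passage from the (excellent) book C++ Templates (page 109) suggests to me that passing an argument to a function by reference may force the processor to cache the associated variables again: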
Under the hood, passing an argument by reference is implemented by passing the address of the argument. Addresses are encoded compactly, and therefore transferring an address from the caller to the callee is efficient in itself. However, passing an address can create uncertainties for the compiler when it compiles the caller’s code: What is the callee doing with that address? In theory, the callee can change all the values that are “reachable” through that address. That means, that the compiler has to assume that all the values it may have cached (usually, in machine registers) are invalid after the call. Reloading all those values can be quite expensive.
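The discussion surrounding the quote mentions that even const references may have to be cached again after the function call returns. After reading this passage, two questions came to mind:
- Is it possible to verify (possibly with a tool) when variables are cached again after a function call? I know that a program's average cache performance can be profiled, but I would like to know whether I can tie cache performance to specific lines of a function.
- Can someone construct an example demonstrating that calling a function with a reference parameter invalidates the processor's cache?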
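In general, you cannot write portable C++ code that depends on the CPU cache. Some microcontrollers have no cache at all. Some high-end processors have several levels of it.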
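Note that the quoted passage talks about values the compiler keeps "in machine registers", which is a code-generation matter and not a CPU-cache one. A minimal sketch of that effect follows. The names `counter`, `bump` and `sum_twice` are hypothetical, and the sketch assumes `bump` would really be defined in another translation unit so the compiler cannot see its body while compiling the caller:

```cpp
// Hypothetical names, for illustration only.
int counter = 0;  // global, so it is reachable from any function

// Assumption: in a real experiment this definition sits in a separate
// translation unit. If the body is visible, the optimizer may inline it
// and the effect described in the quote disappears.
void bump(int& x) { ++x; }

int sum_twice() {
    int before = counter;  // the compiler may hold `counter` in a register
    bump(counter);         // the callee may modify whatever the address reaches
    int after = counter;   // so `counter` has to be read from memory again
    return before + after;
}
```

Compiling the caller with something like `g++ -O2 -S` and reading the generated assembly is one way to see whether a second load is emitted after the call. This shows what the compiler does with registers; it says nothing about what the hardware cache does.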
Note the GCC builtin `__builtin_prefetch`.
I believe some other compilers have something similar.
But see also this answer. In many cases, you should not use `__builtin_prefetch`.
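For reference, a minimal sketch of what a call to the builtin looks like. The function `sum_indexed` and the prefetch distance `kAhead` are assumptions made up for illustration, not tuned values, and as said above such hints often do not help:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: sums data[idx[i]] for every i, hinting future loads.
// __builtin_prefetch is a GCC extension (Clang also accepts it), so this is
// not portable C++.
long sum_indexed(const std::vector<long>& data,
                 const std::vector<std::size_t>& idx) {
    constexpr std::size_t kAhead = 8;  // arbitrary guess, not a measured value
    long total = 0;
    for (std::size_t i = 0; i < idx.size(); ++i) {
        if (i + kAhead < idx.size()) {
            // Second argument 0 = prefetch for reading,
            // third argument 1 = low temporal locality.
            __builtin_prefetch(&data[idx[i + kAhead]], 0, 1);
        }
        total += data[idx[i]];
    }
    return total;
}
```

The prefetch is only a hint and does not change the result, so any benefit has to be measured on the target machine.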