x86 Non-Temporal Instructions: Is fencing ever needed for thread-local data?

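On x86/x64, non-temporal store instructions such as MOVNTI and MOVNTPS provide weaker memory ordering guarantees than "regular" stores. I know that fences (e.g. SFENCE) are necessary when memory written with non-temporal stores is shared across threads. However, are fence instructions ever needed for thread-local memory? If I write to a location via MOVNTPS, is the write guaranteed to be visible to subsequent instructions in the same thread without any fence instruction?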

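Yes, they will be visible without a fence. See section 8.2.2, "Memory Ordering in P6 and More Recent Processor Families", of the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3A: System Programming Guide, which says, among other things: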

for memory regions defined as write-back cacheable, [...] Reads may be reordered with older writes to different locations but not with older writes to the same location.

Writes to memory are not reordered with other writes, with the following exceptions: -- streaming stores (writes) executed with the non-temporal move instructions (MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPS, and MOVNTPD);