With Hyper-Threading, threads of one physical core exchange data via which level of cache: L1, L2, or L3?

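Does Hyper-Threading allow two threads to exchange data through the L1 cache when they execute simultaneously on a single physical core, but on two virtual (logical) cores?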

Assume that both threads belong to the same process, i.e. they run in the same address space.

Page 85 (2-55) of the Intel® 64 and IA-32 Architectures Optimization Reference Manual: http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf

2.5.9 Hyper-Threading Technology Support in Intel® Microarchitecture Code Name Nehalem

...

Deeper buffering and enhanced resource sharing/partition policies:

Replicated resource for HT operation: register state, renamed return stack buffer, large-page ITLB.

Partitioned resources for HT operation: load buffers, store buffers, re-order buffers, small-page ITLB are statically allocated between two logical processors.

Competitively-shared resource during HT operation: the reservation station, cache hierarchy, fill buffers, both DTLB0 and STLB.

Alternating during HT operation: front end operation generally alternates between two logical processors to ensure fairness.

HT unaware resources: execution units.

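Chapter 2.3.9 of the Intel Architecture Software Optimization manual briefly describes how processor resources are shared between the HT threads on a core. It documents the Nehalem architecture, which is getting stale but is very likely still relevant to current architectures, since the partitioning is logically consistent: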

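Duplicated for each HT thread: registers, the return stack buffer, the large-page ITLB
Statically allocated to each HT thread: load, store and re-order buffers, the small-page ITLB
Competitively shared between HT threads: the reservation station, caches, fill buffers, DTLB0 and STLB.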

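Your question matches the 3rd bullet. In the very specific case where both HT threads execute code from the same process, which is a bit of an accident, you can typically expect L1 and L2 to contain data retrieved by one HT thread that is useful to the other. Keep in mind that the unit of storage in the caches is a cache line, 64 bytes. Just in case: this is not a good reason to pursue a thread scheduling approach that favors getting two HT threads to execute on the same core, even if your OS supports that. An HT thread typically runs quite a bit slower than a thread that gets the core to itself. 30% is the usual number bandied about, YMMV.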
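To make the two concrete points above easier to check, here is a minimal Python sketch. It assumes a Linux system that exposes CPU topology under `/sys/devices/system/cpu` and takes the 64-byte line size from the answer rather than querying the hardware. The helper names (`parse_cpu_list`, `hyperthread_siblings`, `same_cache_line`) are made up for illustration and do not come from the Intel manual or any library.

```python
import os

CACHE_LINE_BYTES = 64  # line size quoted in the answer; an assumption, not queried from hardware


def parse_cpu_list(text):
    """Expand a Linux sysfs CPU list such as "0,4" or "0-3,8" into sorted CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def hyperthread_siblings(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Return the logical CPUs that share a physical core with `cpu`, or None if unknown."""
    path = os.path.join(sysfs_root, f"cpu{cpu}", "topology", "thread_siblings_list")
    try:
        with open(path) as f:
            return parse_cpu_list(f.read())
    except OSError:
        return None  # not Linux, or topology information is not exposed


def same_cache_line(addr_a, addr_b, line_bytes=CACHE_LINE_BYTES):
    """True if two byte addresses fall within the same cache line."""
    return addr_a // line_bytes == addr_b // line_bytes


if __name__ == "__main__":
    print("siblings of cpu0:", hyperthread_siblings(0))
    print(same_cache_line(0x1000, 0x103F), same_cache_line(0x1000, 0x1040))
```

On a machine with Hyper-Threading enabled, `hyperthread_siblings(0)` would typically list two logical CPUs. These are the ones that compete for the same L1 and L2. If you did want to experiment with pinning threads to them, `os.sched_setaffinity` is the Linux-only call to look at, though as noted above that is rarely a win.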

x86

multithreading

x86-64

hyperthreading

smt