The stage in which the data for the instruction is fetched from memory or cache
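I can't find any official (detailed) information about the instruction cycle or the instruction pipeline in modern CPUs (especially for AMD Zen+ and newer).
Consider the following instruction: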
ADD MEM, REG
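At which stage is the data for the [mem] operand fetched from memory? Before decode, or during the execute stage?
As noted in the comments, the instruction normally has to be decoded before its data can be fetched from memory.
However, in modern CPUs the DCU (data cache unit) tries to predict which data you are about to use and prefetches it into the cache before the instruction is even decoded. This usually works for array accesses and other well-known access patterns.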
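To make the prefetching idea concrete, here is a minimal sketch of a stride detector in Python. It is a toy model only: the class name StridePrefetcher, the "same stride seen twice" rule, and the helper prefetch_trace are assumptions made for illustration, not a description of how any real DCU prefetcher is implemented.

```python
class StridePrefetcher:
    """Toy stride detector (illustrative only, not a real DCU design)."""

    def __init__(self):
        self.last_addr = None    # address of the previous demand access
        self.last_stride = None  # distance between the previous two accesses

    def access(self, addr):
        """Record a demand access; return an address to prefetch, or None."""
        prediction = None
        if self.last_addr is not None:
            stride = addr - self.last_addr
            # Assumed rule: predict only after the same non-zero stride
            # has been observed twice in a row.
            if stride != 0 and stride == self.last_stride:
                prediction = addr + stride
            self.last_stride = stride
        self.last_addr = addr
        return prediction


def prefetch_trace(addresses):
    """Feed a list of addresses through a fresh prefetcher (hypothetical helper)."""
    pf = StridePrefetcher()
    return [pf.access(a) for a in addresses]
```

Walking an array with a fixed 64-byte step, for example prefetch_trace([0x1000, 0x1040, 0x1080, 0x10C0]), starts producing predictions from the third access on, while an irregular address sequence produces none. That is roughly why sequential array accesses benefit and pointer chasing does not.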
According to Modern Microprocessors: A 90-Minute Guide!:
(...) dynamically decode the x86 instructions into simple, RISC-like micro-instructions, which can then be executed by a fast, RISC-style register-renaming OOO superscalar core. (...) Most x86 instructions decode into 1, 2 or 3 μops, while the more complex instructions require a larger number.
OOO (out-of-order execution) means:
processor executes instructions in an order governed by the
availability of input data and execution units, rather than by their
original order in a program. In doing so, the processor can avoid
being idle while waiting for the preceding instruction to complete and
can, in the meantime, process the next instructions that are able to
run immediately and independently.
Source: Wikipedia
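- The original CISC instructions are decoded into RISC-like μops.
- Modern CPUs are superpipelined and superscalar, which gives instruction-level parallelism: a single core can execute several instructions in parallel (more than one instruction per clock cycle).
- Because of data dependencies, the execution order is not arbitrary. In the given example (add [mem], eax), the ALU part (μop) cannot fully execute before the value of [mem] has been fetched.
- "The memory operand is loaded when a separate μop executes, before the add μop can execute," as @peter-cordes put it.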
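The points above can be sketched as a small dependency-driven scheduler in Python. Everything here is an assumption made for illustration: the three-μop split (load, add, store), the latency numbers, and the names Uop, decode_add_mem_reg and schedule are hypothetical, and the model ignores execution-port limits and how any particular Intel or AMD core really splits the store. It only shows that the add μop cannot start until the load μop has delivered [mem], while an unrelated μop is free to run earlier.

```python
from collections import namedtuple

# A μop reads some named values and writes one (all names are illustrative).
Uop = namedtuple("Uop", "name reads writes latency")


def decode_add_mem_reg():
    """Hypothetical decode of `add [mem], eax` into three dependent μops."""
    return [
        Uop("load", reads=("mem",), writes="tmp", latency=4),       # assumed latency
        Uop("add", reads=("tmp", "eax"), writes="sum", latency=1),  # ALU part
        Uop("store", reads=("sum",), writes="mem'", latency=1),     # write-back to memory
    ]


def schedule(uops, ready_at):
    """Return {name: (start, finish)}; a μop starts once all its inputs are ready."""
    ready = dict(ready_at)  # value name -> cycle at which it becomes available
    timeline = {}
    pending = list(uops)
    while pending:
        for u in pending:
            if all(r in ready for r in u.reads):
                start = max(ready[r] for r in u.reads)
                finish = start + u.latency
                ready[u.writes] = finish
                timeline[u.name] = (start, finish)
                pending.remove(u)
                break  # restart the scan after mutating the list
        else:
            raise ValueError("unsatisfiable dependency")
    return timeline


# An unrelated μop placed after the add in program order; it has no dependency on [mem].
uops = decode_add_mem_reg() + [Uop("inc", reads=("ebx",), writes="ebx'", latency=1)]
timeline = schedule(uops, {"mem": 0, "eax": 0, "ebx": 0})
```

In this sketch the add is held back until the load finishes, whereas the independent inc starts at cycle 0 even though it comes later in program order, which is the out-of-order behaviour described in the quote.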