Predecoders and decoders: what is the difference?

I am reading Agner Fog's material and have some questions:

The pre-decoders and decoders can handle 16 bytes or 4 instructions per clock cycle

"The pre-decoder will find and mark the instruction boundaries, decode any prefixes and check for certain properties (e.g. branches)." (Source) (Another article)
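
As a rough illustration of what "find and mark the instruction boundaries" means, here is a minimal Python sketch. It is a toy model, not real x86 length decoding: the encoding in `TOY_LENGTHS` (the opcode byte alone fixes the instruction length) and the helper names `predecode` and `count_cycles` are made up for this example. The only numbers taken from the quotes above are the 16 bytes and 4 instructions per clock cycle.

```python
# Toy model only. Real x86 length decoding depends on prefixes, ModRM, SIB,
# displacement and immediate sizes; here a hypothetical opcode byte alone
# determines the total instruction length in bytes.
TOY_LENGTHS = {0x01: 1, 0x02: 2, 0x03: 3, 0x05: 5}


def predecode(code: bytes) -> list[int]:
    """Return the start offset of every instruction in `code`.

    Raises KeyError if a byte in opcode position is not in TOY_LENGTHS.
    """
    starts = []
    pos = 0
    while pos < len(code):
        starts.append(pos)              # mark an instruction boundary
        pos += TOY_LENGTHS[code[pos]]   # skip over the operand bytes
    return starts


def count_cycles(code: bytes, bytes_per_cycle: int = 16,
                 instrs_per_cycle: int = 4) -> int:
    """Estimate pre-decode cycles under the assumed per-cycle limits.

    Assumption: each cycle starts at the next unmarked instruction and stops
    at whichever limit (bytes or instruction count) is hit first.
    """
    starts = predecode(code)
    ends = starts[1:] + [len(code)]
    cycles = 0
    i = 0
    while i < len(starts):
        window_start = starts[i]
        taken = 0
        while i < len(starts) and taken < instrs_per_cycle:
            fits = ends[i] - window_start <= bytes_per_cycle
            if taken > 0 and not fits:
                break                   # instruction spills past the window
            i += 1
            taken += 1
        cycles += 1
    return cycles
```

With this model, eight 1-byte instructions are limited by the instruction count, while four 5-byte instructions are limited by the byte count; both take two cycles.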
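The L1 instruction cache is the main cache for macro-instructions. The loop buffer stores a short sequence of macro-instructions (e.g. 32 bytes), which is useful for tight loops: compared with reading from the L1 cache, it saves latency and power.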
"The register renaming (RAT) and retirement (RRF) stages in the pipeline are bottlenecks with a maximum throughput of 3 μops per clock cycle. In order to get more through these bottlenecks, the designers have joined some operations together that were split in two μops in previous processors. They call this μop fusion. The fused operations share a single μop in most of the pipeline and a single entry in the reorder buffer (ROB). But this single ROB entry represents two operations that have to be done by two different execution units. The fused ROB entry is dispatched to two different execution ports but is retired as a single unit." (Source)
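
To see why μop fusion helps with the 3 μops per clock limit described in the quote, here is a small Python sketch that counts rename cycles with and without fusion. The `Instr` class, the `rename_cycles` helper, and the example program are hypothetical; the per-instruction μop counts are illustrative values for a load-and-add, a store, and a register-to-register add, not measurements.

```python
import math
from dataclasses import dataclass


@dataclass
class Instr:
    text: str
    fused_uops: int    # slots used in rename, ROB and retirement
    unfused_uops: int  # operations actually sent to execution ports


def rename_cycles(uop_count: int, width: int = 3) -> int:
    """Cycles needed to pass `uop_count` uops through a `width`-wide stage."""
    return math.ceil(uop_count / width)


# Illustrative program: memory operands cost two operations but one fused uop.
program = [
    Instr("add eax, [mem]", fused_uops=1, unfused_uops=2),  # load + add
    Instr("mov [mem], eax", fused_uops=1, unfused_uops=2),  # store addr + data
    Instr("add eax, ebx", fused_uops=1, unfused_uops=1),
] * 2

fused_total = sum(i.fused_uops for i in program)
unfused_total = sum(i.unfused_uops for i in program)

print("with fusion:   ", rename_cycles(fused_total), "cycles")
print("without fusion:", rename_cycles(unfused_total), "cycles")
```

In this toy count the six instructions need 6 rename slots with fusion and would need 10 without it, so the 3-wide bottleneck is passed in 2 cycles instead of 4. The execution ports still have to perform all 10 operations.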

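Macro-op fusion is a way of recognizing a sequence of macro-instructions that becomes a single micro-op. The most common example, on newer Intel CPUs, is a CMP followed by a conditional jump (e.g. JNE), which fuses into one micro-op.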
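
A minimal Python sketch of the idea, assuming fusion applies to a CMP or TEST immediately followed by a conditional jump. The function name `macro_fuse` and the mnemonic sets are made up for illustration, and real CPUs apply further restrictions (operand forms, which conditions qualify, position within the fetch block) that this sketch ignores.

```python
# Hypothetical, simplified rule: cmp/test + conditional jump -> one micro-op.
FUSIBLE_FIRST = {"cmp", "test"}
COND_JUMPS = {"je", "jne", "jz", "jnz", "jl", "jle", "jg", "jge",
              "ja", "jae", "jb", "jbe"}


def macro_fuse(instrs: list[str]) -> list[str]:
    """Merge each fusible pair of adjacent instructions into one entry."""
    out = []
    i = 0
    while i < len(instrs):
        mnemonic = instrs[i].split()[0]
        has_next = i + 1 < len(instrs)
        if (mnemonic in FUSIBLE_FIRST and has_next
                and instrs[i + 1].split()[0] in COND_JUMPS):
            out.append(instrs[i] + " + " + instrs[i + 1])  # one fused entry
            i += 2
        else:
            out.append(instrs[i])
            i += 1
    return out


loop_body = ["mov eax, [esi]", "cmp eax, ebx", "jne next", "jmp done"]
for entry in macro_fuse(loop_body):
    print(entry)
```

Here the four macro-instructions become three entries: the `cmp` and `jne` are merged, while the unconditional `jmp` is left alone.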