Must all 16 bytes targeted by an x86 MASKMOVDQU instruction be valid memory?
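When using the x86 MASKMOVDQU instruction, must there always be 16 bytes of writable memory at the destination, even if some of the mask bits are zero?
For example, suppose I use MASKMOVDQU to write to address 0x12345FFC. The page at 0x12345000 is valid memory, but the page at 0x12346000 is not. If the mask register is 0x00000000'00000000'00000000'FFFFFFFF (so only the four bytes at 0x12345FFC through 0x12345FFF, all in the valid page, are selected), will this MASKMOVDQU always work, or can it raise an exception?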
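The address arithmetic behind this example can be checked with a small sketch. It assumes 4 KiB pages, and the helper names (last_byte, crosses_page, bytes_in_first_page) are hypothetical, introduced only for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u  /* assumption: 4 KiB pages */
#define STORE_WIDTH     16u    /* MASKMOVDQU addresses a 16-byte window */

/* Address of the last byte of the 16-byte window starting at addr. */
static uintptr_t last_byte(uintptr_t addr)
{
    return addr + (STORE_WIDTH - 1u);
}

/* True if the 16-byte window starting at addr spans two pages. */
static bool crosses_page(uintptr_t addr)
{
    return (addr / PAGE_SIZE_BYTES) != (last_byte(addr) / PAGE_SIZE_BYTES);
}

/* Number of bytes of the window that fall in the first page. */
static unsigned bytes_in_first_page(uintptr_t addr)
{
    uintptr_t room = PAGE_SIZE_BYTES - (addr % PAGE_SIZE_BYTES);
    return room < STORE_WIDTH ? (unsigned)room : STORE_WIDTH;
}
```

For 0x12345FFC the window ends at 0x1234600B, so it crosses into the page at 0x12346000, and only the first four bytes (exactly those selected by the mask) lie in the valid page.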
The Intel manual says the following about an all-zero mask, but it does not mention the edge case I am asking about:
Behavior with a mask of all 0s is as follows:
• No data will be written to memory.
• Signaling of breakpoints (code or data) is not guaranteed; different processor implementations may signal or not signal these breakpoints.
• Exceptions associated with addressing memory and page faults may still be signaled (implementation dependent).
• If the destination memory region is mapped as UC or WP, enforcement of associated semantics for these memory types is not guaranteed (that is, is reserved) and is implementation-specific.
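See the third bullet point. It specifically says that exceptions may still be signaled even when the entire mask is zero. That surely means an exception can also be generated for individual masked-out bytes, so the write in your example is not guaranteed to succeed.
Indeed, the AMD manual is clearer on this point: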
Exception and trap behavior for elements not selected for storage to memory are implementation dependent. For instance, a given implementation may signal a data breakpoint or a page fault for bytes that are zero-masked and not actually written.
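Since faults on masked-out bytes are implementation dependent, one conservative option is to avoid MASKMOVDQU when the 16-byte window may reach unmapped memory and store the selected bytes individually instead. The following is only a sketch of that idea: masked_store16_bytewise is a hypothetical helper, not part of any library, and it mirrors MASKMOVDQU's rule that a byte is selected when the most significant bit of the corresponding mask byte is set.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical fallback: write src[i] to dst[i] only where bit 7 of
 * mask[i] is set. As written, the loop stores to selected bytes only
 * and never reads or writes the unselected ones.
 */
static void masked_store16_bytewise(uint8_t *dst,
                                    const uint8_t src[16],
                                    const uint8_t mask[16])
{
    for (size_t i = 0; i < 16; ++i) {
        if (mask[i] & 0x80u) {
            dst[i] = src[i];
        }
    }
}
```

Unlike MASKMOVDQU, this fallback carries no non-temporal hint, so it would probably be used only on the rare path where the window crosses into a page that might not be mapped.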