Sqrt in Assembly x86

Question

I found some suggestions online:

C Inline assembly - Operand type mismatch for 'fst'
Why am I getting these assembler errors?
Why will I have operand type mismatch error when compiling the assembly codes with gcc?

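I have a similar problem, but none of the suggestions helped (or I couldn't figure out how to apply them correctly to my program).

The code is inserted into a C program as asm(...).

When I compile with -masm=intel and use: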

asm ("FLD EBX \n" "FSQRT \n" "FST EBX \n").

I get these compile errors:

"Error: operand type mismatch for 'fld'" and "... mismatch for 'fst'".

EBX holds some positive integer value before these instructions.

So what is the correct way to get ebx = sqrt(ebx)?

Answer 1

For sqrt in modern code, you should use SSE / SSE2, not x87. A single instruction converts an integer in a general-purpose register directly to a double in an xmm register.

cvtsi2sd  xmm0, ebx
sqrtsd    xmm0, xmm0     ; sd means scalar double, as opposed to SIMD packed double
cvttsd2si  ebx, xmm0     ; convert with truncation (C-style cast)

; cvtsd2si  ecx, xmm0    ; rounded to nearest integer (or whatever the current rounding mode is)

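This also works for 64-bit integers (rbx), but note that a double can only represent integers exactly up to about 2^53 (the mantissa size). If you want to check whether an integer is a perfect square, you can use the floating-point sqrt and then do a trial multiplication on the integer result: ((a*a) == b).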

See the x86 tag wiki for links to guides, tutorials, and manuals.

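Note that inserting this code into the middle of a C program is entirely the wrong approach. GNU C inline asm is the hardest way to write asm, because you have to understand everything in order to get the constraints right. Getting them wrong can break the surrounding code in subtle, hard-to-debug ways, rather than only making the inline asm itself wrong. See the x86 tag wiki for more details on this.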
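If explicit control over the instruction sequence really is needed from C, one alternative that avoids hand-written constraints is compiler intrinsics. This is not something the answer above proposes; it is a minimal sketch, assuming a compiler that provides <emmintrin.h> and defines __SSE2__ (as GCC and Clang do on x86-64). The function name isqrt_trunc is made up, and the #else branch is only a portability fallback.

```c
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */

/* Hypothetical helper: truncated square root of a non-negative int.
 * A negative input yields NaN from the sqrt step, and the final
 * conversion then returns the "integer indefinite" value, not an error. */
int isqrt_trunc(int x) {
    __m128d v = _mm_setzero_pd();   /* zero the register first (what pxor does) */
    v = _mm_cvtsi32_sd(v, x);       /* cvtsi2sd: int -> scalar double */
    v = _mm_sqrt_sd(v, v);          /* sqrtsd: square root of the low element */
    return _mm_cvttsd_si32(v);      /* cvttsd2si: convert with truncation */
}
#else
#include <math.h>

/* Fallback for targets without SSE2: let the compiler choose instructions. */
int isqrt_trunc(int x) {
    return (int)sqrt((double)x);
}
#endif
```

The compiler still handles register allocation here, so there are no constraints to get wrong.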

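If you want int a = sqrt((int)b), then write that in your code and let the compiler generate those three instructions for you. By all means read and understand the compiler's output, but don't blindly drop a sequence into the middle of it with asm("").

For example: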

#include <math.h>
int isqrt(int a) { return sqrt(a); }

compiles to (gcc 5.3 without -ffast-math):

    pxor    xmm0, xmm0      # D.2569
    cvtsi2sd        xmm0, edi       # D.2569, a
    sqrtsd  xmm1, xmm0  # tmp92, D.2569
    ucomisd xmm1, xmm1        # tmp92, tmp92
    jp      .L7 #,
    cvttsd2si       eax, xmm1     # D.2570, tmp92
    ret
.L7:
    sub     rsp, 8    #,
    call    sqrt    #
    add     rsp, 8    #,
    cvttsd2si       eax, xmm0     # D.2570, tmp92
    ret

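I guess sqrt() has to set errno for some kinds of errors. :/ The jp branch is taken only when the sqrtsd result is NaN (comparing a register with itself using ucomisd is unordered only for NaN), and the fallback then calls the library sqrt, which can set errno.

With -fno-math-errno: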

    pxor    xmm0, xmm0      # D.2569
    cvtsi2sd        xmm0, edi       # D.2569, a
    sqrtsd  xmm0, xmm0  # tmp92, D.2569
    cvttsd2si       eax, xmm0     # D.2570, tmp92
    ret

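The pxor is there to break the false dependency on the previous contents of xmm0, because cvtsi2sd made the odd design decision to leave the upper half of the destination vector register unmodified. That is only useful if you want to insert the conversion result into an existing vector, but cvtdq2pd already exists for packed conversions. (And they probably weren't thinking about 64-bit integers, since AMD64 was still on the drawing board when Intel released SSE2.)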
