Why does this asm strcmp() return the wrong value?

Question

I'm trying to learn some assembly (Intel syntax, x86_64). I wrote this code as a simple implementation of strcmp():

section .text
    global ft_strcmp

ft_strcmp:
    cmp byte [rsi], 0
    je  exit
    mov cl, byte [rsi]
    cmp byte [rdi], cl
    jne exit
    inc rsi
    inc rdi
    jmp ft_strcmp

exit:
    xor rax, rax
    mov al, [rdi]
    sub al, byte [rsi]
    ret

But calling ft_strcmp("Hello", "Hellooooo") returns 145, whereas the real strcmp() returns -1, and I can't figure out why. Is my syntax wrong, or is it the way I'm approaching this?

Answer 1

You also need to check whether [rdi] is nul. The second string may be shorter than the first.

Answer 2

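strcmp is supposed to return a 32-bit int in eax, positive or negative according to whether the first string is greater or less than the second. Because you do an 8-bit subtraction, only al is written and the high 24 bits of eax stay zero, so the result is always non-negative when read as a signed integer.

You want a 32-bit subtraction, so you need to zero-extend both bytes into 32-bit registers, clearing their high 24 bits. That is exactly what movzx does: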

exit:
    movzx eax, byte [rdi]
    movzx ecx, byte [rsi]
    sub eax, ecx
    ret

If you don't know about movzx, you can zero the whole register and then load the low byte:

exit:
    xor eax, eax
    mov al, [rdi] ; 'byte' is unnecessary, operand size inferred from register al
    xor ecx, ecx
    mov cl, [rsi]
    sub eax, ecx
    ret

(As a side note, a zeroing instruction like xor rax, rax can be replaced with the shorter but equivalent xor eax, eax; see "Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?")
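The 145 from the question can be reproduced in C. The sketch below is only a model of the two exit sequences, not the asker's code; the function names diff_8bit and diff_32bit are made up for illustration.

```c
#include <stdint.h>

/* Models `xor rax, rax / mov al, [rdi] / sub al, [rsi]`:
   the subtraction wraps at 8 bits and the upper bits stay zero. */
int diff_8bit(unsigned char a, unsigned char b)
{
    uint8_t al = (uint8_t)(a - b);   /* keep only the low 8 bits */
    return (int)al;                  /* zero-extended, never negative */
}

/* Models `movzx eax, byte [rdi] / movzx ecx, byte [rsi] / sub eax, ecx`:
   both bytes are zero-extended first, then subtracted as 32-bit ints. */
int diff_32bit(unsigned char a, unsigned char b)
{
    return (int)a - (int)b;
}
```

For "Hello" versus "Hellooooo" the first differing bytes are '\0' and 'o' (111), so diff_8bit('\0', 'o') gives 145 while diff_32bit('\0', 'o') gives -111. The library strcmp() returned -1 here, which is equally valid: C only specifies the sign of the result.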

Answer 3

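You are right that the nul check on the second string is not part of the compare operation. Good catch!

Consider the following:

Fewer memory references are better: your code has 3 data memory references in the loop and 2 at exit (3N+2). This code has 2 data memory references in the loop and 0 at exit (2N).

Fewer instruction bytes are better: your code has 37 bytes of loop code and 14 bytes of exit code. This code has 20 bytes of loop code and 3 bytes of exit code.

One trick is to use the return-value register as one of the working registers. Another is to load the operands into registers and then operate on the registers.

Assembled at https://defuse.ca/online-x86-assembler.htm#disassembly

The code is untested.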

ft_strcmp:
    mov cl, [rsi]       ;[2] load current byte of 2nd string (rsi)
    mov al, [rdi]       ;[2] load current byte of 1st string (rdi)
    test cl, cl         ;[2] end of 2nd string, is it nul?
    je exit             ;[2]
    cmp cl, al          ;[2] compare the two bytes
    jne exit            ;[2] differ?
    inc rsi             ;[3] point to next byte
    inc rdi             ;[3] ditto
    jmp ft_strcmp       ;[2] test them
                        ;[20] bytes of instructions in loop
exit:
    sub al, cl          ;[2] generate return value in al (8-bit only; upper bits of eax are not set)
                        ; if neither byte is nul: return difference of values
                        ; if both bytes are nul: return 0
                        ; if 2nd-string byte is nul and 1st-string byte is not: return positive
                        ; if 1st-string byte is nul and 2nd-string byte is not: return negative
    ret                 ;[1]
                        ;[3] bytes of instructions in exit
    
    
; the code from the question, annotated the same way
ft_strcmp:
    cmp byte [rsi], 0   ;[7]
    je  exit            ;[2]
    mov cl, byte [rsi]  ;[6]
    cmp byte [rdi], cl  ;[6]
    jne exit            ;[2]
    inc rsi             ;[6]
    inc rdi             ;[6]
    jmp ft_strcmp       ;[2]
                        ;[37] bytes of loop code
exit:
    xor rax, rax        ;[2]
    mov al, [rdi]       ;[5]
    sub al, byte [rsi]  ;[6]
    ret                 ;[1]
                        ;[14] bytes of exit code

Answer 4

A straightforward implementation, using Compiler Explorer:

strcmp:
        xor     ecx, ecx
.L3:
        movzx   eax, BYTE PTR [rdi+rcx]
        movzx   edx, BYTE PTR [rsi+rcx]
        cmp     al, dl
        jne     .L2
        inc     rcx
        test    al, al
        jne     .L3
.L2:
        sub     eax, edx
        ret

The bytes are correctly zero-extended, so the return value is the difference of the two unsigned char values, with the result promoted to int.
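For readers who find C easier to follow, here is a line-by-line C rendering of that listing. It is a sketch written to mirror the assembly, not necessarily the source that was fed to Compiler Explorer, and the name strcmp_sketch is made up to avoid clashing with the library function.

```c
#include <stddef.h>

/* Hypothetical name; each statement mirrors one step of the listing above. */
int strcmp_sketch(const char *s1, const char *s2)
{
    const unsigned char *p = (const unsigned char *)s1;
    const unsigned char *q = (const unsigned char *)s2;
    size_t i = 0;                 /* xor ecx, ecx */
    int a, b;

    for (;;) {
        a = p[i];                 /* movzx eax, BYTE PTR [rdi+rcx] */
        b = q[i];                 /* movzx edx, BYTE PTR [rsi+rcx] */
        if (a != b)               /* cmp al, dl / jne .L2 */
            break;
        i++;                      /* inc rcx */
        if (a == 0)               /* test al, al / jne .L3 falls through */
            break;
    }
    return a - b;                 /* sub eax, edx */
}
```

Because the bytes are read through unsigned char pointers, a byte such as 0xFF compares greater than any ASCII character, which matches how the C standard defines strcmp's ordering.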

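As others have pointed out, a C library implementation, where C strings can have arbitrary length, may use machine words or SSE instructions, provided the setup cost is amortized beyond some length threshold.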
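As a rough illustration of the machine-word idea, the sketch below shows one common building block: testing eight bytes at once for a nul terminator. It is not taken from any particular C library, the name has_zero_byte is made up, and a real word-at-a-time strcmp would also have to handle alignment and avoid reading past the end of a mapped page.

```c
#include <stdint.h>

/* Returns 1 if any of the 8 bytes in v is zero, else 0.
   Subtracting 0x01 from each byte borrows into bit 7 only when the byte
   was 0x00 (or was already >= 0x80, which the `& ~v` term filters out). */
int has_zero_byte(uint64_t v)
{
    const uint64_t ones  = 0x0101010101010101ULL;
    const uint64_t highs = 0x8080808080808080ULL;
    return ((v - ones) & ~v & highs) != 0;
}
```

A word-wise loop could compare two 8-byte words for equality and call a check like this on one of them, dropping back to byte-by-byte comparison once the words differ or a terminator is seen.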

assembly

x86-64

strcmp