How do I push the equivalent of NULL in C to the stack in assembly?

Question

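I'm writing a bubble sort for strings in assembly, and I'm using strtok() to tokenize the string. However, after the first call, strtok(str, " "), I need to pass NULL as the first argument, i.e. strtok(NULL, " ").

I've tried NULL equ 0 in the .bss section, but it doesn't seem to do anything.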

[SECTION .data]

[SECTION .bss]

string resb 64
NULL equ 0

[SECTION .text]

extern fscanf
extern stdin
extern strtok

global main

main:

    push ebp        ; Set up stack frame for debugger
    mov ebp,esp
    push ebx        ; Program must preserve ebp, ebx, esi, & edi
    push esi
    push edi

    push string
    push frmt
    push dword [stdin]      ;Read string from stdin
    call fscanf
    add esp,12              ;clean stack

    push delim
    push string             ;this works
    call strtok
    add esp,8               ;clean stack

    ;after this step, the return value in eax points to the first word 

    push string             ;this does not
    push NULL
    call strtok
    add esp,8               ;clean stack

    ;after this step, eax points to 0x0

    pop edi         ; Restore saved registers
    pop esi
    pop ebx
    mov esp,ebp     ; Destroy stack frame before returning
    pop ebp
    ret         ;return control to linux

I've read that in "most implementations" NULL points to 0, whatever that means. Why is there ambiguity? What is the equivalent to NULL in x86 instruction set?

Answer 1

 push NULL 
 push string 
 call strtok

This is calling strtok(string, NULL). You want strtok(NULL, " "), so assuming delim contains " ":

 push delim
 push NULL
 call strtok

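In the cdecl calling convention, arguments are pushed onto the stack in reverse (right-to-left) order, so the last value pushed is the first argument.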
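For comparison, here is roughly what the intended call sequence looks like from the C side. This is only a sketch: split_words is a made-up helper name, not something from the question, and it assumes the buffer is writable, because strtok modifies it in place.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the question): splits buf in place on spaces,
   stores up to max_tokens token pointers in out, and returns how many it stored. */
size_t split_words(char *buf, char **out, size_t max_tokens)
{
    assert(buf != NULL && out != NULL);

    size_t n = 0;
    /* First call: the buffer is the first argument. */
    char *tok = strtok(buf, " ");
    while (tok != NULL && n < max_tokens) {
        out[n++] = tok;
        /* Later calls: NULL is the FIRST argument and the delimiter string is
           the second, which in cdecl means pushing delim first, then 0. */
        tok = strtok(NULL, " ");
    }
    return n;
}
```

Each strtok(NULL, " ") in the loop corresponds to the push delim / push NULL / call strtok sequence above.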

For the other part of your question (whether NULL is always zero), see: Is NULL always zero in C?

Answer 2

I've read that in "most implementations" NULL points to 0, whatever that means.

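No, it is 0; it isn't a pointer to anything. So yes, NULL equ 0 is correct, or just push 0.

In C source code, (void*)0 is always NULL, but an implementation is allowed to use a different, non-zero bit-pattern internally as the object representation of int *p = NULL;. Implementations that choose a non-zero bit-pattern have to translate at compile time. (The translation applies only at compile time, to compile-time integer constant expressions with value zero that appear in a pointer context. It does not apply to memset or anything else.) The C++ FAQ has a whole section on NULL pointers. (In this case it applies to C as well.)

(In C it's legal to access the bit-pattern of an object with memcpy, or by aliasing a (char*) onto it, so it is possible to detect this in a well-formed program that is free of undefined behaviour. Or of course by using a debugger to look at the asm or memory contents! In practice you can easily check what the right asm is for NULL by compiling int *foo(){ return NULL; }.)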
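A minimal sketch of that memcpy check, assuming a hosted C99 compiler. The function names are invented for illustration. On every mainstream x86 ABI the check is expected to succeed, but the C standard itself does not promise that.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the object representation of a null int* is all-zero bytes,
   0 otherwise. Copying the bytes out with memcpy is well-defined C. */
int null_is_all_zero_bits(void)
{
    int *p = NULL;
    unsigned char bytes[sizeof p];

    memcpy(bytes, &p, sizeof p);
    for (size_t i = 0; i < sizeof p; i++) {
        if (bytes[i] != 0)
            return 0;
    }
    return 1;
}

/* Optional startup guard for code that pushes a literal 0 from asm and
   expects C to see it as NULL. */
void require_zero_null(void)
{
    assert(null_is_all_zero_bits());
}
```

Calling require_zero_null() once at startup would document the assumption that hand-written asm is relying on.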

See also Why is address zero used for the null pointer? for more background.

Why is there ambiguity? What is the equivalent to NULL in x86 instruction set?

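In all x86 calling conventions / ABIs, the asm bit-pattern for a null pointer is integer 0.

So push 0 or xor edi,edi (RDI=0) is always what you want on x86 / x86-64. (Modern calling conventions, including all the x86-64 conventions, pass args in registers.) Windows x64 passes the first arg in RCX, not RDI.

@J...'s answer shows how to push args in right-to-left order for the calling convention you're using, which puts the first (left-most) arg at the lowest address.

Really, you can store them to the stack any way you like (e.g. with mov), as long as they are in the right places when call runs.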

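The C standard allows it to be different because C implementations on some hardware might want to use something else, e.g. a special bit-pattern that always faults when dereferenced, regardless of context. Or, if 0 were a valid address value in real programs, it would be better for p==NULL to always be false for valid pointers. Or any other arcane hardware-specific reason.

So yes, there could have been C implementations for x86 where (void*)0 in the C source becomes a non-zero integer in asm. But in practice there aren't. (And most programmers are happy that memset(array_of_pointers, 0, size) actually sets them to NULL, which relies on the bit-pattern being 0. Some code makes that assumption without considering that it isn't guaranteed to be portable.)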
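To make the memset point concrete, here is a small sketch. zero_slots is a hypothetical name, and the assert inside it spells out the non-portable assumption (all-zero bytes compare equal to NULL) that holds on the usual x86 ABIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: clears an array of n pointers by zeroing its bytes.
   memset writes all-zero BYTES; that only equals NULL on implementations
   where the null pointer's bit-pattern is 0, as on x86. */
void zero_slots(char **slots, size_t n)
{
    memset(slots, 0, n * sizeof *slots);

    for (size_t i = 0; i < n; i++) {
        /* Documents the assumption; a strictly portable version would
           assign slots[i] = NULL in a loop instead of using memset. */
        assert(slots[i] == NULL);
    }
}
```

Assigning NULL in a loop lets the compiler do the translation, so that form stays correct even on an implementation with a non-zero null pointer.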

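None of the standard C ABIs on x86 do this. (An ABI is a set of implementation choices that compilers all follow so their code can call each other, e.g. agreeing on struct layout, calling conventions, and what p == NULL means.)

I'm not aware of any modern C implementation that uses a non-zero NULL on other 32-bit or 64-bit CPUs either; virtual memory makes it easy to avoid address 0.

http://c-faq.com/null/machexamp.html has some historical examples: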

The Prime 50 series used segment 07777, offset 0 for the null pointer, at least for PL/I. Later models used segment 0, offset 0 for null pointers in C, necessitating new instructions such as TCNP (Test C Null Pointer), evidently as a sop to [footnote] all the extant poorly-written C code which made incorrect assumptions. Older, word-addressed Prime machines were also notorious for requiring larger byte pointers (char *) than word pointers (int *).

... see the link for more machines, and the footnote from this paragraph.

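https://www.quora.com/On-which-actual-architectures-is-Cs-null-pointer-not-a-binary-zero-all-bits-zero reports finding a non-zero NULL on 286 Xenix, I guess using segmented pointers.

Modern x86 OSes make sure that processes can't map anything into the lowest page of virtual address space, so NULL pointer dereferences always fault noisily, which makes debugging easier.

E.g. Linux by default reserves the low 64kiB of address space (vm.mmap_min_addr). This helps whether the zero came from a NULL pointer in the source or from some other bug that zeroed a pointer with integer zeros. Reserving 64k instead of just the low 4k page also catches indexing a pointer as an array, like p[i] with small to medium values of i.

Fun fact: Windows 95 mapped the lowest pages of user-space virtual address space to the first 64kiB of physical memory, to work around a 386 B1 stepping erratum. Fortunately, it was able to set things up so that accesses from a normal 32-bit process did fault. However, 16-bit code running in DOS compatibility mode could very easily trash the whole machine.

See https://devblogs.microsoft.com/oldnewthing/20141003-00/?p=43923 and https://news.ycombinator.com/item?id=13263976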

Answer 3

You're actually asking two questions:

Question 1

I've read that ... NULL points to 0, whatever that means.

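This means that nearly all C compilers define NULL as (void *)0.

This means that the NULL pointer is a pointer to the memory location at address zero.

I've read that in "most implementations" ...

"Most" means that before ISO C and ANSI C were introduced in the late 1980s, there were C compilers that defined NULL in a different way.

Maybe a few non-standard C compilers still exist that do not recognize address 0 as NULL.

However, you can assume that your C compiler, and the C library you use in your assembly project, define NULL as a pointer to address 0.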

Question 2

How do I push the equivalent of NULL in C to the stack in assembly?

Pointers are addresses.

Unlike some other CPUs, x86 CPUs do not distinguish between integers and addresses:

You push a NULL pointer by pushing the integer value 0.

NULL equ 0

push NULL

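Unfortunately, you didn't say which assembler you use. (Other users assume it is NASM.)

In this case, the instruction push NULL may be interpreted in two different ways by different assemblers:

Some assemblers will interpret it as: "Push the value 0".

This is correct.

Other assemblers will interpret it as: "Read the memory at memory location 0 and push that value".

This would be equivalent to someFunction(*(int *)NULL) in C, and would therefore cause an exception (NULL pointer access).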

c

x86

assembly

nasm

null-pointer