Compiler changes printf to puts

Question

Consider the following code:

#include <stdio.h>

void foo() {
    printf("Hello world\n");
}

void bar() {
    printf("Hello world");
}

The assembly generated for these two functions is:

.LC0:
        .string "Hello world"
foo():
        mov     edi, OFFSET FLAT:.LC0
        jmp     puts
bar():
        mov     edi, OFFSET FLAT:.LC0
        xor     eax, eax
        jmp     printf

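Now, I know the difference between puts and printf, but I find it quite interesting that gcc is able to introspect the const char* and work out whether to call printf or puts.

Another interesting thing is that in bar, the compiler zeroed out the return register (eax) even though it is a void function. Why did it do that there and not in foo?

Am I correct in assuming that the compiler 'introspected my string', or is there another explanation for this?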

Answer 1

Am I correct in assuming that compiler 'introspected my string', or there is another explanation of this?

Yes, that is exactly what happens. It is a very simple and common optimization performed by compilers.

Since your first printf() call:

printf("Hello world\n");

is equivalent to:

puts("Hello world");

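Because puts() does not need to scan and parse the string for format specifiers, it is cheaper than printf(). The compiler notices that your string ends with a newline and contains no format specifiers, so it converts the call automatically.

This also saves a little space, since only one string, "Hello world", now needs to be stored in the generated binary; both functions share it.

Note that this is generally not possible for calls of the form: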

printf(some_var);

If some_var is not a simple constant string, the compiler cannot know whether it ends with \n.
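As a rough illustration of the conditions described above, the sketch below lists a few calls and notes, in comments, whether a compiler could plausibly swap in puts. The function names are hypothetical, and whether a given compiler and optimization level actually performs the rewrite is an assumption to check against the generated assembly.

```c
#include <stdio.h>

/* Constant format, ends in '\n', contains no '%', result unused:
   a compiler may emit puts("Hello world") here. */
void newline_terminated(void) {
    printf("Hello world\n");
}

/* No trailing newline: puts() would append one and change the
   output, so this call cannot become puts. */
void no_newline(void) {
    printf("Hello world");
}

/* The text is only known at run time, so the compiler cannot check
   how it ends. The answer's printf(some_var) form is in the same
   position; "%s" is used here so data is never treated as a format. */
void runtime_string(const char *some_var) {
    printf("%s", some_var);
}

/* The return value is used. printf returns the number of characters
   written, while puts only promises a non-negative value on success,
   so the two calls are not interchangeable here. */
int printed_length(void) {
    return printf("Hello world\n");
}
```

Compiling a file like this with optimization enabled and inspecting the assembly (for example with gcc -O2 -S) is one way to see which calls were actually rewritten.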

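Other common optimizations are:

strlen("constant string") may be evaluated at compile time and replaced with a number.
memmove(location1, location2, sz) may be converted into memcpy() if the compiler can tell that location1 and location2 do not overlap.
memcpy() of a small size may be turned into a single mov instruction, and even for larger sizes the call can sometimes be inlined to make it faster.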
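The calls below are the kind of code these optimizations apply to. This is only a sketch with hypothetical function names; the comments describe what an optimizing compiler is permitted to do, not what any particular compiler is guaranteed to emit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The argument is a string literal, so the length (15) can be
   computed at compile time and the call may disappear entirely. */
size_t constant_length(void) {
    return strlen("constant string");
}

/* A fixed 4-byte copy: compilers commonly lower this to a single
   load instruction instead of calling memcpy. */
uint32_t load_u32(const unsigned char *bytes) {
    uint32_t value;
    memcpy(&value, bytes, sizeof value);
    return value;
}

/* The source is a const object that a valid destination cannot
   overlap, so the compiler is free to treat this memmove like a
   memcpy, or to inline the copy. */
void copy_greeting(char dst[16]) {
    static const char src[16] = "Hello world";
    memmove(dst, src, sizeof src);
}
```

In each case the observable behavior stays the same; only the generated instructions differ.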

Another interesting thing is that in bar, compiler zero'ed out the return register (eax) even though it is a void function. Why did it do that there and not in foo?

See here: Why is %eax zeroed before a call to printf?

Related interesting post:

Can printf get replaced by puts automatically in a C program?

Answer 2

Another interesting thing is that in bar, compiler zero'ed out the return register (eax) even though it is a void function. Why did it do that there and not in foo?

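This is completely unrelated to the question in the title, but it is interesting nonetheless.

The xor that zeroes %eax comes before the call to printf, so it is part of the call and has nothing to do with the return value. It is there because printf is a varargs function, and the x86_64 ABI for varargs functions requires floating-point arguments to be passed in xmm registers and the number of vector registers used (an upper bound is enough) to be passed in %al. So this instruction makes sure that %al is 0, since no arguments are being passed to printf in xmm registers.

puts is not a varargs function, so the instruction is not needed there.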
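To make the %al rule more concrete, here is a small sketch of a variadic function and two callers. The names sum_doubles and print_value are hypothetical, and the register details in the comments assume the x86-64 System V calling convention; other platforms pass variadic arguments differently.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical variadic helper: adds up `count` double arguments. */
double sum_doubles(int count, ...) {
    va_list args;
    double total = 0.0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, double);
    }
    va_end(args);
    return total;
}

/* One floating-point argument goes to a variadic function, so on
   x86-64 System V the caller is expected to set %al to 1 (or a
   larger upper bound) before the call instead of zeroing it. */
void print_value(double value) {
    printf("%f\n", value);
}

/* A call such as sum_doubles(2, 1.5, 2.5) passes both doubles in
   xmm registers, so %al would hold at least 2 at the call site.
   A call with no floating-point arguments, like the printf in bar,
   only needs %al = 0, which is what the xor achieves. */
```

The callee can use %al to decide whether the xmm registers need to be saved for va_arg, which is why the caller must set it even when the function's result is ignored.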
