Variadic functions without `...`

Question

On x86_64/Linux, compiled with gcc/clang -O3, the following code:

void void_unspec0(),void_unspec1(),void_unspec2(),void_unspec3(),void_void(void);

void call_void_void()
{
    void_void();
    void_void();
    void_void();
    void_void();
    void_void();
}

void call_void_unspec()
{
    void_unspec0();
    void_unspec0();
    void_unspec0();
    void_unspec0();
    void_unspec1(.0,.0,.0);
    void_unspec2(.0,.0,.0,.0,.0,.0,.0,.0);
    void_unspec3(.0,.0,.0,.0,.0,.0,.0,.0,.0,.0);
}

disassembles to:

0000000000000000 <call_void_void>:
   0:   48 83 ec 08             sub    $0x8,%rsp
   4:   e8 00 00 00 00          callq  9 <call_void_void+0x9>
   9:   e8 00 00 00 00          callq  e <call_void_void+0xe>
   e:   e8 00 00 00 00          callq  13 <call_void_void+0x13>
  13:   e8 00 00 00 00          callq  18 <call_void_void+0x18>
  18:   48 83 c4 08             add    $0x8,%rsp
  1c:   e9 00 00 00 00          jmpq   21 <call_void_void+0x21>
  21:   66 66 2e 0f 1f 84 00    data16 nopw %cs:0x0(%rax,%rax,1)
  28:   00 00 00 00 
  2c:   0f 1f 40 00             nopl   0x0(%rax)

0000000000000030 <call_void_unspec>:
  30:   48 83 ec 08             sub    $0x8,%rsp
  34:   31 c0                   xor    %eax,%eax
  36:   e8 00 00 00 00          callq  3b <call_void_unspec+0xb>
  3b:   31 c0                   xor    %eax,%eax
  3d:   e8 00 00 00 00          callq  42 <call_void_unspec+0x12>
  42:   31 c0                   xor    %eax,%eax
  44:   e8 00 00 00 00          callq  49 <call_void_unspec+0x19>
  49:   31 c0                   xor    %eax,%eax
  4b:   e8 00 00 00 00          callq  50 <call_void_unspec+0x20>
  50:   66 0f ef d2             pxor   %xmm2,%xmm2
  54:   b8 03 00 00 00          mov    $0x3,%eax
  59:   66 0f ef c9             pxor   %xmm1,%xmm1
  5d:   66 0f ef c0             pxor   %xmm0,%xmm0
  61:   e8 00 00 00 00          callq  66 <call_void_unspec+0x36>
  66:   66 0f ef ff             pxor   %xmm7,%xmm7
  6a:   b8 08 00 00 00          mov    $0x8,%eax
  6f:   66 0f ef f6             pxor   %xmm6,%xmm6
  73:   66 0f ef ed             pxor   %xmm5,%xmm5
  77:   66 0f ef e4             pxor   %xmm4,%xmm4
  7b:   66 0f ef db             pxor   %xmm3,%xmm3
  7f:   66 0f ef d2             pxor   %xmm2,%xmm2
  83:   66 0f ef c9             pxor   %xmm1,%xmm1
  87:   66 0f ef c0             pxor   %xmm0,%xmm0
  8b:   e8 00 00 00 00          callq  90 <call_void_unspec+0x60>
  90:   66 0f ef c0             pxor   %xmm0,%xmm0
  94:   6a 00                   pushq  $0x0
  96:   66 0f ef ff             pxor   %xmm7,%xmm7
  9a:   6a 00                   pushq  $0x0
  9c:   66 0f ef f6             pxor   %xmm6,%xmm6
  a0:   b8 08 00 00 00          mov    $0x8,%eax
  a5:   66 0f ef ed             pxor   %xmm5,%xmm5
  a9:   66 0f ef e4             pxor   %xmm4,%xmm4
  ad:   66 0f ef db             pxor   %xmm3,%xmm3
  b1:   66 0f ef d2             pxor   %xmm2,%xmm2
  b5:   66 0f ef c9             pxor   %xmm1,%xmm1
  b9:   e8 00 00 00 00          callq  be <call_void_unspec+0x8e>
  be:   48 83 c4 18             add    $0x18,%rsp
  c2:   c3                      retq

In the second case (call_void_unspec()), the compiler counts the floating-point arguments passed in registers, presumably because the SysV ABI/AMD64 spec says it should:

When a function taking variable-arguments is called, %rax must be set to the total number of floating point parameters passed to the function in SSE registers

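What is the reason for the rule in the ABI spec? Must unprototyped function calls abide by it, given that functions defined with ... (ellipsis) are required to be prototyped before being called (6.5.2.2p6)? Can functions without ... be variadic too?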

Answer 1

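What the C standard says

Note that a variadic function can only be called when a prototype is in scope. If you try to call printf() without a prototype, you get UB (undefined behaviour).

C11 §6.5.2.2 Function calls ¶6 says: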

¶6 If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument promotions. If the number of arguments does not equal the number of parameters, the behavior is undefined. If the function is defined with a type that includes a prototype, and either the prototype ends with an ellipsis (, ...) or the types of the arguments after promotion are not compatible with the types of the parameters, the behavior is undefined. If the function is defined with a type that does not include a prototype, and the types of the arguments after promotion are not compatible with those of the parameters after promotion, the behavior is undefined, except for the following cases:

one promoted type is a signed integer type, the other promoted type is the corresponding unsigned integer type, and the value is representable in both types;

both types are pointers to qualified or unqualified versions of a character type or void.
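The default argument promotions described in the quoted paragraph also apply to the trailing arguments of a prototyped variadic function, which gives a way to observe them without relying on an unprototyped declaration. The following is only a minimal sketch; add_promoted and demo_promotions are hypothetical names that do not appear in the question.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: expects one char and one float after `base`.
   Because of the default argument promotions, the caller's char arrives
   as int and the float arrives as double, so va_arg must name the
   promoted types, not the original ones. */
double add_promoted(int base, ...)
{
    va_list ap;
    va_start(ap, base);
    int c = va_arg(ap, int);        /* the caller passed a char  */
    double d = va_arg(ap, double);  /* the caller passed a float */
    va_end(ap);
    return base + c + d;
}

/* Hypothetical usage: both trailing arguments are promoted at the call site. */
void demo_promotions(void)
{
    char c = 2;
    float f = 0.5f;
    assert(add_promoted(1, c, f) == 3.5);
}
```

Reading the arguments back as char or float with va_arg would itself be undefined behaviour, since those are not the types the arguments have after promotion.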

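Applied to the original code in the question

The original code in the question was similar to this; consecutive identical function calls have been reduced to a single call.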

void void_unspec(), void_void(void);

void call_void_void()
{
    void_void();
}

void call_void_unspec()
{
    void_unspec();
    void_unspec(.0,.0,.0);
    void_unspec(.0,.0,.0,.0,.0,.0,.0,.0);
    void_unspec(.0,.0,.0,.0,.0,.0,.0,.0,.0,.0);
}

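This code invokes UB because the numbers of arguments in the calls to void_unspec() do not all match the number of parameters it is defined to take (whatever that definition is; it can't simultaneously take 0, 3, 8 and 10 arguments). This is not a constraint violation, so no diagnostic is required. The compiler will typically do whatever it thinks best for backwards compatibility, and that usually won't cause outright crashes, but any problems that do arise are of the programmer's own making for violating the rules of the standard.

And because the standard says the behaviour is undefined, there is no particular reason why the compiler must set %rax (the C standard knows nothing of %rax, of course), but simple consistency suggests that it should.

Applied to the revised code in the question

The code in the question was revised to this (consecutive repeated calls again omitted):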

void void_unspec0(), void_unspec1(), void_unspec2(), void_unspec3(), void_void(void);

void call_void_void()
{
    void_void();
}

void call_void_unspec()
{
    void_unspec0();
    void_unspec1(.0,.0,.0);
    void_unspec2(.0,.0,.0,.0,.0,.0,.0,.0);
    void_unspec3(.0,.0,.0,.0,.0,.0,.0,.0,.0,.0);
}

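The code no longer inevitably invokes undefined behaviour. However, where the functions void_unspec0() and so on are defined, the definitions should look something like: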

void void_unspec0(void) { … }
void void_unspec1(double a, double b, double c) { … }
void void_unspec2(double a, double b, double c, double d, double e, double f, double g, double h) { … }
void void_unspec3(double a, double b, double c, double d, double e, double f, double g, double h, double i, double j) { … }

An equivalent notation would be:

void void_unspec2(a, b, c, d, e, f, g, h)
    double a, b, c, d, e, f, g, h;
{
    …
}

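This uses a K&R pre-standard non-prototype definition.

If the function definitions do not match these, then §6.5.2.2¶6 says the result of the call is undefined behaviour. This saves the standard from having to legislate what happens in all sorts of dubious circumstances. As before, the compiler is at liberty to pass the number of floating-point values in %rax; that makes sense. But there is little to be gained by arguing about what will happen: either the calls match the definitions and all is well, or they don't match and there are unspecified (and unspecifiable) potential problems.

Note in passing that neither call_void_void() nor call_void_unspec() is defined with a prototype. They are both functions taking zero arguments, but there is no visible prototype that enforces that, so code in the same file could call call_void_void(1, "abc") without the compiler complaining. (In this respect, as in many others, C++ is a different language with different rules.)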

Answer 2

Can functions without ... be variadic too?

Paragraph 6.5.2.2/6 of the standard is perhaps the most relevant:

If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument promotions. If the number of arguments does not equal the number of parameters, the behavior is undefined.

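(Emphasis added.) This is the case when the declared type of the function does not include a parameter list (as distinct from a parameter list consisting only of void). The caller is still responsible for passing the correct number of arguments.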

If the function is defined with a type that includes a prototype, and either the prototype ends with an ellipsis (, ...) or the types of the arguments after promotion are not compatible with the types of the parameters, the behavior is undefined.

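This distinguishes between properties of the function definition and the type of the subexpression of the function call that designates the function. Note that it explicitly says that calling a variadic function via a function expression whose type does not include a prototype has undefined behavior. It also requires a type match between the promoted arguments and the parameters.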
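What matters is the type of the expression used to make the call, so a variadic function can also be called correctly through a function pointer, provided the pointer type itself carries the prototype with the ellipsis. A minimal sketch, assuming the standard snprintf; the names format_fn, format_pair and demo_format are hypothetical:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Pointer type with a full prototype, ellipsis included. A call through
   this type is a properly prototyped variadic call. */
typedef int (*format_fn)(char *, size_t, const char *, ...);

/* Hypothetical helper: formats an int and a double via the pointer. */
int format_pair(char *buf, size_t size, int i, double d)
{
    format_fn fn = snprintf;
    return fn(buf, size, "%d %.1f", i, d);
}

/* Hypothetical check: returns 1 when the formatted text is as expected. */
int demo_format(void)
{
    char buf[32];
    int written = format_pair(buf, sizeof buf, 7, 2.5);
    assert(written > 0);
    return strcmp(buf, "7 2.5") == 0;
}
```

Had the pointer been declared with an empty parameter list instead, the same call would fall under the undefined-behavior clause quoted above.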

If the function is defined with a type that does not include a prototype, and the types of the arguments after promotion are not compatible with those of the parameters after promotion, the behavior is undefined, except for the following cases:

one promoted type is a signed integer type, the other promoted type is the corresponding unsigned integer type, and the value is representable in both types;

both types are pointers to qualified or unqualified versions of a character type or void.

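This is the case of K&R-style function definitions. It too requires the number and types of the arguments to match those of the parameters, so such functions are not variadic.

Thus,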

What is the reason for the rule in the ABI spec? Must unprototyped function calls abide by it given that functions defined with ... (ellipsis) are required to be prototyped?

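I suppose the reason for the rule is to tell the function implementation which FP registers it needs to save or preserve. Since calling a variadic function via a function expression whose type does not include a prototype has UB, C implementations have no particular need to follow that ABI provision.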
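To make the callee's side concrete, here is a minimal sketch of a variadic function that consumes double arguments; sum_doubles is a hypothetical name, not something from the question. On x86-64 SysV, a compiler will typically have the prologue of such a function check %al before spilling the XMM argument registers to the register save area that va_arg reads from, which is where the count set by the caller comes into play. That is an observation about typical code generation, not something the C standard specifies.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical variadic callee: adds up `count` trailing double arguments.
   With the SysV AMD64 convention, the first eight doubles are passed in
   %xmm0..%xmm7 and any further ones on the stack; va_arg hides that split. */
double sum_doubles(int count, ...)
{
    va_list ap;
    double total = 0.0;

    assert(count >= 0);
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, double);
    va_end(ap);
    return total;
}
```

A call such as sum_doubles(3, 1.0, 2.0, 3.5) is made with the prototype in scope, so the compiler knows the call is variadic and can set %al accordingly.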
