当两个 std::complex 相乘时，为什么调用 __muldc3？

Question

我天真地假设，复数乘法将由编译器内联，例如对于这个函数：

#include <complex>

void mult(std::complex<double> &a, std::complex<double> &b){
    a*=b;
}

然而，当 gcc 编译时（使用 -O2），resulting assembler 令人惊讶（至少对我而言）：

mult(std::complex<double>&, std::complex<double>&):
        pushq   %rbx
        movsd   8(%rdi), %xmm3
        movsd   (%rdi), %xmm2
        movq    %rdi, %rbx
        movsd   8(%rsi), %xmm1
        movsd   (%rsi), %xmm0
        call    __muldc3
        movsd   %xmm0, (%rbx)
        movsd   %xmm1, 8(%rbx)
        popq    %rbx
        ret

调用此函数 __multdc3，它以某种方式替换了对 operator*= 的调用（它的错位名称将是 _ZNSt7complexIdEmLIdEERS0_RKS_IT_E，复数将按引用传递).

然而，operator*= 的 implementation 似乎没有什么特别之处可以解释魔法：

// 26.2.5/13
  // XXX: This is a grammar school implementation.
  template<typename _Tp>
    template<typename _Up>
    complex<_Tp>&
    complex<_Tp>::operator*=(const complex<_Up>& __z)
    {
      const _Tp __r = _M_real * __z.real() - _M_imag * __z.imag();
      _M_imag = _M_real * __z.imag() + _M_imag * __z.real();
      _M_real = __r;
      return *this;
}

我一定是遗漏了什么，所以我的问题是：产生汇编程序的原因是什么？

Answer 1

可能是因为软件浮点默认。根据 gcc 的手册页：

-msoft-float
This option ignored; it is provided for compatibility purposes only.    
software floating point code is emitted by default,
and this default can overridden by FPX options; mspfp, mspfp-compact, 
or mspfp-fast for single precision, and mdpfp, mdpfp-compact, or mdpfp-fast 
for double precision.

尝试使用 -mdpfp 或任何其他替代方法进行编译。

Answer 2

至少对于普通 C，gcc 遵循复杂 multiply/divide 的有点疯狂的 C99 附件 f (?) 规则；也许也适用于 C++。尝试 -fast-math 或 -fcx-fortran-rules。

Answer 3

需要注意的是，严格来说"wrong"用公式

实现复数浮点乘法

(a+i*b)*(c + i*d) = a*c - b*d + i*(b*c + a*d)

我在引号中写错了，因为C++标准实际上并不需要正确的实现。 C 确实在一些附录中指定了它。

简单的实现在输入 Inf and/or NaN 时没有给出正确的结果。

考虑(Inf + 0*i)*(Inf + 0*i)：显然，对于一致的行为，结果应该与真正的浮点数相同，即分别为Inf或(Inf + 0*i)。但是，上面的公式给出 Inf + i*NaN.

所以我可以想象 __muldc3 是一个更长的函数，可以正确处理这些情况。

当电话消失时 -ffast-math 那么这很可能是解释。

Answer 4

Andreas H.的回答给出了方向，在这个回答中我只是尝试把点连接起来。

首先，我对 operator*= 的定义是错误的：floating types, for example for doubles this operator defined as 有一个专门化：

  template<typename _Tp>
    complex&
    operator*=(const complex<_Tp>& __z)
{
  _ComplexT __t;
  __real__ __t = __z.real();
  __imag__ __t = __z.imag();
  _M_value *= __t;
  return *this;
}

和_ComplexTdefined as

typedef __complex__ double _ComplexT;

和_M_valuedefined as

private:
   _ComplexT _M_value;

这意味着，school-formula 用于 std::complex<int> 等类型，但对于浮动类型，它归结为 C-multiplication （__complex__ 是 gcc-extension)，这是标准的一部分，因为 C99，附录 G（最重要的部分在答案的末尾）。

其中一个有问题的案例是：

(0.0+1.0*j)*(inf+inf*j) = (0.0*inf-1*inf)+(0.0*inf+1.0*inf)j
                        =  nan + nan*j

由于约定 (G.3.1)，第一个因子是非零的、有限的，第二个是无限的，因此由于 (G.5.1.4)，结果必须是有限的，而对于nan+nan*j.

__multdc3 中的计算策略很简单：使用学校公式，如果这不起作用（两者都是 nan），将使用更复杂的方法 - 即太复杂了，无法内联。（顺便说一下，clang 内联学校公式并调用 __multdc3 如果这个公式不起作用）。

另一个问题可能是一个产品 overflow/underflow 但两个产品的总和却没有。这可能包含在“可能引发虚假 floating-point 异常”的表述中，但我不是 100% 确定。

G.3.1 部分指出：

A complex or imaginary value with at least one infinite part is regarded as an infinity (even if its other part is a NaN).

A complex or imaginary value is a finite number if each of its parts is a finite number (neither infinite nor NaN).

A complex or imaginary value is a zero if each of its parts is a zero.

第 G.5 节涉及二元运算符并指出：

For most operand types, the value of the result of a binary operator with an imaginary or complex operand is completely determined, with reference to real arithmetic, by the usual mathematical formula. For some operand types, the usual mathematical formula is problematic because of its treatment of infinities and because of undue overflow or underflow; in these cases the result satisfies certain properties (specified in G.5.1), but is not completely determined.

G.5.1 节与乘法运算符有关，并指出：

The * and / operators satisfy the following infinity properties for all real, imaginary,and complex operands:

—if one operand is an infinity and the other operand is a nonzero finite number or an infinity,then the result of the operator is an infinity;

...

If both operands of the * operator are complex or if the second operand of the / operator is complex, the operator raises floating-point exceptions if appropriate for the calculation of the parts of the result, and may raise spurious floating-point exceptions.

当两个 std::complex 相乘时，为什么调用 __muldc3？

Why is __muldc3 called, when two std::complex are multiplied?

c++

gcc

optimization

complex-numbers