递归函数的内联

Inlining of a recursive function

当我尝试编译这段代码时:

#include <iostream>
#include <limits.h>

// End recursive template-expansion of function select below.
template <typename Type>
static inline constexpr Type select(unsigned index)
{ return Type(); }

// Select one of the items passed to it.
// e.g. select(0, a, b, c) = a; select(1, a, b, c) = b; etc.
template <typename Type, typename... Params>
[[gnu::always_inline]]
static inline constexpr Type select(unsigned index, Type value, Params... values)
{ return index == 0 ? value : select<Type>(index - 1, values...); }

template <typename Type>
[[gnu::always_inline]]
static inline constexpr Type reflect_mask_helper_1(Type mask, Type shift, Type value)
{ return ((value & mask) >> shift) | ((value << shift) & mask); }

template <typename Type>
[[gnu::always_inline]]
static inline constexpr Type reflect_mask_helper_0(unsigned i, Type value)
{
  return i == 0
    ? value
    : reflect_mask_helper_0(
        i - 1,
        reflect_mask_helper_1<Type>(
          select(i - 1, 0xaaaaaaaaaaaaaaaa, 0xcccccccccccccccc, 0xf0f0f0f0f0f0f0f0,
                        0xff00ff00ff00ff00, 0xffff0000ffff0000, 0xffffffff00000000),
          1 << (i - 1),
          value));
}

template <typename Type>
[[gnu::flatten]]
static inline constexpr Type reflect_mask(Type value)
{ return reflect_mask_helper_0(__builtin_ctz(sizeof(Type) * CHAR_BIT), value); }

int main(void) {
  for (int i = 0; i < 65536; i++) {
    std::cout << reflect_mask<uint16_t>(i) << std::endl;
  }
}

gcc 给我一个错误,说函数 reflect_mask_helper_0 不能内联,因为它是递归的。然而,函数 select 也是递归的,但 gcc 没有抱怨地内联它。我在这里错过了什么?

(我需要它是递归的,因为 constexpr 函数在 C++11 下不能包含循环。)

错误信息:

% g++ test.cpp -O3 -march=native -c
test.cpp: In function ‘constexpr Type reflect_mask_helper_0(unsigned int, Type) [with Type = short unsigned int]’:
test.cpp:23:30: error: inlining failed in call to always_inline ‘constexpr Type reflect_mask_helper_0(unsigned int, Type) [with Type = short unsigned int]’: recursive inlining
   23 | static inline constexpr Type reflect_mask_helper_0(unsigned i, Type value)
      |                              ^~~~~~~~~~~~~~~~~~~~~
test.cpp:27:28: note: called from here
   27 |     : reflect_mask_helper_0(
      |       ~~~~~~~~~~~~~~~~~~~~~^
   28 |         i - 1,
      |         ~~~~~~              
   29 |         reflect_mask_helper_1<Type>(
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   30 |           select(i - 1, 0xaaaaaaaaaaaaaaaa, 0xcccccccccccccccc, 0xf0f0f0f0f0f0f0f0,
      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   31 |                         0xff00ff00ff00ff00, 0xffff0000ffff0000, 0xffffffff00000000),
      |                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   32 |           1 << (i - 1),
      |           ~~~~~~~~~~~~~     
   33 |           value));
      |           ~~~~~~~           
test.cpp: In function ‘int main()’:
test.cpp:23:30: error: inlining failed in call to always_inline ‘constexpr Type reflect_mask_helper_0(unsigned int, Type) [with Type = short unsigned int]’: recursive inlining
   23 | static inline constexpr Type reflect_mask_helper_0(unsigned i, Type value)
      |                              ^~~~~~~~~~~~~~~~~~~~~
test.cpp:27:28: note: called from here
   27 |     : reflect_mask_helper_0(
      |       ~~~~~~~~~~~~~~~~~~~~~^
   28 |         i - 1,
      |         ~~~~~~              
   29 |         reflect_mask_helper_1<Type>(
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   30 |           select(i - 1, 0xaaaaaaaaaaaaaaaa, 0xcccccccccccccccc, 0xf0f0f0f0f0f0f0f0,
      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   31 |                         0xff00ff00ff00ff00, 0xffff0000ffff0000, 0xffffffff00000000),
      |                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   32 |           1 << (i - 1),
      |           ~~~~~~~~~~~~~     
   33 |           value));
      |           ~~~~~~~

select 实际上并不调用 自身 。它弹出它收到的类型列表的前面,然后调用 select<Type, ...> 的另一个特化 。尾随参数包不同。由于 "recursion" 本质上是一组有限的嵌套函数调用(不同的函数),GCC 可以直接看穿它,而不管 运行-time 参数如何。

但是 reflect_mask_helper_0 会无限期地使用相同的模板参数调用 自身 。 GCC 无法判断此 运行 时间的递归在 运行 时间的深度。回想一下,constexpr 函数仍然是一个必须在 运行 时间调用的常规函数​​。

如果您查看生成的汇编代码,如果您删除 always_inlineflatten 属性,您会发现 gcc 实际上正确地内联了所有内容。

所以,这个问题是 QoI 问题。也许,在那一点上,当 always_inline 被处理时,它不能被内联(因此出现错误消息),但是 gcc 决定无论如何都要内联它。

顺便说一下,你可以微调 gcc,稍微修改你的代码,gcc 就可以编译它:

  • --param max-early-inliner-iterations=3传递给gcc
  • 删除 flatten 属性(不知道,为什么它很重要...)

(所以,实际上,这个问题与递归调用无关 - 从编译器的角度来看,函数是否递归并不重要,它只是遵循代码的流程 - 到一定程度上,当然。这里,递归深度只有 4,对于编译器来说并不难遵循)

这是我找到的解决方案,感谢 and to

(至于我之前在编译后的二进制文件中留下未使用的函数模板实例的问题,我通过编译原始代码解决了这个问题——没有 gnu::always_inlinegnu::flatten 属性——带有参数-ffunction-sections -fdata-sections -Wl,--gc-sections.)

现在 reflect_mask_helper_0struct 中(因为 C++ 不允许函数模板的部分特化),函数的 i 参数变成了 Index struct 模板的参数。

#include <iostream>
#include <limits.h>

// End recursive template-expansion of function select below.
template <typename Type>
static inline constexpr Type select(unsigned index)
{ return Type(); }

// Select one of the items passed to it.
// e.g. select(0, a, b, c) = a; select(1, a, b, c) = b; etc.
template <typename Type, typename... Params>
[[gnu::always_inline]]
static inline constexpr Type select(unsigned index, Type value, Params... values)
{ return index == 0 ? value : select<Type>(index - 1, values...); }

template <typename Type>
[[gnu::always_inline]]
static inline constexpr Type reflect_mask_helper_1(Type mask, Type shift, Type value)
{ return ((value & mask) >> shift) | ((value << shift) & mask); }

template <typename Type, unsigned Index>
struct reflect_mask_helper_0
{
  [[gnu::always_inline]]
  static inline constexpr Type invoke(Type value)
  {
    return reflect_mask_helper_0<Type, Index - 1>::call(
      reflect_mask_helper_1<Type>(
        static_cast<Type>(select(Index - 1,
          0xaaaaaaaaaaaaaaaa, 0xcccccccccccccccc, 0xf0f0f0f0f0f0f0f0,
          0xff00ff00ff00ff00, 0xffff0000ffff0000, 0xffffffff00000000)),
        1 << (Index - 1),
        value));
  }
};

template <typename Type>
struct reflect_mask_helper_0<Type, 0>
{
  [[gnu::always_inline]]
  static inline constexpr Type invoke(Type value) { return value; }
};

template <typename Type>
static inline constexpr Type reflect_mask(Type value)
{ return reflect_mask_helper_0<Type, __builtin_ctz(sizeof(Type) * CHAR_BIT)>::invoke(value); }

int main(void) {
  for (int i = 0; i < 65536; i++) {
    std::cout << reflect_mask<uint16_t>(i) << std::endl;
  }
}