共享指针会破坏尾调用优化吗?

Does shared pointer break tail call optimization?

前言

我正在练习 C++ 并尝试实现不可变列表。 在我的一个测试中,我试图递归地创建一个包含很多值(100 万个节点)的列表。所有值都是 const,所以我无法执行常规循环,而且这还不够 functional,你知道的。 测试失败 Segmentation fault.

我的系统是 64 位 Xubuntu 16.04 LTS Linux 4.4。 我使用 --std=c++14 -O3 标志用 g++ 5.4 和 clang++ 3.8 编译我的代码。

源代码

我写了一个简单的例子,它展示了这种情况,尾调用应该很容易优化,但是出了点问题,出现了Segmentation fault。函数 f 只是等待 amount 次迭代,然后创建一个指向单个 int 和 returns it

的指针
#include <memory>

using std::shared_ptr;

shared_ptr<int> f(unsigned amount) {
    return amount? f(amount - 1) : shared_ptr<int>{new int};
}

int main() {
    return f(1E6) != nullptr;
}

注意 此示例仅在 g++ 时失败,而 clang++ 没问题。 虽然,在更复杂的例子中它也没有优化。

这是一个递归插入元素的简单列表示例。 我还添加了 destroy 函数,这有助于避免销毁过程中的堆栈溢出。 在这里我得到 Segmentation fault 和两个编译器

#include <memory>

using std::shared_ptr;

struct L {
    shared_ptr<L> tail;

    L(const L&) = delete;
    L() = delete;
};

shared_ptr<L> insertBulk(unsigned amount, const shared_ptr<L>& tail) {
    return amount? insertBulk(amount - 1, shared_ptr<L>{new L{tail}})
                 : tail;
}

void destroy(shared_ptr<L> list) {
    if (!list) return;

    shared_ptr<L> tail = list->tail;
    list.reset();

    for (; tail; tail = tail->tail);
}

int main() {
    shared_ptr<L> list = shared_ptr<L>{new L{nullptr}};
    destroy(insertBulk(1E6, list));
    return 0;
}

注意 两个编译器都对使用常规指针的实现进行了很好的优化。

问题

shared_ptr 在我的情况下真的会破坏尾调用优化吗? 是编译器的问题还是 shared_ptr 实现的问题?

回答

简短的回答是:是和否。

C++ 中的共享指针不会破坏尾调用优化, 但它使创建这种递归函数变得复杂,编译器可以将其转换为循环。

详情

在递归构造长列表时避免堆栈溢出

我记得 shared_ptr 有一个析构函数,而 C++ 有 RAII。 正如 Can Tail Call Optimization and RAII Co-Exist? 问题中所讨论的那样,这使得可优化尾调用的构造变得更加困难。

@KennyOstrom 提出使用普通指针来解决这个问题

static const List* insertBulk_(unsigned amount, const List* tail=nullptr) {
    return amount? insertBulk_(amount - 1, new List{tail})
                 : tail;
}

使用以下构造函数

List(const List* tail): tail{tail} {}

Listtailshared_ptr的实例时,尾调用优化成功

避免销毁过程中的堆栈溢出

需要自定义销毁策略。 幸运的是,shared_ptr 允许我们设置它, 所以我通过 private 隐藏了 List 的析构函数, 并将其用于列表销毁

static void destroy(const List* list) {
    if (!list) return;

    shared_ptr<const List> tail = list->tail;
    delete list;
    for (; tail && tail.use_count() == 1; tail = tail->tail);
}

构造函数应将此销毁函数传递给 tail 初始化列表

List(const List* tail): tail{tail, List::destroy} {}

避免内存泄漏

如果出现异常,我不会进行适当的清理,所以问题还没有解决。 我想使用 shared_ptr 因为它是安全的,但现在我不会将它用于当前列表头,直到构建结束。

需要监视"naked"指针,直到它被包裹成共享指针,并在紧急情况下释放它。 让我们传递对尾部指针的引用,而不是将指针本身传递给 insertBulk_。 这将允许最后一个好指针在函数外可见

static const List* insertBulk_(unsigned amount, const List*& tail) {
    if (!amount) {
        const List* result = tail;
        tail = nullptr;
        return result;
    }
    return insertBulk_(amount - 1, tail = new List{tail});
}

然后需要Finally的类似物来自动销毁指针,在异常情况下会泄漏

static const shared_ptr<const List> insertBulk(unsigned amount) {
    struct TailGuard {
        const List* ptr;
        ~TailGuard() {
            List::destroy(this->ptr);
        }
    } guard{};
    const List* result = insertBulk_(amount, guard.ptr);
    return amount? shared_ptr<const List>{result, List::destroy}
                 : nullptr;
}

解决方案

现在,我想,问题已经解决了:

  • g++clang++ 成功优化了长列表的递归创建;
  • 列表仍然使用 shared_ptr;
  • 普通的指针似乎是安全的。

源代码

最终代码为

#include <memory>
#include <cassert>

using std::shared_ptr;

class List {
    private:
        const shared_ptr<const List> tail;

        /**
         * I need a `tail` to be an instance of `shared_ptr`.
         * Separate `List` constructor was created for this purpose.
         * It gets a regular pointer to `tail` and wraps it
         * into shared pointer.
         *
         * The `tail` is a reference to pointer,
         * because `insertBulk`, which called `insertBulk_`,
         * should have an ability to free memory
         * in the case of `insertBulk_` fail
         * to avoid memory leak.
         */
        static const List* insertBulk_(unsigned amount, const List*& tail) {
            if (!amount) {
                const List* result = tail;
                tail = nullptr;
                return result;
            }
            return insertBulk_(amount - 1, tail = new List{tail});
        }
        unsigned size_(unsigned acc=1) const {
            return this->tail? this->tail->size_(acc + 1) : acc;
        }
        /**
         * Destructor needs to be hidden,
         * because it causes stack overflow for long lists.
         * Custom destruction method `destroy` should be invoked first.
         */
        ~List() {}
    public:
        /**
         * List needs custom destruction strategy,
         * because default destructor causes stack overflow
         * in the case of long lists:
         * it will recursively remove its items.
         */
        List(const List* tail): tail{tail, List::destroy} {}
        List(const shared_ptr<const List>& tail): tail{tail} {}
        List(const List&) = delete;
        List() = delete;

        unsigned size() const {
            return this->size_();
        }

        /**
         * Public iterface for private `insertBulk_` method.
         * It wraps `insertBulk_` result into `shared_ptr`
         * with custom destruction function.
         *
         * Also it creates a guard for tail,
         * which will destroy it if something will go wrong.
         * `insertBulk_` should store `tail`,
         * which is not yet wrapped into `shared_ptr`,
         * in the guard, and set it to `nullptr` in the end
         * in order to avoid destruction of successfully created list.
         */
        static const shared_ptr<const List> insertBulk(unsigned amount) {
            struct TailGuard {
                const List* ptr;
                ~TailGuard() {
                    List::destroy(this->ptr);
                }
            } guard{};
            const List* result = insertBulk_(amount, guard.ptr);
            return amount? shared_ptr<const List>{result, List::destroy}
                         : nullptr;
        }
        /**
         * Custom destruction strategy,
         * which should be called in order to delete a list.
         */
        static void destroy(const List* list) {
            if (!list) return;

            shared_ptr<const List> tail = list->tail;
            delete list;

            /**
             * Watching references count allows us to stop,
             * when we reached the node,
             * which is used by another list.
             *
             * Also this prevents long loop of construction and destruction,
             * because destruction calls this function `destroy` again
             * and it will create a lot of redundant entities
             * without `tail.use_count() == 1` condition.
             */
            for (; tail && tail.use_count() == 1; tail = tail->tail);
        }
};

int main() {
    /**
     * Check whether we can create multiple lists.
     */
    const shared_ptr<const List> list{List::insertBulk(1E6)};
    const shared_ptr<const List> longList{List::insertBulk(1E7)};
    /**
     * Check whether we can use a list as a tail for another list.
     */
    const shared_ptr<const List> composedList{new List{list}, List::destroy};
    /**
     * Checking whether creation works well.
     */
    assert(list->size() == 1E6);
    assert(longList->size() == 1E7);
    assert(composedList->size() == 1E6 + 1);
    return 0;
}

源代码的较短版本

List class 没有评论和检查 main 函数

#include <memory>

using std::shared_ptr;

class List {
    private:
        const shared_ptr<const List> tail;

        static const List* insertBulk_(unsigned amount, const List*& tail) {
            if (!amount) {
                const List* result = tail;
                tail = nullptr;
                return result;
            }
            return insertBulk_(amount - 1, tail = new List{tail});
        }
        ~List() {}
    public:
        List(const List* tail): tail{tail, List::destroy} {}
        List(const shared_ptr<const List>& tail): tail{tail} {}
        List(const List&) = delete;
        List() = delete;

        static const shared_ptr<const List> insertBulk(unsigned amount) {
            struct TailGuard {
                const List* ptr;
                ~TailGuard() {
                    List::destroy(this->ptr);
                }
            } guard{};
            const List* result = insertBulk_(amount, guard.ptr);
            return amount? shared_ptr<const List>{result, List::destroy}
                         : nullptr;
        }
        static void destroy(const List* list) {
            if (!list) return;

            shared_ptr<const List> tail = list->tail;
            delete list;

            for (; tail && tail.use_count() == 1; tail = tail->tail);
        }
};