共享指针会破坏尾调用优化吗?
Does shared pointer break tail call optimization?
前言
我正在练习 C++ 并尝试实现不可变列表。
在我的一个测试中,我试图递归地创建一个包含很多值(100 万个节点)的列表。所有值都是 const
,所以我无法执行常规循环,而且这还不够 functional,你知道的。
测试失败 Segmentation fault
.
我的系统是 64 位 Xubuntu 16.04 LTS Linux 4.4。
我使用 --std=c++14 -O3
标志用 g++ 5.4 和 clang++ 3.8 编译我的代码。
源代码
我写了一个简单的例子,它展示了这种情况,尾调用应该很容易优化,但是出了点问题,出现了Segmentation fault
。函数 f
只是等待 amount
次迭代,然后创建一个指向单个 int
和 returns it
的指针
#include <memory>
using std::shared_ptr;
shared_ptr<int> f(unsigned amount) {
return amount? f(amount - 1) : shared_ptr<int>{new int};
}
int main() {
return f(1E6) != nullptr;
}
注意 此示例仅在 g++
时失败,而 clang++
没问题。
虽然,在更复杂的例子中它也没有优化。
这是一个递归插入元素的简单列表示例。
我还添加了 destroy
函数,这有助于避免销毁过程中的堆栈溢出。
在这里我得到 Segmentation fault
和两个编译器
#include <memory>
using std::shared_ptr;
struct L {
shared_ptr<L> tail;
L(const L&) = delete;
L() = delete;
};
shared_ptr<L> insertBulk(unsigned amount, const shared_ptr<L>& tail) {
return amount? insertBulk(amount - 1, shared_ptr<L>{new L{tail}})
: tail;
}
void destroy(shared_ptr<L> list) {
if (!list) return;
shared_ptr<L> tail = list->tail;
list.reset();
for (; tail; tail = tail->tail);
}
int main() {
shared_ptr<L> list = shared_ptr<L>{new L{nullptr}};
destroy(insertBulk(1E6, list));
return 0;
}
注意
两个编译器都对使用常规指针的实现进行了很好的优化。
问题
shared_ptr
在我的情况下真的会破坏尾调用优化吗?
是编译器的问题还是 shared_ptr
实现的问题?
回答
简短的回答是:是和否。
C++ 中的共享指针不会破坏尾调用优化,
但它使创建这种递归函数变得复杂,编译器可以将其转换为循环。
详情
在递归构造长列表时避免堆栈溢出
我记得 shared_ptr
有一个析构函数,而 C++ 有 RAII。
正如 Can Tail Call Optimization and RAII Co-Exist? 问题中所讨论的那样,这使得可优化尾调用的构造变得更加困难。
@KennyOstrom 提出使用普通指针来解决这个问题
static const List* insertBulk_(unsigned amount, const List* tail=nullptr) {
return amount? insertBulk_(amount - 1, new List{tail})
: tail;
}
使用以下构造函数
List(const List* tail): tail{tail} {}
当List
的tail
是shared_ptr
的实例时,尾调用优化成功
避免销毁过程中的堆栈溢出
需要自定义销毁策略。
幸运的是,shared_ptr
允许我们设置它,
所以我通过 private
隐藏了 List
的析构函数,
并将其用于列表销毁
static void destroy(const List* list) {
if (!list) return;
shared_ptr<const List> tail = list->tail;
delete list;
for (; tail && tail.use_count() == 1; tail = tail->tail);
}
构造函数应将此销毁函数传递给 tail
初始化列表
List(const List* tail): tail{tail, List::destroy} {}
避免内存泄漏
如果出现异常,我不会进行适当的清理,所以问题还没有解决。
我想使用 shared_ptr
因为它是安全的,但现在我不会将它用于当前列表头,直到构建结束。
需要监视"naked"指针,直到它被包裹成共享指针,并在紧急情况下释放它。
让我们传递对尾部指针的引用,而不是将指针本身传递给 insertBulk_
。
这将允许最后一个好指针在函数外可见
static const List* insertBulk_(unsigned amount, const List*& tail) {
if (!amount) {
const List* result = tail;
tail = nullptr;
return result;
}
return insertBulk_(amount - 1, tail = new List{tail});
}
然后需要Finally
的类似物来自动销毁指针,在异常情况下会泄漏
static const shared_ptr<const List> insertBulk(unsigned amount) {
struct TailGuard {
const List* ptr;
~TailGuard() {
List::destroy(this->ptr);
}
} guard{};
const List* result = insertBulk_(amount, guard.ptr);
return amount? shared_ptr<const List>{result, List::destroy}
: nullptr;
}
解决方案
现在,我想,问题已经解决了:
g++
和 clang++
成功优化了长列表的递归创建;
- 列表仍然使用
shared_ptr
;
- 普通的指针似乎是安全的。
源代码
最终代码为
#include <memory>
#include <cassert>
using std::shared_ptr;
class List {
private:
const shared_ptr<const List> tail;
/**
* I need a `tail` to be an instance of `shared_ptr`.
* Separate `List` constructor was created for this purpose.
* It gets a regular pointer to `tail` and wraps it
* into shared pointer.
*
* The `tail` is a reference to pointer,
* because `insertBulk`, which called `insertBulk_`,
* should have an ability to free memory
* in the case of `insertBulk_` fail
* to avoid memory leak.
*/
static const List* insertBulk_(unsigned amount, const List*& tail) {
if (!amount) {
const List* result = tail;
tail = nullptr;
return result;
}
return insertBulk_(amount - 1, tail = new List{tail});
}
unsigned size_(unsigned acc=1) const {
return this->tail? this->tail->size_(acc + 1) : acc;
}
/**
* Destructor needs to be hidden,
* because it causes stack overflow for long lists.
* Custom destruction method `destroy` should be invoked first.
*/
~List() {}
public:
/**
* List needs custom destruction strategy,
* because default destructor causes stack overflow
* in the case of long lists:
* it will recursively remove its items.
*/
List(const List* tail): tail{tail, List::destroy} {}
List(const shared_ptr<const List>& tail): tail{tail} {}
List(const List&) = delete;
List() = delete;
unsigned size() const {
return this->size_();
}
/**
* Public iterface for private `insertBulk_` method.
* It wraps `insertBulk_` result into `shared_ptr`
* with custom destruction function.
*
* Also it creates a guard for tail,
* which will destroy it if something will go wrong.
* `insertBulk_` should store `tail`,
* which is not yet wrapped into `shared_ptr`,
* in the guard, and set it to `nullptr` in the end
* in order to avoid destruction of successfully created list.
*/
static const shared_ptr<const List> insertBulk(unsigned amount) {
struct TailGuard {
const List* ptr;
~TailGuard() {
List::destroy(this->ptr);
}
} guard{};
const List* result = insertBulk_(amount, guard.ptr);
return amount? shared_ptr<const List>{result, List::destroy}
: nullptr;
}
/**
* Custom destruction strategy,
* which should be called in order to delete a list.
*/
static void destroy(const List* list) {
if (!list) return;
shared_ptr<const List> tail = list->tail;
delete list;
/**
* Watching references count allows us to stop,
* when we reached the node,
* which is used by another list.
*
* Also this prevents long loop of construction and destruction,
* because destruction calls this function `destroy` again
* and it will create a lot of redundant entities
* without `tail.use_count() == 1` condition.
*/
for (; tail && tail.use_count() == 1; tail = tail->tail);
}
};
int main() {
/**
* Check whether we can create multiple lists.
*/
const shared_ptr<const List> list{List::insertBulk(1E6)};
const shared_ptr<const List> longList{List::insertBulk(1E7)};
/**
* Check whether we can use a list as a tail for another list.
*/
const shared_ptr<const List> composedList{new List{list}, List::destroy};
/**
* Checking whether creation works well.
*/
assert(list->size() == 1E6);
assert(longList->size() == 1E7);
assert(composedList->size() == 1E6 + 1);
return 0;
}
源代码的较短版本
List
class 没有评论和检查 main
函数
#include <memory>
using std::shared_ptr;
class List {
private:
const shared_ptr<const List> tail;
static const List* insertBulk_(unsigned amount, const List*& tail) {
if (!amount) {
const List* result = tail;
tail = nullptr;
return result;
}
return insertBulk_(amount - 1, tail = new List{tail});
}
~List() {}
public:
List(const List* tail): tail{tail, List::destroy} {}
List(const shared_ptr<const List>& tail): tail{tail} {}
List(const List&) = delete;
List() = delete;
static const shared_ptr<const List> insertBulk(unsigned amount) {
struct TailGuard {
const List* ptr;
~TailGuard() {
List::destroy(this->ptr);
}
} guard{};
const List* result = insertBulk_(amount, guard.ptr);
return amount? shared_ptr<const List>{result, List::destroy}
: nullptr;
}
static void destroy(const List* list) {
if (!list) return;
shared_ptr<const List> tail = list->tail;
delete list;
for (; tail && tail.use_count() == 1; tail = tail->tail);
}
};
前言
我正在练习 C++ 并尝试实现不可变列表。
在我的一个测试中,我试图递归地创建一个包含很多值(100 万个节点)的列表。所有值都是 const
,所以我无法执行常规循环,而且这还不够 functional,你知道的。
测试失败 Segmentation fault
.
我的系统是 64 位 Xubuntu 16.04 LTS Linux 4.4。
我使用 --std=c++14 -O3
标志用 g++ 5.4 和 clang++ 3.8 编译我的代码。
源代码
我写了一个简单的例子,它展示了这种情况,尾调用应该很容易优化,但是出了点问题,出现了Segmentation fault
。函数 f
只是等待 amount
次迭代,然后创建一个指向单个 int
和 returns it
#include <memory>
using std::shared_ptr;
shared_ptr<int> f(unsigned amount) {
return amount? f(amount - 1) : shared_ptr<int>{new int};
}
int main() {
return f(1E6) != nullptr;
}
注意 此示例仅在 g++
时失败,而 clang++
没问题。
虽然,在更复杂的例子中它也没有优化。
这是一个递归插入元素的简单列表示例。
我还添加了 destroy
函数,这有助于避免销毁过程中的堆栈溢出。
在这里我得到 Segmentation fault
和两个编译器
#include <memory>
using std::shared_ptr;
struct L {
shared_ptr<L> tail;
L(const L&) = delete;
L() = delete;
};
shared_ptr<L> insertBulk(unsigned amount, const shared_ptr<L>& tail) {
return amount? insertBulk(amount - 1, shared_ptr<L>{new L{tail}})
: tail;
}
void destroy(shared_ptr<L> list) {
if (!list) return;
shared_ptr<L> tail = list->tail;
list.reset();
for (; tail; tail = tail->tail);
}
int main() {
shared_ptr<L> list = shared_ptr<L>{new L{nullptr}};
destroy(insertBulk(1E6, list));
return 0;
}
注意 两个编译器都对使用常规指针的实现进行了很好的优化。
问题
shared_ptr
在我的情况下真的会破坏尾调用优化吗?
是编译器的问题还是 shared_ptr
实现的问题?
回答
简短的回答是:是和否。
C++ 中的共享指针不会破坏尾调用优化, 但它使创建这种递归函数变得复杂,编译器可以将其转换为循环。
详情
在递归构造长列表时避免堆栈溢出
我记得 shared_ptr
有一个析构函数,而 C++ 有 RAII。
正如 Can Tail Call Optimization and RAII Co-Exist? 问题中所讨论的那样,这使得可优化尾调用的构造变得更加困难。
@KennyOstrom 提出使用普通指针来解决这个问题
static const List* insertBulk_(unsigned amount, const List* tail=nullptr) {
return amount? insertBulk_(amount - 1, new List{tail})
: tail;
}
使用以下构造函数
List(const List* tail): tail{tail} {}
当List
的tail
是shared_ptr
的实例时,尾调用优化成功
避免销毁过程中的堆栈溢出
需要自定义销毁策略。
幸运的是,shared_ptr
允许我们设置它,
所以我通过 private
隐藏了 List
的析构函数,
并将其用于列表销毁
static void destroy(const List* list) {
if (!list) return;
shared_ptr<const List> tail = list->tail;
delete list;
for (; tail && tail.use_count() == 1; tail = tail->tail);
}
构造函数应将此销毁函数传递给 tail
初始化列表
List(const List* tail): tail{tail, List::destroy} {}
避免内存泄漏
如果出现异常,我不会进行适当的清理,所以问题还没有解决。
我想使用 shared_ptr
因为它是安全的,但现在我不会将它用于当前列表头,直到构建结束。
需要监视"naked"指针,直到它被包裹成共享指针,并在紧急情况下释放它。
让我们传递对尾部指针的引用,而不是将指针本身传递给 insertBulk_
。
这将允许最后一个好指针在函数外可见
static const List* insertBulk_(unsigned amount, const List*& tail) {
if (!amount) {
const List* result = tail;
tail = nullptr;
return result;
}
return insertBulk_(amount - 1, tail = new List{tail});
}
然后需要Finally
的类似物来自动销毁指针,在异常情况下会泄漏
static const shared_ptr<const List> insertBulk(unsigned amount) {
struct TailGuard {
const List* ptr;
~TailGuard() {
List::destroy(this->ptr);
}
} guard{};
const List* result = insertBulk_(amount, guard.ptr);
return amount? shared_ptr<const List>{result, List::destroy}
: nullptr;
}
解决方案
现在,我想,问题已经解决了:
g++
和clang++
成功优化了长列表的递归创建;- 列表仍然使用
shared_ptr
; - 普通的指针似乎是安全的。
源代码
最终代码为
#include <memory>
#include <cassert>
using std::shared_ptr;
class List {
private:
const shared_ptr<const List> tail;
/**
* I need a `tail` to be an instance of `shared_ptr`.
* Separate `List` constructor was created for this purpose.
* It gets a regular pointer to `tail` and wraps it
* into shared pointer.
*
* The `tail` is a reference to pointer,
* because `insertBulk`, which called `insertBulk_`,
* should have an ability to free memory
* in the case of `insertBulk_` fail
* to avoid memory leak.
*/
static const List* insertBulk_(unsigned amount, const List*& tail) {
if (!amount) {
const List* result = tail;
tail = nullptr;
return result;
}
return insertBulk_(amount - 1, tail = new List{tail});
}
unsigned size_(unsigned acc=1) const {
return this->tail? this->tail->size_(acc + 1) : acc;
}
/**
* Destructor needs to be hidden,
* because it causes stack overflow for long lists.
* Custom destruction method `destroy` should be invoked first.
*/
~List() {}
public:
/**
* List needs custom destruction strategy,
* because default destructor causes stack overflow
* in the case of long lists:
* it will recursively remove its items.
*/
List(const List* tail): tail{tail, List::destroy} {}
List(const shared_ptr<const List>& tail): tail{tail} {}
List(const List&) = delete;
List() = delete;
unsigned size() const {
return this->size_();
}
/**
* Public iterface for private `insertBulk_` method.
* It wraps `insertBulk_` result into `shared_ptr`
* with custom destruction function.
*
* Also it creates a guard for tail,
* which will destroy it if something will go wrong.
* `insertBulk_` should store `tail`,
* which is not yet wrapped into `shared_ptr`,
* in the guard, and set it to `nullptr` in the end
* in order to avoid destruction of successfully created list.
*/
static const shared_ptr<const List> insertBulk(unsigned amount) {
struct TailGuard {
const List* ptr;
~TailGuard() {
List::destroy(this->ptr);
}
} guard{};
const List* result = insertBulk_(amount, guard.ptr);
return amount? shared_ptr<const List>{result, List::destroy}
: nullptr;
}
/**
* Custom destruction strategy,
* which should be called in order to delete a list.
*/
static void destroy(const List* list) {
if (!list) return;
shared_ptr<const List> tail = list->tail;
delete list;
/**
* Watching references count allows us to stop,
* when we reached the node,
* which is used by another list.
*
* Also this prevents long loop of construction and destruction,
* because destruction calls this function `destroy` again
* and it will create a lot of redundant entities
* without `tail.use_count() == 1` condition.
*/
for (; tail && tail.use_count() == 1; tail = tail->tail);
}
};
int main() {
/**
* Check whether we can create multiple lists.
*/
const shared_ptr<const List> list{List::insertBulk(1E6)};
const shared_ptr<const List> longList{List::insertBulk(1E7)};
/**
* Check whether we can use a list as a tail for another list.
*/
const shared_ptr<const List> composedList{new List{list}, List::destroy};
/**
* Checking whether creation works well.
*/
assert(list->size() == 1E6);
assert(longList->size() == 1E7);
assert(composedList->size() == 1E6 + 1);
return 0;
}
源代码的较短版本
List
class 没有评论和检查 main
函数
#include <memory>
using std::shared_ptr;
class List {
private:
const shared_ptr<const List> tail;
static const List* insertBulk_(unsigned amount, const List*& tail) {
if (!amount) {
const List* result = tail;
tail = nullptr;
return result;
}
return insertBulk_(amount - 1, tail = new List{tail});
}
~List() {}
public:
List(const List* tail): tail{tail, List::destroy} {}
List(const shared_ptr<const List>& tail): tail{tail} {}
List(const List&) = delete;
List() = delete;
static const shared_ptr<const List> insertBulk(unsigned amount) {
struct TailGuard {
const List* ptr;
~TailGuard() {
List::destroy(this->ptr);
}
} guard{};
const List* result = insertBulk_(amount, guard.ptr);
return amount? shared_ptr<const List>{result, List::destroy}
: nullptr;
}
static void destroy(const List* list) {
if (!list) return;
shared_ptr<const List> tail = list->tail;
delete list;
for (; tail && tail.use_count() == 1; tail = tail->tail);
}
};