Why does ucontext have such high overhead?

Question

The Boost.Context documentation for Boost v1.59 reports the following performance comparison:

+----------+----------------------+-------------------+-------------------+----------------+
| Platform |      ucontext_t      |    fcontext_t     | execution_context | windows fibers |
+----------+----------------------+-------------------+-------------------+----------------+
| i386     | 708 ns / 754 cycles  | 37 ns / 37 cycles | ns / cycles       | ns / cycles    |
| x86_64   | 547 ns / 1433 cycles | 8 ns / 23 cycles  | 16 ns / 46 cycles | ns / cycles    |
+----------+----------------------+-------------------+-------------------+----------------+

[link]

I believe the source code for these experiments is hosted on GitHub.

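My question is: why is the overhead of ucontext 20 times higher than that of the Boost implementation? I can't see any obvious reason for such a large difference. Does the Boost implementation use some low-level trick that the ucontext implementers missed, or is something else going on here?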

Answer 1

The Boost documentation explains why Boost.Context is faster than the deprecated ucontext_t interface. In the Rationale section, you will find this important note:

Note Context switches do not preserve the signal mask on UNIX systems.

And, in the comparison with makecontext under Other APIs:

ucontext_t preserves signal mask between context switches which involves system calls consuming a lot of CPU cycles.

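As noted above, swapcontext does preserve the signal mask, which requires a system call and all the overhead that comes with one. Since that is precisely the point of the ucontext_t functions, it cannot be described as an oversight. (If you don't want to preserve the signal mask, you can use setjmp and longjmp.)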

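By the way, the ucontext_t functions were deprecated in POSIX edition 6 and removed in edition 7, because (1) the makecontext interface requires an obsolescent feature of C, which is not available in C++ at all; (2) the interfaces are rarely used; and (3) coroutines can be implemented using POSIX threads. (See the note in POSIX edition 6.) (Obviously, threads are not an ideal mechanism for implementing coroutines, but neither is an interface that relies on an obsolescent feature.)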
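As I understand it, the obsolescent feature in point (1) is the function declarator with empty parentheses: makecontext takes an entry point with no prototype and forwards a variable number of int arguments to it. The sketch below, assuming Linux with glibc, shows what that means in practice. The names entry and run_with_arg are hypothetical. An entry point that takes an argument has to be cast to a different function pointer type before it can be passed in, and calling it through that type is something ISO C does not define, even though makecontext is specified to work this way.

```c
#include <ucontext.h>

static ucontext_t caller_ctx, entry_ctx;
static char entry_stack[64 * 1024];
static int seen_value;

/* The entry point has type void(int), not void(void). */
static void entry(int value) {
    seen_value = value;
    /* Returning resumes uc_link (caller_ctx). */
}

/* Hypothetical helper: runs entry(value) on its own stack and returns
 * the value the entry point received, or -1 on error. */
int run_with_arg(int value) {
    seen_value = -1;
    if (getcontext(&entry_ctx) != 0) return -1;
    entry_ctx.uc_stack.ss_sp = entry_stack;
    entry_ctx.uc_stack.ss_size = sizeof entry_stack;
    entry_ctx.uc_link = &caller_ctx;

    /* The cast is required by the interface: argc says one int follows,
     * and makecontext arranges for entry to receive it. */
    makecontext(&entry_ctx, (void (*)(void))entry, 1, value);

    if (swapcontext(&caller_ctx, &entry_ctx) != 0) return -1;
    return seen_value;
}
```

Only int arguments are portable here, so passing a pointer on a 64-bit platform usually means splitting it across two ints or handing it over through a global.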

c

c++

boost

ucontext