How does a thread in NPTL exit?

Question

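I am curious how a single NPTL thread exits, from an implementation perspective.

What I understand from the implementation in glibc-2.30 is:

An NPTL thread is built on top of a light-weight process on Linux, with additional information stored in a pthread object on the user stack to keep track of NPTL-specific information such as the join/detach status and the pointer to the returned object.
When an NPTL thread finishes, it is gone for good; only the user stack (and hence the pthread object) is left to be collected when another thread joins it, unless the thread is detached, in which case the space is freed directly.
The _exit() syscall kills all threads in a thread group.
The user function that pthread_create() takes is actually wrapped in another function, start_thread(), which does some preparation before running the user function and some cleanup afterward.

The question is:

At the end of the wrapper function start_thread(), there is the following comment and code: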

/* We cannot call '_exit' here.  '_exit' will terminate the process.

 The 'exit' implementation in the kernel will signal when the
 process is really dead since 'clone' got passed the CLONE_CHILD_CLEARTID
 flag.  The 'tid' field in the TCB will be set to zero.

 The exit code is zero since in case all threads exit by calling
 'pthread_exit' the exit status must be 0 (zero).  */
 __exit_thread ();

However, __exit_thread() seems to make the _exit() system call anyway:

 static inline void __attribute__ ((noreturn, always_inline, unused))
 __exit_thread (void)
 {
   /* some comments here */
   while (1)
     {
       INTERNAL_SYSCALL_DECL (err);
       INTERNAL_SYSCALL (exit, err, 1, 0);
     }
 }

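So I'm confused here, since it shouldn't really do syscall _exit(), because that would terminate all threads.

pthread_exit() should terminate a single thread, so it should do something similar to what the wrapper start_thread() does at the end. However, it calls __do_cancel(), and honestly I got lost tracing that function. It does not seem to be related to __exit_thread() above, nor does it call _exit().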

Answer 1

I'm confused here, since it shouldn't really do syscall _exit()

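The confusion here stems from mixing up the exit system call with the _exit libc routine (there is no _exit system call on Linux).

The former terminates the current Linux thread (as intended).

The latter (confusingly) does not execute the exit system call. Instead, it executes the exit_group system call, which terminates all threads.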

pthread_exit() should terminate a single thread

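It does, indirectly. It unwinds the current stack (similar to siglongjmp), transferring control to the point where cleanup_jmp_buf was set up. That point is in start_thread.

After the transfer of control, start_thread cleans up resources and calls __exit_thread to actually terminate the thread.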

c

glibc

pthreads

system-calls

linux-kernel