Why does the kretprobe of _do_fork() only return once?

Question

When I write a small program that uses fork, the syscall returns twice, once in each process:

#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    int pid = fork();

    if (pid == 0) {
        // child
    } else if (pid > 0) {
        // parent
    }
}

If I instrument this with SystemTap, I only see one return:

// fork() in libc calls clone on Linux
probe syscall.clone.return {
    printf("Return from clone\n")
}

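(SystemTap installs the probe on _do_fork rather than on clone, but that shouldn't change anything.)

This confuses me. A few related questions:

Why does the syscall only return once?
If I understand the _do_fork code correctly, the process is cloned in the middle of the function (copy_process and wake_up_new_task). Shouldn't the subsequent code run in both processes?
Does the kernel code after a syscall run in the same thread / process as the user code before the syscall?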

Answer 1

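Creating the child can fail, so errors have to be detected and handled.
The child gets a different return value, and that has to be handled as well.
The parent may have cleanup or other additional work to do.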

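So if both processes ran this code, it would have to distinguish between executing as the parent and executing as the child. There is no such check, which is already a strong hint that the child never executes this code in the first place. One should therefore look for a dedicated place that new children return to.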

Since the code is big and hairy, one can cheat and simply search for 'fork' in the arch-specific code, which quickly turns up ret_from_fork.

ret_from_fork is set as the child's starting point along the call chain do_fork -> copy_process -> copy_thread_tls: http://lxr.free-electrons.com/source/arch/x86/kernel/process_64.c#L158
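To make the idea concrete, here is a toy user-space model of what copy_thread_tls arranges. It is a sketch under stated assumptions, not kernel code: every `toy_` name is hypothetical, and the real function works on the saved register frame and kernel stack rather than on a two-field struct. The only points it tries to capture are that the child's saved state is a copy of the parent's with the syscall result forced to 0, and that the child is pointed at a dedicated entry point instead of at the middle of the fork function.

```c
#include <stdint.h>

/* Toy stand-ins for illustration only; these are NOT the kernel's types. */
enum toy_resume_point {
    TOY_RESUME_IN_DO_FORK,    /* keep running the function that was executing */
    TOY_RESUME_RET_FROM_FORK  /* start at the dedicated child entry point */
};

struct toy_task {
    uint64_t user_ax;                /* value user space will see as the syscall result */
    enum toy_resume_point resume_at; /* where the task next runs inside the kernel */
};

/*
 * Models the idea behind copy_thread_tls: the child starts as a copy of the
 * parent, but its syscall result is 0 and it resumes at ret_from_fork.
 */
static void toy_copy_thread(struct toy_task *child, const struct toy_task *parent)
{
    *child = *parent;
    child->user_ax = 0;
    child->resume_at = TOY_RESUME_RET_FROM_FORK;
}

/*
 * Models the parent's path through the fork function. Only the parent
 * executes the return statement, so a return probe placed on this
 * function fires once.
 */
long toy_do_fork(struct toy_task *parent, struct toy_task *child, long child_pid)
{
    toy_copy_thread(child, parent);
    parent->user_ax = (uint64_t)child_pid;
    return child_pid;
}
```

After `toy_do_fork(&parent, &child, 4242)`, the parent still resumes in the fork function with a result of 4242, while the child carries a result of 0 and a resume point that never passes through `toy_do_fork`'s return.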

Thus:

Why does the syscall only return once?

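It does not return only once. Two threads return; the second one (the child) simply takes a different code path. Because the probe is installed only on the first path, you never see the other one. See also below.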
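The two returns can be observed from user space. The sketch below uses a hypothetical helper, `fork_returns_twice`, that forks once, has the child report its own pid over a pipe, and lets the parent check that the value it got from fork() matches. It assumes a POSIX system and says nothing about which kernel path each return took; it only shows that there are two returning processes with different identities.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Forks once. Returns 0 if the parent saw the child's pid as fork()'s
 * result while the child (a different process) saw 0, and -1 otherwise.
 * fork_returns_twice is a hypothetical helper name.
 */
int fork_returns_twice(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t caller = getpid();
    pid_t ret = fork();
    if (ret == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (ret == 0) {
        /* Child: fork() returned 0 here, in a new process. */
        pid_t self = getpid();
        close(fds[0]);
        ssize_t w = write(fds[1], &self, sizeof self);
        _exit(w == (ssize_t)sizeof self ? 0 : 1);
    }

    /* Parent: still the process that made the call; ret is the child's pid. */
    close(fds[1]);
    pid_t reported = -1;
    ssize_t n = read(fds[0], &reported, sizeof reported);
    close(fds[0]);

    int status = 0;
    if (waitpid(ret, &status, 0) != ret)
        return -1;

    int ok = n == (ssize_t)sizeof reported
          && reported == ret
          && getpid() == caller
          && WIFEXITED(status) && WEXITSTATUS(status) == 0;
    return ok ? 0 : -1;
}
```

The `getpid() == caller` check also reflects the answer to the last question: the process that entered the kernel is the same one that comes back out on the parent's side.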

If I understand the _do_fork code correctly, the process is cloned in the middle of the function. (copy_process and wake_up_new_task). Shouldn't the subsequent code run in both processes?

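As noted above, this is false. The real question is what benefit there would be in making the child return in the same place as the parent. I don't see any, and it would be troublesome (extra special-casing, as described above). To restate: making the child return elsewhere means callers never have to deal with a returning child. They only need to check for errors.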

Does the kernel code after a syscall run in the same thread / process as the user code before the syscall?

What is 'kernel code after a syscall'? If you are thread X and you enter the kernel, you are still thread X.
