Using getpid() with clone() results in SIGSEGV

I am trying to run a simple clone() example using the following code:

#define _GNU_SOURCE  
#include <linux/sched.h>
#include <stdio.h>
#include <sched.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int child_func(void* arg) {

//  printf("Child is running with PID %d\n", getpid());
  printf("Child is running\n");
  return 0;  
} 

int main() {

    printf("main() started\n");

    pid_t child_pid = clone(child_func, malloc(4096), SIGCHLD, NULL);
    pid_t parent_pid = getpid();

    printf("Parent pid: %d\n", (int)parent_pid);
    printf("Child pid: %d\n", (int)child_pid);

}

Everything works fine:

$ ./clone_example 
main() started
Parent pid: 9200
Child pid: 9201
Child is running

until I change child_func() by adding a call to getpid():

...
int child_func(void* arg) {

  printf("Child is running with PID %d\n", getpid());
//  printf("Child is running\n");
  return 0;  
} 
...

After recompiling this code, child_func() starts failing.

The console output looks like this:

$ ./clone_example 
main() started
Parent pid: 11085
Child pid: 11086

If I run it under strace:

$ strace -o clone_example.log -ff ./clone_example 
main() started
Parent pid: 11655
Child pid: 11656

In the child's log, clone_example.log.11656, I see the following:

>     --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x562696b1dff8} ---
>     +++ killed by SIGSEGV (core dumped) +++

Why does this happen? What am I doing wrong here?

From man 2 clone:

The child_stack argument specifies the location of the stack used by the child process. Since the child and calling process may share memory, it is not possible for the child process to execute in the same stack as the calling process. The calling process must therefore set up memory space for the child stack and pass a pointer to this space to clone(). Stacks grow downward on all processors that run Linux (except the HP PA processors), so child_stack usually points to the topmost address of the memory space set up for the child stack.

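Your child runs into a segmentation fault because the stack grows downward and you are passing a pointer to the start of the newly allocated memory region, when you should be passing a pointer to the end of that region. It only shows up once you add another function call (getpid()), because without that call your child process does not use as much stack.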

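The correct call passes the address just past the end of the buffer, which is where a downward-growing stack starts and which also keeps the stack pointer aligned:

pid_t child_pid = clone(child_func, (char *)malloc(4096) + 4096, SIGCHLD, NULL);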

PS: I assume the inline call to malloc() is only there to keep the example short, but you should check malloc()'s return value before passing it to the child.
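Putting both points together, here is a minimal sketch of how the whole thing might look. The helper name run_child() and the 64 KiB STACK_SIZE are my own assumptions for illustration, not something from the question; 4096 bytes is quite tight once printf() is involved, so a larger stack seems safer. The child also flushes stdout explicitly, since as far as I know the child created by clone() terminates without running the usual stdio cleanup when its function returns.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumed size, chosen for illustration; not taken from the question. */
#define STACK_SIZE (64 * 1024)

/* Runs in the child and calls getpid(), like the failing version did. */
static int child_func(void *arg) {
    (void)arg;
    printf("Child is running with PID %d\n", (int)getpid());
    fflush(stdout); /* make sure the line is written even when stdout is a pipe */
    return 0;
}

/* Hypothetical helper: clones a child on a heap-allocated stack, waits for
 * it, and returns the child's exit status, or -1 on any failure. */
static int run_child(void) {
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL) {
        perror("malloc");
        return -1;
    }

    fflush(NULL); /* avoid the child re-emitting output buffered by the parent */

    /* The stack grows downward, so pass the address just past the end. */
    pid_t child_pid = clone(child_func, stack + STACK_SIZE, SIGCHLD, NULL);
    if (child_pid == -1) {
        perror("clone");
        free(stack);
        return -1;
    }

    /* Without CLONE_VM the child has its own copy of the stack, but waiting
     * first keeps the lifetime obvious and reaps the child. */
    int status = 0;
    if (waitpid(child_pid, &status, 0) == -1) {
        perror("waitpid");
        free(stack);
        return -1;
    }
    free(stack);

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_child() from main() should print the child's PID and return 0 if the child exited normally; a -1 means that malloc(), clone() or waitpid() failed, or that the child was killed by a signal.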