What is the getpid work procedure in glibc?

Question

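The situation is as follows:

I am trying to hack on the kernel using a project from GitHub. The kernel version is linux-3.18.6.

QEMU is used to emulate the environment.

In my application, I try to understand the system call procedure by following the calls. To do that, the application works like a shell program: I just created some commands that each run the relevant system call.

The code is simple, as follows:

1 Use the API getpid.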

int Getpid(int argc, char **argv)
{
    pid_t pid;
    pid = getpid();
    printf("current process's pid:%d\n",pid);
    return 0;
}

2 Use int $0x80 directly.

int GetpidAsm(int argc, char **argv)
{
    pid_t pid;
    asm volatile(
    "mov $20, %%eax\n\t"      /* 20 = __NR_getpid on i386 */
    "int $0x80\n\t"
    "mov %%eax, %0\n\t"
    :"=m"(pid)
    );
    printf("current process's pid(ASM):%d\n",pid);
    return 0;
}

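Because my application runs as the process with pid 1, every time I type the getpid command it returns 1. That is expected.

The strange thing is that when I use gdb to debug the system call procedure, it stops at the breakpoint on sys_getpid only once, the first time I type getpid. When I run the command again and again, it just prints the output without ever stopping at the breakpoint.

As far as I understand, the use of int $0x80 is absolutely correct.

To solve the problem, I did some research. I downloaded the glibc source (glibc-2.25) to see how the API getpid wraps int $0x80. Unfortunately, it isn't there, or I just didn't find the right place.

Some code from glibc: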

pid_t getpid(void)
{
  pid_t (*f)(void);
  f = (pid_t (*)(void)) dlsym (RTLD_NEXT, "getpid");
  if (f == NULL)
    error (EXIT_FAILURE, 0, "dlsym (RTLD_NEXT, \"getpid\"): %s", dlerror ());
  return (pid2 = f()) + 26;
}

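If I have found the wrong code, please tell me, thanks.

As the code shows, the definition of getpid is not in there. After reading some material, I found that some people point to the VDSO: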

Notice that, AFAIK, a significant part of the cost of simple syscalls is going from user-space to kernel and back. Hence, for some syscalls (probably gettimeofday, getpid ...) the VDSO might avoid even that (and technically might avoid doing a real syscall).

From the getpid man page:

C library/kernel differences
Since glibc version 2.3.4, the glibc wrapper function for getpid() caches PIDs, so as to avoid additional system calls when a process calls getpid() repeatedly. Normally this caching is invisible, but its correct operation relies on support in the wrapper functions for fork(2), vfork(2), and clone(2): if an application bypasses the glibc wrappers for these system calls by using syscall(2), then a call to getpid() in the child will return the wrong value (to be precise: it will return the PID of the parent process). See also clone(2) for discussion of a case where getpid() may return the wrong value even when invoking clone(2) via the glibc wrapper function.

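Despite all these explanations, I still can't figure out how the API getpid works.

By contrast, the API time is easy to understand. The definition of time: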

time_t
time (time_t *t)
{
  INTERNAL_SYSCALL_DECL (err);
  time_t res = INTERNAL_SYSCALL (time, err, 1, NULL);
  /* There cannot be any error.  */
  if (t != NULL)
    *t = res;
  return res;
}

Then,

#define INTERNAL_SYSCALL(name, err, nr, args...)            \
    internal_syscall##nr ("li\t%0, %2\t\t\t# " #name "\n\t",    \
                  "IK" (SYS_ify (name)),            \
                  0, err, args)

Finally, it comes down to inline asm, the usual way of entering the kernel:

#define internal_syscall1(v0_init, input, number, err, arg1)        \
({                                  \
    long _sys_result;                       \
                                    \
    {                               \
    register long __s0 asm ("$16") __attribute__ ((unused))  \
      = (number);                           \
    register long __v0 asm ("$2");                \
    register long __a0 asm ("$4") = (long) (arg1);        \
    register long __a3 asm ("$7");                \
    __asm__ volatile (                      \
    ".set\tnoreorder\n\t"                       \
    v0_init                             \
    "syscall\n\t"                           \
    ".set reorder"                          \
    : "=r" (__v0), "=r" (__a3)                  \
    : input, "r" (__a0)                     \
    : __SYSCALL_CLOBBERS);                      \
    err = __a3;                         \
    _sys_result = __v0;                     \
    }                               \
    _sys_result;                            \
})

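Can someone explain clearly how the API getpid works? Why does getpid trap into the system call sys_getpid only once? If possible, please recommend some references.

Thanks for your help.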

Answer 1

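First, note that the glibc source code is almost impossible to navigate.

As you've noticed, the documentation states that getpid() caches its result. The code you found, which looks like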

pid_t getpid(void)
{
  pid_t (*f)(void);
  f = (pid_t (*)(void)) dlsym (RTLD_NEXT, "getpid");
  if (f == NULL)
    error (EXIT_FAILURE, 0, "dlsym (RTLD_NEXT, \"getpid\"): %s", dlerror ());
  return (pid2 = f()) + 26;
}

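is just a wrapper. It looks up the getpid symbol and calls that function. That function is the one you need to find. It is an alias for the __getpid() function, which you can find in the file sysdeps/unix/sysv/linux/getpid.c and which is also shown at the bottom of this post.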

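Now, you might be looking at glibc source code that doesn't match the glibc you are actually running: there was a big change to exactly this getpid() caching in November 2016, in this commit. As far as I can tell, that change is part of glibc-2.25, released in February 2017.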

The older getpid() implementation, which cached its value to avoid calling the getpid() system call more than once, can be seen here: http://repo.or.cz/glibc.git/blob/93eb85ceb25ee7aff432ddea0abf559f53d7a5fc:/sysdeps/unix/sysv/linux/getpid.c and looks like

#if IS_IN (libc)
static inline __attribute__((always_inline)) pid_t
really_getpid (pid_t oldval)
{
  if (__glibc_likely (oldval == 0))
    {
      pid_t selftid = THREAD_GETMEM (THREAD_SELF, tid);
      if (__glibc_likely (selftid != 0))
        return selftid;
    }

  INTERNAL_SYSCALL_DECL (err);
  pid_t result = INTERNAL_SYSCALL (getpid, err, 0);

  /* We do not set the PID field in the TID here since we might be
     called from a signal handler while the thread executes fork.  */
  if (oldval == 0)
    THREAD_SETMEM (THREAD_SELF, tid, result);
  return result;
}
#endif

pid_t
__getpid (void)
{
#if !IS_IN (libc)
  INTERNAL_SYSCALL_DECL (err);
  pid_t result = INTERNAL_SYSCALL (getpid, err, 0);
#else
  pid_t result = THREAD_GETMEM (THREAD_SELF, pid);
  if (__glibc_unlikely (result <= 0))
    result = really_getpid (result);
#endif
  return result;
}

libc_hidden_def (__getpid)
weak_alias (__getpid, getpid)
libc_hidden_def (getpid)


linux

glibc

system-calls

linux-kernel

vdso