Where do code execution "latencies" come from?

Question

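My problem is that code execution frequently shows latencies that I cannot explain. By latency I mean that a piece of code which should take constant time to execute sometimes takes much longer.

I have attached a small C program that performs some "dummy" calculations on CPU core 1. The thread is pinned to this core. I ran it on an Ubuntu 18.04 machine with 192 GiB of RAM and 96 CPU cores. This machine does nothing else.

The tool runs only one thread (the main thread is sleeping), and at least the perf tool shows no context switches (thread switches), so that should not be the problem.

The tool's output looks like this (printed roughly once per second):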

...

Stats:
 Max [us]: 883
 Min [us]: 0
 Avg [us]: 0.022393

...

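These statistics always show the results of 1'000'000 runs. My question is: why is the maximum always so large? The 99.99% quantile is also often very large (I left the quantiles out of the example to keep the code small; the maximum shows the behavior well enough). Why does this happen, and how can I avoid it? In some applications this "variance" is a big problem for me.

Given that nothing else is running, I find these values hard to understand.

Many thanks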

main.c:

#define _GNU_SOURCE

#include <stdio.h>
#include <stdbool.h>
#include <sys/time.h>
#include <pthread.h>
#include <sys/sysinfo.h>

static inline unsigned long now_us()
{
    struct timeval tx;
    gettimeofday(&tx, NULL);
    return tx.tv_sec * 1000000 + tx.tv_usec;
}

static inline int calculate(int x)
{
    /* Do something "expensive" */
    for (int i = 0; i < 1000; ++i) {
        x = (~x * x + (1 - x)) ^ (13 * x);
        x += 2;
    }
    return x;
}

static void *worker(void *arg)
{
    (void)arg;

    const int runs_per_measurement = 1000000;
    int dummy = 0;
    while (true) {
        int max_us = -1;
        int min_us = -1;
        int sum_us = 0;
        for (int i = 0; i < runs_per_measurement; ++i) {
            const long start_us = now_us();
            dummy = calculate(dummy);
            const long runtime_us = now_us() - start_us;
            
            /* Update stats */
            if (max_us < runtime_us) {
                max_us = runtime_us;
            }
            if (min_us < 0 || min_us > runtime_us) {
                min_us = runtime_us;
            }
            sum_us += runtime_us;
        }
        printf("Stats:\n");
        printf(" Max [us]: %d\n", max_us);
        printf(" Min [us]: %d\n", min_us);
        printf(" Avg [us]: %f\n", (double)sum_us / runs_per_measurement);
        printf("\n");
    }

    return NULL;
}

int main()
{
    pthread_t worker_thread;

    if (pthread_create(&worker_thread, NULL, worker, NULL) != 0) {
        printf("Cannot create thread!\n");
        return 1;
    }

    /* Use CPU number 1 */
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(1, &cpuset);

    if (pthread_setaffinity_np(worker_thread, sizeof(cpuset), &cpuset) != 0) {
        printf("Cannot set cpu core!\n");
        return 1;
    }

    pthread_join(worker_thread, NULL);

    return 0;
}

Makefile:

main: main.c
    gcc -o $@ $^ -Ofast -lpthread -Wall -Wextra -Werror

Answer 1

This is a great example of how multitasking works in an operating system.

As stated in the comments above:

"This machine does nothing else" --> absurd. Run ps -e to get an idea of all the other things your machine is doing. – John Bollinger

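Multitasking is achieved by the operating system (specifically the kernel) letting one task run for a while, then pausing it and allowing another task to run.

Effectively, your code runs for a while, is paused while something else runs, then runs for a while again, and so on.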

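This explains the variation in timing that you see, because you are measuring elapsed time rather than 'CPU time' (the time actually spent running). C has some standard functions for measuring CPU time, such as this one from GNU.

CPU scheduling is covered in more detail here.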

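Finally, in order not to be pre-empted, you would need to run your code in kernel space, [...], or in a 'real-time' operating system. (I'll let you google what these terms mean :-) )

The only other solution is to explore Linux/Unix 'nice values' (I'll let you google this too, but basically it assigns a higher or lower priority to your process).

If you are interested in this sort of thing, there is a good book by Robert Love called 'Linux Kernel Development'.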

c

linux

latency

scheduler