How to make the PERF_COUNT_SW_CONTEXT_SWITCHES config in perf_event_open() work?

Question

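I am setting up profiling for software I am writing, but I cannot get a context-switch count using perf_event_open.

To test the problem, I also tried the example code from the perf_event_open man page. I forced context switches by calling sched_yield and by running parallel processes pinned to the same core with taskset. The context-switch count from perf_event_open() is still 0. (With perf stat I get non-zero numbers: thousands in a large loop.) I also tried reading files and using mmap to force page faults.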

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/perf_event.h>
#include <asm/unistd.h>
#include <iostream>
#include <sys/mman.h>
using namespace std;
int buf_size_shift = 8;

static unsigned perf_mmap_size(int buf_size_shift)
{
    return ((1U << buf_size_shift) + 1) * sysconf(_SC_PAGESIZE);
}

static long
perf_event_open(struct perf_event_attr *hw_event, pid_t pid,
            int cpu, int group_fd, unsigned long flags)
{
        int ret;

        ret = syscall(__NR_perf_event_open, hw_event, pid, cpu,
                      group_fd, flags);
        return ret;
}


int main(int argc, char **argv)
{

       struct perf_event_attr pe;
       long long count;
       int fd;

       memset(&pe, 0, sizeof(struct perf_event_attr));
       pe.type = PERF_TYPE_SOFTWARE;
       //pe.sample_type = PERF_SAMPLE_CALLCHAIN; /* this is what allows you to obtain callchains */

       pe.size = sizeof(struct perf_event_attr);
       pe.config = PERF_COUNT_SW_CONTEXT_SWITCHES;
       pe.disabled = 1;
       pe.exclude_kernel = 1;
       pe.sample_period = 1000;
       pe.exclude_hv = 1;

       fd = perf_event_open(&pe, 0, -1, -1, 0); 
       if (fd == -1) {
          fprintf(stderr, "Error opening leader %llx\n", pe.config);
          exit(EXIT_FAILURE);
       }

       /* associate a buffer with the file */
       struct perf_event_mmap_page *mpage;
       mpage = (perf_event_mmap_page*) mmap(NULL,  perf_mmap_size(buf_size_shift),
        PROT_READ|PROT_WRITE, MAP_SHARED,
       fd, 0);
       if (mpage == (struct perf_event_mmap_page *)-1L) {
        close(fd);
        return -1;
       }

       ioctl(fd, PERF_EVENT_IOC_RESET, 0);
       ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

       printf("Measuring instruction count for this printf\n");
       long long sum = 0;
       for (long long i = 0; i < 10000000000; i++) {
           sum += i;
           if (i%1000000 == 0)
               cout << i << " : " << sum << endl;
       } 

       ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
       read(fd, &count, sizeof(long long));

       printf("Used %lld cs\n", count);

       close(fd);
}

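With type = PERF_TYPE_SOFTWARE and config = PERF_COUNT_SW_CONTEXT_SWITCHES, the code prints a count of 0 even when context switches are forced, while other metrics work.

When using the mmap ring buffer, I see PERF_RECORD_SWITCH records when reading it, which as I understand it means context-switch events are being recorded.

Any information on how the perf counts and the data in the ring buffer are related is also welcome.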

Answer 1

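The events are not counted because you excluded events from the kernel (exclude_kernel = 1;), and PERF_TYPE_SOFTWARE events are generally provided by the kernel. A context switch is performed by the scheduler in kernel mode, so the filter discards every one of them.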

If you remove exclude_kernel, the events will be counted.
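A minimal sketch of the corrected counting setup, assuming a Linux system where the caller is allowed to count kernel events (for example a permissive perf_event_paranoid setting or the appropriate capability). The helper names make_context_switch_attr and count_context_switches are hypothetical, and usleep is used only as a simple way to make the thread block and get switched out.

```cpp
#include <asm/unistd.h>
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <cstring>

// Hypothetical helper: a counting-only attr for context switches.
// exclude_kernel stays 0 so that events raised in kernel mode are counted.
static perf_event_attr make_context_switch_attr()
{
    perf_event_attr pe;
    std::memset(&pe, 0, sizeof(pe));
    pe.type = PERF_TYPE_SOFTWARE;
    pe.size = sizeof(pe);
    pe.config = PERF_COUNT_SW_CONTEXT_SWITCHES;
    pe.disabled = 1;
    pe.exclude_hv = 1;
    return pe;
}

// Hypothetical helper: counts the context switches of the calling thread
// while it sleeps `naps` times. Returns -1 if the counter cannot be opened
// or read (for example when the kernel refuses access).
static long long count_context_switches(int naps)
{
    perf_event_attr pe = make_context_switch_attr();
    // pid = 0, cpu = -1: measure the calling thread on any CPU.
    int fd = static_cast<int>(
        syscall(__NR_perf_event_open, &pe, 0, -1, -1, 0UL));
    if (fd == -1)
        return -1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    for (int i = 0; i < naps; ++i)
        usleep(1000);  // blocking gives the scheduler a reason to switch
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    long long count = 0;
    const ssize_t n = read(fd, &count, sizeof(count));
    close(fd);
    return n == static_cast<ssize_t>(sizeof(count)) ? count : -1;
}
```

Calling count_context_switches(100) from main and printing the result should give a non-zero number where the question's version printed 0; a result of -1 usually points at a permission problem rather than at the attr itself.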

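The link between the count and the events recorded in the ring buffer is sample_period. Your setting pe.sample_period = 1000; means that for every 1000 switch events, one PERF_RECORD_SAMPLE event is written to the ring buffer.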
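As a rough illustration of that relationship, the number of samples to expect in the buffer is the counter value divided by the sample period, rounded down. expected_samples is a hypothetical helper, and the sketch assumes a fixed sample_period and no records lost to overflow.

```cpp
#include <cstdint>

// Hypothetical helper: number of PERF_RECORD_SAMPLE records expected for a
// given counter value, assuming a fixed sample_period and no lost records.
constexpr std::uint64_t expected_samples(std::uint64_t count,
                                         std::uint64_t sample_period)
{
    // A period of 0 means the event is only counted, never sampled.
    return sample_period == 0 ? 0 : count / sample_period;
}

static_assert(expected_samples(5000, 1000) == 5, "5000 switches, 5 samples");
static_assert(expected_samples(999, 1000) == 0, "below one period, no sample");
```

So a run that counts only a few hundred context switches with sample_period = 1000 would leave no PERF_RECORD_SAMPLE records in the buffer at all.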

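The following example of reading the buffer only illustrates the general approach. In practice, you need to handle events that wrap around the end of the buffer and do more consistency checks.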

   // Snapshot the ring-buffer positions published by the kernel.
   auto tail = mpage->data_tail;
   const auto head = mpage->data_head;
   const auto size = mpage->data_size;
   // The data area starts one page after the metadata page.
   char* data = reinterpret_cast<char*>(mpage) + sysconf(_SC_PAGESIZE);
   int events = 0;
   while (tail < head) {
       // Every record starts with a perf_event_header giving its type and size.
       auto event_header_p = reinterpret_cast<perf_event_header*>(data + (tail % size));
       std::cout << "event type: " << event_header_p->type << ", size: " << event_header_p->size << "\n";
       tail += event_header_p->size;
       events++;
   }

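You should find the corresponding number of events of type PERF_RECORD_SAMPLE == 9 in the buffer (unless an overflow occurred). If you want to read them, you need to cast the pointer to the appropriate struct. The actual layout of a PERF_RECORD_SAMPLE event, or any other event, depends on your perf_event_attr configuration and is documented in the perf_event_open man page.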
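One way to deal with records that wrap around the end of the buffer is to copy each record out of the ring before interpreting it. The sketch below assumes sample_type = PERF_SAMPLE_TID | PERF_SAMPLE_TIME, which is not what the question's code sets, so the struct layout is an illustration only. copy_from_ring, read_sample and SampleTidTime are hypothetical names.

```cpp
#include <linux/perf_event.h>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy `len` bytes starting at logical position `pos`
// out of a ring of `size` bytes, stitching the two pieces together when the
// range crosses the end of the ring.
static void copy_from_ring(const char* ring, std::uint64_t size,
                           std::uint64_t pos, void* out, std::size_t len)
{
    const std::uint64_t start = pos % size;
    const std::size_t first = static_cast<std::size_t>(
        len < size - start ? len : size - start);
    std::memcpy(out, ring + start, first);
    std::memcpy(static_cast<char*>(out) + first, ring, len - first);
}

// Assumed layout of a PERF_RECORD_SAMPLE when
// sample_type == PERF_SAMPLE_TID | PERF_SAMPLE_TIME.
struct SampleTidTime {
    perf_event_header header;
    std::uint32_t pid;
    std::uint32_t tid;
    std::uint64_t time;
};

// Hypothetical helper: read the record at `pos` as a SampleTidTime.
// Returns false if the record is not a sample or has an unexpected size.
static bool read_sample(const char* ring, std::uint64_t size,
                        std::uint64_t pos, SampleTidTime* out)
{
    perf_event_header header;
    copy_from_ring(ring, size, pos, &header, sizeof(header));
    if (header.type != PERF_RECORD_SAMPLE ||
        header.size != sizeof(SampleTidTime))
        return false;
    copy_from_ring(ring, size, pos, out, sizeof(*out));
    return true;
}
```

In the loop above, read_sample(data, size, tail, &sample) would take the place of the direct pointer cast. A complete reader would also write the final tail back to mpage->data_tail, with suitable memory barriers, so the kernel can reuse the space.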

