Logging Memory Access Footprint

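I found mtrace via Dr. Clements. It is useful, but it does not work properly in the situation I need it for. I want to log memory accesses to understand the access patterns in different scenarios.

Can anyone share relevant experience? Any suggestions would be appreciated.

Updated 03/13: I am trying to use qemu-mtrace to boot Ubuntu 16.04 with linux-mtrace (3.8.0), but it only prints a few error messages and terminates. I am hoping there is a tool that can record every access.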

$ ./qemu-system-x86_64 -mtrace-enable -mtrace-file mtrace.out -hda ubuntu.img -m 1024
Error: mtrace_entry_ascope (exit, syscall:xx) with no stack tag!
mtrace_entry_register: mtrace_host_addr failed (10)
mtrace_inst_exec: bad call 140734947607728
Aborted (core dumped)

The perf mem tool is implemented for some modern x86/EM64T CPUs (probably Intel only; Ivy Bridge and newer desktop/server CPUs). The man page of perf mem is http://man7.org/linux/man-pages/man1/perf-mem.1.html and the same text is in the kernel docs directory: http://lxr.free-electrons.com/source/tools/perf/Documentation/perf-mem.txt. The text is incomplete; the best documentation is the source: tools/perf/builtin-mem.c and, partially, tools/perf/builtin-report.c. There are no details in https://perf.wiki.kernel.org/index.php/Tutorial

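Unlike qemu-mtrace, it does not log every memory access, only every Nth access, where N is something like 10000 or 100000. In exchange, it runs at native speed with low overhead. Use perf mem record ./program to record the access pattern; try adding -a for system-wide sampling or -C cpulist to sample only some CPU cores. There is no way to log (trace) each and every memory access from inside the system: the tool has to write its records to memory, and those writes would themselves have to be logged, which is infinite recursion with finite memory. There are, however, very expensive proprietary, system-specific external tracing solutions such as JTAG or SDRAM sniffers ($5k or more).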
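To make the difference between sampling and tracing concrete, here is a minimal Python simulation. It is not perf and does not touch real hardware: the access stream is synthetic, the 90/10 hot/cold split and the fixed period are assumptions chosen for illustration, and real PEBS sampling is driven by hardware counters rather than an exact "every Nth" rule. It only shows that keeping every Nth access still finds the hot pages, while rarely touched pages can be missed.

```python
import random
from collections import Counter

PAGE_SIZE = 4096  # assumed page size used to group addresses


def synthetic_accesses(count, seed=0):
    """Yield made-up addresses: about 90% hit a small hot region, 10% a large cold one."""
    rng = random.Random(seed)
    for _ in range(count):
        if rng.random() < 0.9:
            yield 0x10000000 + rng.randrange(4 * PAGE_SIZE)      # hot region: 4 pages
        else:
            yield 0x20000000 + rng.randrange(1024 * PAGE_SIZE)   # cold region: 1024 pages


def sample_every_nth(accesses, period):
    """Keep only every `period`-th access, roughly what a sampling profiler does."""
    for index, address in enumerate(accesses, start=1):
        if index % period == 0:
            yield address


def page_histogram(addresses):
    """Count accesses per page number."""
    return Counter(address // PAGE_SIZE for address in addresses)


if __name__ == "__main__":
    total = 200_000
    full = page_histogram(synthetic_accesses(total))
    sampled = page_histogram(sample_every_nth(synthetic_accesses(total), 1000))
    print("pages seen by a full trace:", len(full))
    print("pages seen by sampling:    ", len(sampled))
    print("hottest sampled pages:     ",
          [hex(page * PAGE_SIZE) for page, _ in sampled.most_common(4)])
```

The sampled histogram is a good guide to where most accesses go, but it is not a complete footprint, which is the trade-off described above.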

The perf mem tool was added around 2013 (Linux kernel version 3.10). Searching LWN for perf mem gives several results: https://lwn.net/Articles/531766/

With this patch, it is possible to sample (not trace) memory accesses (load, store). For loads, the instruction and data addresses are captured along with the latency and data source. For stores, the instruction and data addresses are capture along with limited cache and TLB information.

The current patches implement the feature on Intel processors starting with Nehalem. The patches leverage the PEBS Load Latency and Precise Store mechanisms. Precise Store is present only on Sandy Bridge and Ivy Bridge based processors.

Support for sampling physical addresses was added: https://lwn.net/Articles/555890/ (perf mem --phys-addr -t load rec). There is also the somewhat related c2c perf tool from 2016 "to track down cacheline contention": https://lwn.net/Articles/704125/ with examples at https://joemario.github.io/blog/2016/09/01/c2c-blog/

Some random slides on perf mem:

Some information on decoding the output of perf mem -D report:

 # PID, TID, IP, ADDR, LOCAL WEIGHT, DSRC, SYMBOL
 2054  2054 0xffffffff811186bf 0x016ffffe8fbffc804b0    49 0x68100842 /lib/modules/3.12.23/build/vmlinux:perf_event_aux_ctx

What does "ADDR", "DSRC", "SYMBOL" mean?

(Answered by the same user as this answer)

  • IP - PC of the load/store instruction;
  • SYMBOL - name of function, containing this instruction (IP);
  • ADDR - virtual memory address of data, requested by load/store (if there was no --phys-data option)
  • DSRC - "Decoded Source".
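The field descriptions above can be turned into a small parser. The following Python sketch is my own illustration, not part of perf. It assumes the whitespace-separated layout of the sample line shown earlier, and it assumes that DSRC follows the bit layout of union perf_mem_data_src in include/uapi/linux/perf_event.h (mem_op 5 bits, mem_lvl 14, mem_snoop 5, mem_lock 2, mem_dtlb 7) as I recall it; check the flag tables against your own kernel headers before relying on them. The names Sample, parse_line and decode_dsrc are made up for this example.

```python
from typing import NamedTuple

# Flag tables assumed from union perf_mem_data_src in
# include/uapi/linux/perf_event.h; verify against your kernel headers.
MEM_OP = {0x01: "N/A", 0x02: "LOAD", 0x04: "STORE", 0x08: "PFETCH", 0x10: "EXEC"}
MEM_LVL = {0x01: "N/A", 0x02: "HIT", 0x04: "MISS", 0x08: "L1", 0x10: "LFB",
           0x20: "L2", 0x40: "L3", 0x80: "LOC_RAM", 0x100: "REM_RAM1",
           0x200: "REM_RAM2", 0x400: "REM_CCE1", 0x800: "REM_CCE2",
           0x1000: "IO", 0x2000: "UNC"}
MEM_SNOOP = {0x01: "N/A", 0x02: "NONE", 0x04: "HIT", 0x08: "MISS", 0x10: "HITM"}
MEM_LOCK = {0x01: "N/A", 0x02: "LOCKED"}
MEM_TLB = {0x01: "N/A", 0x02: "HIT", 0x04: "MISS", 0x08: "L1", 0x10: "L2",
           0x20: "WK", 0x40: "OS"}

# (field name, shift, width in bits, flag table)
FIELDS = [("op", 0, 5, MEM_OP), ("lvl", 5, 14, MEM_LVL),
          ("snoop", 19, 5, MEM_SNOOP), ("lock", 24, 2, MEM_LOCK),
          ("tlb", 26, 7, MEM_TLB)]


class Sample(NamedTuple):
    pid: int
    tid: int
    ip: int      # address of the load/store instruction
    addr: int    # data address requested by the instruction
    weight: int  # LOCAL WEIGHT column
    dsrc: int    # raw data-source bits
    symbol: str


def parse_line(line):
    """Parse one data line of the dump; return None for comment or blank lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    pid, tid, ip, addr, weight, dsrc, symbol = line.split(None, 6)
    return Sample(int(pid), int(tid), int(ip, 16), int(addr, 16),
                  int(weight), int(dsrc, 16), symbol)


def decode_dsrc(dsrc):
    """Split a DSRC value into one list of flag names per bit field."""
    decoded = {}
    for name, shift, width, table in FIELDS:
        bits = (dsrc >> shift) & ((1 << width) - 1)
        decoded[name] = [label for flag, label in table.items() if bits & flag]
    return decoded


if __name__ == "__main__":
    line = (" 2054  2054 0xffffffff811186bf 0x016ffffe8fbffc804b0    49 "
            "0x68100842 /lib/modules/3.12.23/build/vmlinux:perf_event_aux_ctx")
    sample = parse_line(line)
    print(sample.symbol, decode_dsrc(sample.dsrc))
```

Under the assumed layout, the DSRC value 0x68100842 from the sample line decodes as a load that hit in L3, with no snoop, no lock flag, and a TLB hit.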

There is also sorting to get some basic statistics: perf mem rep --sort=mem - http://thread.gmane.org/gmane.linux.kernel.perf.user/1438
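The --sort option aggregates inside perf. If you would rather post-process the raw dump yourself to approximate a footprint, a sketch like the following may help. It is an illustration only, not what --sort=mem computes: it assumes the whitespace-separated column layout of the sample line shown earlier (ADDR as the fourth field), 4 KiB pages and 64-byte cache lines, and the function names are made up.

```python
from collections import Counter

PAGE_SHIFT = 12       # assumes 4 KiB pages
CACHELINE_SHIFT = 6   # assumes 64-byte cache lines


def sampled_addresses(lines):
    """Yield the ADDR column (4th field) from data lines of the raw dump."""
    for line in lines:
        fields = line.split()
        # Skip blank lines, the '#' header, and anything with too few columns.
        if len(fields) < 7 or fields[0].startswith("#"):
            continue
        yield int(fields[3], 16)


def footprint(addresses, shift):
    """Count samples per block of 2**shift bytes (page or cache line)."""
    return Counter(address >> shift for address in addresses)


if __name__ == "__main__":
    # Hypothetical dump lines, made up for this example.
    lines = [
        "# PID, TID, IP, ADDR, LOCAL WEIGHT, DSRC, SYMBOL",
        "1 1 0x400000 0x1000 10 0x68100842 a.out:f",
        "1 1 0x400004 0x1040 10 0x68100842 a.out:f",
        "1 1 0x400008 0x2000 10 0x68100842 a.out:g",
    ]
    addresses = list(sampled_addresses(lines))
    for page, count in footprint(addresses, PAGE_SHIFT).most_common():
        print(f"page {page << PAGE_SHIFT:#x}: {count} samples")
```

Because the input is sampled, the counts are relative weights rather than exact access counts.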

Other tools: there is the (slow) valgrind-based cachegrind emulator, which simulates the CPU cache for userspace programs; see "7.2 Simulating CPU Caches".