Where to find documentation for perf events
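The question "cpu cache performance. store misses vs load misses" has no answer explaining where to find documentation for the events listed by perf list.
I could not find it through man perf or perf help list.
I have looked at the Intel 64 and AMD64 event documentation, where events are described in this format: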
Last Level Cache References — Event select 2EH, Umask 4FH
So where is it?
Edit: To be clear, I am looking for documentation of the event list printed by perf list.
The list of predefined perf events, such as branches, cycles, and LLC-load-misses, is documented by the source code of the perf subsystem in the Linux kernel. The list is only partially mapped to the various hardware events of different CPU models and microarchitectures. It will be more useful to use ocperf.py (and toplev.py) from andikleen's pmu-tools (if your CPU is Intel) with event names from Intel's documentation. ocperf is not official, but it is written by an Intel employee and uses the official lists from https://download.01.org/perfmon/ (https://download.01.org/perfmon/readme.txt: "This package contains performance monitoring event lists for Intel processors").
For x86 and x86_64, perf maps these (ancient) predefined/generic names in the arch/x86/events directory. For example, for all Intel Core microarchitectures check arch/x86/events/intel/core.c and search for the microarchitecture by its code name (Core, Core2, NHM=Nehalem, WSM=Westmere, SNB=SandyBridge, IVB=IvyBridge, HSW=HaSWell, BDW=BroaDWell, SKL=SKyLake, SLM=SiLverMont, and others from the lists, and amd). For Skylake there is a structure at line 394 of intel/core.c in 4.15.8, where we can see that the PREFETCH counters are left unmapped for all caches ("not supported"):
static __initconst const u64 skl_hw_cache_event_ids
 [ C(L1D ) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = 0x81d0, /* MEM_INST_RETIRED.ALL_LOADS */
        [ C(RESULT_MISS) ] = 0x151, /* L1D.REPLACEMENT */
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = 0x82d0, /* MEM_INST_RETIRED.ALL_STORES */
        [ C(RESULT_MISS) ] = 0x0,
 ...
 [ C(LL ) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = 0x1b7, /* OFFCORE_RESPONSE */
        [ C(RESULT_MISS) ] = 0x1b7, /* OFFCORE_RESPONSE */
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = 0x1b7, /* OFFCORE_RESPONSE */
        [ C(RESULT_MISS) ] = 0x1b7, /* OFFCORE_RESPONSE */
    },
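As an illustration of how to read the hex constants in a table like this: on x86 the raw perf event code packs the event select into bits 0-7 and the unit mask into bits 8-15, so 0x81d0 is event select D0H with umask 81H, which is how Intel's documentation lists MEM_INST_RETIRED.ALL_LOADS. Below is a minimal sketch of that packing. The names raw_event, raw_event_pack and raw_event_unpack are made up for this example (they are not kernel or perf APIs), and the sketch ignores the remaining control bits such as edge, inv and cmask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type for this example only: the two fields that
 * Intel's event tables list for every event. */
struct raw_event {
    uint8_t event_select; /* bits 0-7 of the raw code */
    uint8_t umask;        /* bits 8-15 of the raw code */
};

/* Pack "Event select XXH, Umask YYH" into a raw event code (0xYYXX). */
uint64_t raw_event_pack(unsigned event_select, unsigned umask)
{
    assert(event_select <= 0xff);
    assert(umask <= 0xff);
    return ((uint64_t)umask << 8) | (uint64_t)event_select;
}

/* Split a raw event code, such as the constants in the table above,
 * back into event select and umask. Higher bits are ignored. */
struct raw_event raw_event_unpack(uint64_t config)
{
    struct raw_event ev;
    ev.event_select = (uint8_t)(config & 0xff);
    ev.umask = (uint8_t)((config >> 8) & 0xff);
    return ev;
}
```

With this, the event from the question (Event select 2EH, Umask 4FH) packs to 0x4f2e, which should be usable as the raw event r4f2e, for example in perf stat -e r4f2e. Going the other way, 0x151 unpacks to event select 51H with umask 01H (L1D.REPLACEMENT).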
An additional structure defines extra flags/masks for events such as OFFCORE_RESPONSE:
static __initconst const u64 skl_hw_cache_extra_regs
 [ C(LL ) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = SKL_DEMAND_READ|
                               SKL_LLC_ACCESS|SKL_ANY_SNOOP,
        [ C(RESULT_MISS) ] = SKL_DEMAND_READ|
                             SKL_L3_MISS|SKL_ANY_SNOOP|
                             SKL_SUPPLIER_NONE,
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = SKL_DEMAND_WRITE|
                               SKL_LLC_ACCESS|SKL_ANY_SNOOP,
        [ C(RESULT_MISS) ] = SKL_DEMAND_WRITE|
                             SKL_L3_MISS|SKL_ANY_SNOOP|
                             SKL_SUPPLIER_NONE,
 [ C(NODE) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = SKL_DEMAND_READ|
                               SKL_L3_MISS_LOCAL_DRAM|SKL_SNOOP_DRAM,
        [ C(RESULT_MISS) ] = SKL_DEMAND_READ|
                             SKL_L3_MISS_REMOTE|SKL_SNOOP_DRAM,
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = SKL_DEMAND_WRITE|
                               SKL_L3_MISS_LOCAL_DRAM|SKL_SNOOP_DRAM,
        [ C(RESULT_MISS) ] = SKL_DEMAND_WRITE|
                             SKL_L3_MISS_REMOTE|SKL_SNOOP_DRAM,
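For completeness, here is a rough sketch of how a generic cache event name selects an entry in tables like these. According to perf_event_open(2), a PERF_TYPE_HW_CACHE event encodes the cache id, the operation and the result in attr.config, and the kernel's C(x) macro is shorthand for PERF_COUNT_HW_CACHE_x, so those three fields are the three table indices. The function name hw_cache_config is my own, hypothetical name; the constants come from <linux/perf_event.h>, so this assumes Linux with the kernel headers installed.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/perf_event.h>

/* Hypothetical helper: build perf_event_attr.config for an event of
 * type PERF_TYPE_HW_CACHE, using the layout from perf_event_open(2):
 *   config = cache_id | (op_id << 8) | (result_id << 16)
 * The same three ids index the [C(cache)][C(op)][C(result)] tables. */
uint64_t hw_cache_config(unsigned cache_id, unsigned op_id, unsigned result_id)
{
    assert(cache_id < PERF_COUNT_HW_CACHE_MAX);
    assert(op_id < PERF_COUNT_HW_CACHE_OP_MAX);
    assert(result_id < PERF_COUNT_HW_CACHE_RESULT_MAX);
    return (uint64_t)cache_id |
           ((uint64_t)op_id << 8) |
           ((uint64_t)result_id << 16);
}
```

For example, LLC-load-misses would be hw_cache_config(PERF_COUNT_HW_CACHE_LL, PERF_COUNT_HW_CACHE_OP_READ, PERF_COUNT_HW_CACHE_RESULT_MISS), which is 0x10002, to be used together with attr.type = PERF_TYPE_HW_CACHE. On Skylake that index triple lands on the [C(LL)][C(OP_READ)][C(RESULT_MISS)] entries quoted above.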