Ubuntu - how to tell if AVX or SSE is currently being used by a CPU app?
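I currently run BOINC across multiple servers that have GPUs.
The servers run both GPU and CPU BOINC applications.
Because AVX and SSE lower the CPU frequency when a CPU application uses them, I have to be selective about which CPU and GPU apps I run together: some GPU apps get bottlenecked (they take longer to complete), while others do not.
At the moment some CPU apps are named so that it is clear whether they use AVX, but most are not.
So, is there a command I can run, or some other way to check, whether any currently running CPU app is using AVX or SSE (any version)?
On a side note, should I treat any use of FMA the same way (i.e. does it also slow the CPU frequency because of increased CPU temperature)?
Thanks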
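You can use perf top to see, in real time, the number of AVX and SSE instructions being executed, along with the names of the executables and shared libraries that execute them: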
perf top -e fp_arith_inst_retired.128b_packed_single -e fp_arith_inst_retired.128b_packed_double -e fp_arith_inst_retired.256b_packed_single -e fp_arith_inst_retired.256b_packed_double
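If you would rather sample one BOINC process at a time than watch perf top, something along these lines might work. It is only a sketch: the helper names (build_perf_stat_cmd, sample_pid) are made up for illustration, it reuses the same four Intel-specific events as above, and it assumes perf stat is installed and that you have permission to attach to the target PID (this may need root or a relaxed perf_event_paranoid setting).

```python
import subprocess

# Same packed floating-point events as in the perf top command above.
EVENTS = [
    "fp_arith_inst_retired.128b_packed_single",
    "fp_arith_inst_retired.128b_packed_double",
    "fp_arith_inst_retired.256b_packed_single",
    "fp_arith_inst_retired.256b_packed_double",
]


def build_perf_stat_cmd(pid, seconds=10, events=EVENTS):
    """Build a `perf stat` command line that counts the events for one PID.

    `sleep <seconds>` is only used as a timer: perf stat counts on the
    attached PID until the sleep command exits.
    """
    return [
        "perf", "stat",
        "-x,",                      # CSV output, comma separated
        "-e", ",".join(events),     # events to count
        "-p", str(pid),             # attach to an existing process
        "--", "sleep", str(seconds),
    ]


def sample_pid(pid, seconds=10):
    """Run perf stat against a PID and return its raw CSV text.

    perf stat writes its counters to stderr by default, so that is
    the stream returned here.
    """
    result = subprocess.run(
        build_perf_stat_cmd(pid, seconds),
        capture_output=True, text=True, check=False,
    )
    return result.stderr
```

For example, print(sample_pid(12345)) would print the counts collected over ten seconds for the (hypothetical) PID 12345. Non-zero 256b counts mean that process retired 256-bit packed floating-point instructions during the sample.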
Counter descriptions (from perf list output on an Intel Coffee Lake CPU):
floating point:
fp_arith_inst_retired.128b_packed_double
[Number of SSE/AVX computational 128-bit packed double precision floating-point instructions retired. Each count represents 2 computations. Applies to SSE* and AVX*
packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform
multiple calculations per element]
fp_arith_inst_retired.128b_packed_single
[Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired. Each count represents 4 computations. Applies to SSE* and AVX*
packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they
perform multiple calculations per element]
fp_arith_inst_retired.256b_packed_double
[Number of SSE/AVX computational 256-bit packed double precision floating-point instructions retired. Each count represents 4 computations. Applies to SSE* and AVX*
packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they perform
multiple calculations per element]
fp_arith_inst_retired.256b_packed_single
[Number of SSE/AVX computational 256-bit packed single precision floating-point instructions retired. Each count represents 8 computations. Applies to SSE* and AVX*
packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instructions count twice as they
perform multiple calculations per element]
fp_arith_inst_retired.scalar_double
[Number of SSE/AVX computational scalar double precision floating-point instructions retired. Each count represents 1 computation. Applies to SSE* and AVX* scalar double
precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element]
fp_arith_inst_retired.scalar_single
[Number of SSE/AVX computational scalar single precision floating-point instructions retired. Each count represents 1 computation. Applies to SSE* and AVX* scalar single
precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations
per element]
fp_assist.any
[Cycles with any input/output SSE or FP assist]
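Going by the "Each count represents N computations" wording above, raw counts can be turned into a rough number of floating-point computations and a share done at 256-bit width. The sketch below is not authoritative. It assumes the perf stat -x, CSV layout with the counter value in the first field and the event name in the third, and parse_perf_csv and summarize are hypothetical helper names. Note also that the 128b events cover both SSE and 128-bit AVX instructions, so they give the vector width, not the instruction set.

```python
# Computations represented by one count of each event, copied from the
# "Each count represents N computations" text in the perf list output.
OPS_PER_COUNT = {
    "fp_arith_inst_retired.scalar_single": 1,
    "fp_arith_inst_retired.scalar_double": 1,
    "fp_arith_inst_retired.128b_packed_single": 4,
    "fp_arith_inst_retired.128b_packed_double": 2,
    "fp_arith_inst_retired.256b_packed_single": 8,
    "fp_arith_inst_retired.256b_packed_double": 4,
}


def parse_perf_csv(text):
    """Parse `perf stat -x,` output into {event_name: count}.

    Assumes field 0 is the counter value and field 2 is the event name.
    Rows without a numeric value (blank lines, "<not counted>") are skipped.
    """
    counts = {}
    for line in text.splitlines():
        fields = line.strip().split(",")
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        counts[fields[2]] = int(fields[0])
    return counts


def summarize(counts):
    """Return total computations and the share done by 256-bit packed events."""
    ops = {ev: counts.get(ev, 0) * n for ev, n in OPS_PER_COUNT.items()}
    total = sum(ops.values())
    wide = sum(v for ev, v in ops.items() if ".256b_" in ev)
    return {"total_ops": total, "share_256b": wide / total if total else 0.0}
```

A share_256b close to zero would suggest the app does little or no 256-bit floating-point arithmetic. Integer vector instructions and loads/stores are not covered by these events, so this is only a partial picture.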