Why does the CUDA core count differ between the NVIDIA Control Panel and deviceQuery?

Question

Q1: Why do I get different information from NVIDIA Control Panel -> System Information and from the deviceQuery sample in the CUDA SDK?

System Information:

deviceQuery output:

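Q2: How can I calculate my GPU's GFLOPS from the deviceQuery data? The most commonly used formula I found is the one mentioned here, which asks for the number of mul-add units and the number of mul units, neither of which I know.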

Max GFLOPS = Cores x SIMDs x ([mul-add] x 2 + [mul] x 1) x clock speed

Answer 1

Q1: It tells you on the line right above...

MapSMtoCores for SM 5.0 is undefined. Default to use 192 Cores/SM

Maxwell, the architecture behind the GeForce 840M, uses 128 "cores" per "SMM":

3 * 128 = 384
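
The arithmetic behind the two numbers can be sketched roughly as follows. This is only an illustration of a lookup with a fallback, not the SDK's actual code: the names `CORES_PER_SM`, `FALLBACK_CORES_PER_SM`, `cores_per_sm`, and `total_cores` are hypothetical. The 3 multiprocessors, the 192 fallback, and the 128 cores for SM 5.0 come from the answer above; the Kepler entries (192 per SM) are included for context.

```python
# Hypothetical table: (major, minor) compute capability -> "cores" per SM.
CORES_PER_SM = {
    (3, 0): 192,  # Kepler
    (3, 5): 192,  # Kepler
}

# Value named in the warning message when the SM version is not in the table.
FALLBACK_CORES_PER_SM = 192


def cores_per_sm(major, minor, table=CORES_PER_SM):
    """Look up cores per SM, falling back when the version is unknown."""
    return table.get((major, minor), FALLBACK_CORES_PER_SM)


def total_cores(sm_count, major, minor, table=CORES_PER_SM):
    """Total core count = number of SMs * cores per SM."""
    return sm_count * cores_per_sm(major, minor, table)


# A table that predates Maxwell falls back to 192 cores per SM for SM 5.0.
outdated = total_cores(3, 5, 0)

# A table that knows about Maxwell (SM 5.0 -> 128) gives the real count.
maxwell_table = {**CORES_PER_SM, (5, 0): 128}
correct = total_cores(3, 5, 0, maxwell_table)

print(outdated, correct)
```

Under these assumptions, the fallback yields 3 * 192 and the updated table yields 3 * 128, which is why a sample built before Maxwell was added can disagree with the control panel.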

Q2: "Cores" * frequency * 2 (because each core can do a multiply + add)
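
As a minimal sketch of that formula, the snippet below computes a theoretical single-precision peak. The function name `peak_gflops` is hypothetical, and the 1.0 GHz clock is a placeholder chosen only to make the arithmetic easy to follow; substitute the GPU clock rate that deviceQuery reports for your card.

```python
def peak_gflops(cores, clock_ghz, flops_per_core_per_cycle=2):
    """Theoretical peak in GFLOPS.

    cores: total number of CUDA "cores"
    clock_ghz: GPU clock in GHz
    flops_per_core_per_cycle: 2, counting one multiply-add as two operations
    """
    return cores * clock_ghz * flops_per_core_per_cycle


# 384 cores is the count from the answer above.
# The 1.0 GHz clock is a placeholder, not a measured value.
example = peak_gflops(cores=384, clock_ghz=1.0)
print(example)
```

This is an upper bound that assumes every core issues a multiply-add on every cycle; real kernels will generally fall short of it.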
