CUDA active warps vs resident warps
Occupancy in CUDA is defined as
occupancy = active_warps / maximum_active_warps
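What is the difference between a resident CUDA warp and an active one?
From my research on the web it seems that a block is resident (i.e. allocated along with its register/shared memory files) on a SM for the entire duration of its execution. Is there a difference with "being active"?
If I have a kernel which uses very few registers and shared memory, does it mean that I can have maximum_active_warps resident blocks and achieve 100% occupancy, since occupancy just depends on the amount of register/shared memory used?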
What is the difference between a resident CUDA warp and an active one?
In this context, presumably nothing.
From my research on the web it seems that a block is resident (i.e. allocated along with its register/shared memory files) on a SM for the entire duration of its execution. Is there a difference with "being active"?
Now you've switched from asking about warps to asking about blocks. But again, in this context, no, you can consider them to be the same.
If I have a kernel which uses very few registers and shared memory..
does it mean that I can have maximum_active_warps resident blocks and
achieve 100% occupancy since occupancy just depends on the amount of
register/shared memory used?
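No, because warps and blocks are not the same thing. As you yourself quoted, occupancy is defined in terms of warps, not blocks. The maximum number of warps per SM is fixed at 48 or 64, depending on your hardware. The maximum number of blocks per SM is fixed at 8, 16, or 32, depending on hardware. These are two independent limits, and they are not the same. Both can affect the effective occupancy that a given kernel can achieve.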
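To make the interaction of the two limits concrete, here is a minimal host-side sketch of the arithmetic in C++. It assumes registers and shared memory are not limiting, which is the case the question describes. The names `SmLimits`, `warps_per_block`, `resident_blocks`, and `theoretical_occupancy` are hypothetical helpers made up for illustration, not part of the CUDA API, and the limit values passed in are only examples of the figures mentioned above.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-SM limits; the real values depend on the GPU.
struct SmLimits {
    int max_warps;   // e.g. 48 or 64
    int max_blocks;  // e.g. 8, 16, or 32
};

constexpr int kWarpSize = 32;

// Warps needed by one block; a partially filled warp still takes a full warp slot.
int warps_per_block(int threads_per_block) {
    assert(threads_per_block > 0);
    return (threads_per_block + kWarpSize - 1) / kWarpSize;
}

// Resident blocks per SM, assuming registers and shared memory are NOT limiting.
// Whichever of the two limits is hit first decides the result.
int resident_blocks(int threads_per_block, SmLimits sm) {
    int by_warps = sm.max_warps / warps_per_block(threads_per_block);
    return std::min(by_warps, sm.max_blocks);
}

// occupancy = active_warps / maximum_active_warps
double theoretical_occupancy(int threads_per_block, SmLimits sm) {
    int active_warps =
        resident_blocks(threads_per_block, sm) * warps_per_block(threads_per_block);
    return static_cast<double>(active_warps) / sm.max_warps;
}
```

With an assumed limit of 64 warps and 16 blocks per SM, a kernel launched with 32 threads per block (one warp each) is capped by the block limit: 16 resident blocks give 16 active warps, so occupancy is 16/64 = 25% even though the kernel uses almost no registers or shared memory. With 128 threads per block, the same 16 blocks supply 64 warps and the result reaches 100%.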