Vulkan buffer: WorkGroupID does not return the actual value with a large number of elements
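Creating a buffer of pow(2, 24) elements with local_size_x = 64 in the input layout qualifier returns WorkGroupID = 262143, which matches pow(2, 24) / 64 - 1; that is fine, since the IDs are zero-indexed.
However, if we increase the global dimension (the number of elements, i.e. the size of the problem) to, say, pow(2, 25), WorkGroupID returns values for no apparent reason; they don't match the math.
Here are some device limits that I think are relevant: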
maxStorageBufferRange: uint32_t = 4294967295
maxComputeSharedMemorySize: uint32_t = 32768
maxComputeWorkGroupCount: uint32_t[3] = 00000202898A8EC4
maxComputeWorkGroupCount[0]: uint32_t = 65535
maxComputeWorkGroupCount[1]: uint32_t = 65535
maxComputeWorkGroupCount[2]: uint32_t = 65535
maxComputeWorkGroupInvocations: uint32_t = 1024
maxComputeWorkGroupSize: uint32_t[3] = 00000202898A8ED4
maxComputeWorkGroupSize[0]: uint32_t = 1024
maxComputeWorkGroupSize[1]: uint32_t = 1024
maxComputeWorkGroupSize[2]: uint32_t = 1024
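I am not allocating more elements than the device supports.
So after 2 days + 16 hours I still haven't figured out what is happening...
WorkGroupSize, WorkGroupID, LocalInvocationID and GlobalInvocationID all show the same issue once I reach a certain number of elements. No wonder GlobalInvocationID has the same issue, given how it is calculated...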
#version 450
// Size of the local work group is defined through the input layout qualifier
layout(local_size_x = 64, local_size_y = 1, local_size_z = 1) in;
layout(set = 0, binding = 0) buffer deviceBuffer
{
uint x[];
};
void main() {
uint i = gl_GlobalInvocationID.x;
//uint i = gl_WorkGroupID.x * gl_WorkGroupSize.x + gl_LocalInvocationID.x;
//x[i] += x[i];
// Number of work groups dispatched in x (groupCountX passed to vkCmdDispatch)
//x[i] = gl_NumWorkGroups.x;
// Work-group size specified in the input layout qualifier
//x[i] = gl_WorkGroupSize.x;
// Index of the current work group, in [0, gl_NumWorkGroups.x - 1]
x[i] = gl_WorkGroupID.x;
//x[i] = gl_LocalInvocationID.x;
}
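As a sanity check on the indexing used in the shader, here is a minimal host-side C sketch of how the x component of gl_GlobalInvocationID relates to the other built-ins. The function name global_invocation_x is hypothetical and only mirrors the GLSL relation; it is not part of Vulkan.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the GLSL relation for the x dimension:
 *   gl_GlobalInvocationID.x =
 *       gl_WorkGroupID.x * gl_WorkGroupSize.x + gl_LocalInvocationID.x
 * Hypothetical helper for illustration only. */
static uint32_t global_invocation_x(uint32_t work_group_id,
                                    uint32_t work_group_size,
                                    uint32_t local_invocation_id)
{
    /* A local ID is always smaller than the work-group size. */
    assert(local_invocation_id < work_group_size);
    return work_group_id * work_group_size + local_invocation_id;
}
```

With local_size_x = 64, the last invocation of work group 262143 is global_invocation_x(262143, 64, 63), which is 16777215, i.e. pow(2, 24) - 1. The terms are added, not multiplied: a pure product would be zero for every invocation whose local ID is 0.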
maxComputeWorkGroupCount[0]: uint32_t = 65535
maxComputeWorkGroupCount[1]: uint32_t = 65535
maxComputeWorkGroupCount[2]: uint32_t = 65535
vkCmdDispatch has the size x = pow(2, 25), y = 1, z = 1
According to the information you provided, groupCountX = 2^25 = 33554432, but the limit is maxComputeWorkGroupCount[0] = 65535 = 2^16 - 1.
The Vulkan specification's Valid Usage for vkCmdDispatch says:
groupCountX must be less than or equal to VkPhysicalDeviceLimits::maxComputeWorkGroupCount[0]
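The rule can be checked on the host before recording the dispatch. The following is a minimal C sketch, assuming the limit has already been read from VkPhysicalDeviceLimits::maxComputeWorkGroupCount[0] (for example via vkGetPhysicalDeviceProperties) and is passed in as a plain integer. The helper names groups_needed and dispatch_x_is_valid are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of work groups needed to cover element_count items when each
 * group processes local_size items (rounded up). Hypothetical helper. */
static uint64_t groups_needed(uint64_t element_count, uint32_t local_size)
{
    assert(local_size > 0);
    return (element_count + local_size - 1) / local_size;
}

/* True when group_count may be passed as groupCountX, given the device's
 * maxComputeWorkGroupCount[0] value. Hypothetical helper. */
static bool dispatch_x_is_valid(uint64_t group_count, uint32_t max_group_count_x)
{
    return group_count <= max_group_count_x;
}
```

With the limits quoted in the question, dispatch_x_is_valid(33554432, 65535) is false. Note also that groups_needed(1u << 24, 64) is 262144, which is itself above 65535, so a one-group-per-64-elements dispatch of 2^24 elements would exceed the same limit on this device.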
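Violating Valid Usage is undefined behavior. "Undefined behavior" means anything from "everything seemingly working fine" to "your PC collapses into a black hole and destroys this solar system". For all intents and purposes, violating Valid Usage is a logic error in the application code.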
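One common way to stay within the limit, offered here as a sketch rather than as the only fix, is to spread the work groups over the x and y dispatch dimensions and flatten the index again in the shader. The C code below assumes the x limit is passed in as a plain integer; DispatchGrid, split_dispatch and flat_group_index are hypothetical names, not Vulkan API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 2D dispatch size: values for groupCountX and groupCountY. */
typedef struct {
    uint32_t x;
    uint32_t y;
} DispatchGrid;

/* Split total_groups into a grid with x <= max_x and x * y >= total_groups.
 * The caller must still check y against maxComputeWorkGroupCount[1]. */
static DispatchGrid split_dispatch(uint64_t total_groups, uint32_t max_x)
{
    DispatchGrid grid;
    assert(max_x > 0);
    if (total_groups <= max_x) {
        grid.x = (uint32_t)total_groups;
        grid.y = 1;
    } else {
        grid.x = max_x;
        grid.y = (uint32_t)((total_groups + max_x - 1) / max_x);
    }
    return grid;
}

/* Flattened group index, as a shader could compute it:
 *   gl_WorkGroupID.y * gl_NumWorkGroups.x + gl_WorkGroupID.x */
static uint64_t flat_group_index(uint32_t group_x, uint32_t group_y,
                                 uint32_t num_groups_x)
{
    return (uint64_t)group_y * num_groups_x + group_x;
}
```

For 2^25 elements at 64 per group (524288 groups) and a limit of 65535, this gives x = 65535 and y = 9. Because x * y can exceed the number of groups actually needed, the shader has to skip any invocation whose flattened element index is past the end of the buffer.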