In DX12, what ordering guarantees do multiple ExecuteCommandLists calls provide?
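Assume a single-threaded application. If you call ExecuteCommandLists twice (A and B), is A guaranteed to execute all of its commands on the GPU before any command from B starts? The closest thing I could find in the documentation is this, but it does not really seem to guarantee that A finishes before B starts: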
Applications can submit command lists to any command queue from multiple threads. The runtime will perform the work of serializing these requests in the order of submission.
As a point of comparison, I know that this is explicitly not guaranteed in Vulkan:
vkQueueSubmit is a queue submission command, with each batch defined by an element of pSubmits as an instance of the VkSubmitInfo structure. Batches begin execution in the order they appear in pSubmits, but may complete out of order.
However, I'm not sure whether DX12 works the same way.
The command lists are executed in order starting with the first array element
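However, in that context he is talking about calling ExecuteCommandLists once with two command lists (C and D). Does that behave the same as two separate calls? My colleague argues that this still only guarantees that they start in order, not that C finishes before D starts.
Is there clearer documentation that I'm missing?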
I asked the same question on the DirectX forums, and here is the reply from Microsoft engineer Jesse Natalie:
Calling ExecuteCommandLists twice guarantees that the first workload
(A) finishes before the second workload (B). Calling
ExecuteCommandLists with two command lists allows the driver to merge
the two command lists such that the second command list (D) may begin
executing work before all work from the first (C) has finished.
Specifically, the application is allowed to insert a fence signal or
wait between A and B, and the driver has no visibility into this, so
the driver must ensure that everything in A is complete before the
fence operation. There is no such opportunity in a single call to the
API, so the driver can optimize that scenario.
Finally the ID3D12CommandQueue is a first-in first-out queue, that stores the correct order of the command lists for submission to the GPU. Only when one command list has completed execution on the GPU, will the next command list from the queue be submitted by the driver.
https://docs.microsoft.com/en-us/windows/win32/direct3d12/porting-from-direct3d-11-to-direct3d-12
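Adding multiple command list submissions just messed up the ordering of my upload buffer, so there is no guarantee on the order in which command lists complete: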
copy data1 to mappedPtr1
call compute shader in commandList1
execute CommandList1
copy data2 to mappedPtr1
call compute shader in commandList2
execute CommandList2
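One possible reading of the sequence above, assuming both command lists read from the same mapped upload buffer (the steps write to mappedPtr1 both times): ExecuteCommandLists returns to the CPU before the GPU has run the work, so the second CPU copy can overwrite data1 before commandList1 ever reads it, regardless of the order in which the command lists complete on the GPU. Below is a minimal sketch of that timing in plain C++. It is a toy model, not D3D12 code: `ToyQueue`, `Flush`, and `RunSequence` are hypothetical names, and the queue defers all work until `Flush`, which is the worst case of what is a race on real hardware.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Toy stand-in for a GPU queue: work is submitted now and runs later, in FIFO order.
// These are NOT D3D12 types; they only model when the work actually executes.
struct ToyQueue {
    std::deque<std::function<void()>> pending;

    // Models a submission: returns immediately, nothing has executed yet.
    void Execute(std::function<void()> work) { pending.push_back(std::move(work)); }

    // Models a CPU-side wait until everything submitted so far has finished.
    void Flush() {
        while (!pending.empty()) {
            pending.front()();
            pending.pop_front();
        }
    }
};

// Replays the steps from the post and returns what each "compute shader" read.
std::vector<int> RunSequence(bool waitBeforeOverwrite) {
    int uploadBuffer = 0;      // plays the role of the memory behind mappedPtr1
    std::vector<int> seen;     // values observed by the deferred work
    ToyQueue queue;

    uploadBuffer = 1;                                       // copy data1 to mappedPtr1
    queue.Execute([&] { seen.push_back(uploadBuffer); });   // execute CommandList1
    if (waitBeforeOverwrite) queue.Flush();                 // wait before reusing the buffer
    uploadBuffer = 2;                                       // copy data2 to mappedPtr1
    queue.Execute([&] { seen.push_back(uploadBuffer); });   // execute CommandList2
    queue.Flush();
    return seen;
}
```

In this model `RunSequence(false)` yields 2, 2 (the first submission reads the overwritten data) and `RunSequence(true)` yields 1, 2, even though the work runs strictly in submission order in both cases. In real D3D12 code the wait would typically be an ID3D12Fence signaled on the queue after the first submission, or it could be avoided by giving each submission its own region of the upload buffer.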