"Synchronizing" a render pass layout transition with a semaphore in Acquire-Present scenario in Vulkan

So we have this official example, from https://github.com/KhronosGroup/Vulkan-Docs/wiki/Synchronization-Examples#combined-graphicspresent-queue:

/* Only need a dependency coming in to ensure that the first
   layout transition happens at the right time.
   Second external dependency is implied by having a different
   finalLayout and subpass layout. */
VkSubpassDependency dependency = {
    .srcSubpass = VK_SUBPASS_EXTERNAL,
    .dstSubpass = 0,
    // .srcStageMask needs to be a part of pWaitDstStageMask in the WSI semaphore.
    .srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    .dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    .srcAccessMask = 0,
    .dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
    .dependencyFlags = 0};

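Could someone please provide me with the relevant sections in the specification that, combined, guarantee (constitute a chain of reasoning) that the layout transition of this dependency will happen only after the semaphore the queue waits on (for the acquired image) has been signaled?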

Particularly I can't find how to interpret this "dependency from that same stage to itself".

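To be clear: I have found many places that seem relevant here. I have been reading the documentation for over a month, but I am struggling to find the coherence in it.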

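For example, when (according to the specification) does an availability operation actually happen? Is it when the related memory dependency operation is submitted (as in submission order)? If so, when is a subpass dependency submitted? Or is it somewhere between the source scope commands and the destination scope commands (as in "If srcSubpass is equal to VK_SUBPASS_EXTERNAL, the first synchronization scope includes commands that occur earlier in submission order than the vkCmdBeginRenderPass")? If so, which commands does srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT in the example above refer to?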

Edit after krOoze's answer

I thought I would write it here. First, it is too long for a comment; second, I believe it may be useful to others.

I admit I misunderstood the part of the specification about execution dependency chains.

To summarize: to define the relevant mechanism according to the specification, we have the following:

  1. The semaphore wait operation happens-before the subpass dependency operation (this is where I actually have some trouble):

    6.4.2. Semaphore Waiting*
    The semaphore wait operation happens-after the first set of operations in the execution dependency, and happens-before the second set of operations in the execution dependency.

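    But how do we determine that our subpass dependency operation is in the second set? It is in the same batch, no submission order is defined for subpass dependencies (at least I can't see one), and the definition of the semaphore's second synchronization scope does not help, because our subpass dependency does not happen at the VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT pipeline stage (which, in the case of vkQueueSubmit, is what the second synchronization scope is limited to). What's more, the synchronization scope does not define the second set of operations; that is a distinct term. But I found a statement that may help here (well, if we agree that the subpass dependency is part of a work item):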

    4.3.5. Queue Submission
    Each batch consists of three distinct parts:

    1. Zero or more semaphores to wait on before execution of the rest of the batch.
    2. Zero or more work items to execute.
    3. Zero or more semaphores to signal upon completion of the work items.

    We need to establish this order to build the execution dependency chain:

  2. The semaphore wait and the subpass dependency form an execution dependency chain, according to:

    6.1. Execution and Memory Dependencies
    An execution dependency chain is a sequence of execution dependencies that form a happens-before relation between the first dependency’s A' and the final dependency’s B'. For each consecutive pair of execution dependencies, a chain exists if the intersection of BS in the first dependency and AS in the second dependency is not an empty set.

    (see krOoze's answer for details)

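    From this we know that the destination scope of our subpass dependency will happen after the semaphore signal (the semaphore signal operation is in the source scope of the semaphore wait operation).
    Now we should be fine with the layout transition rules: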

  3. The layout transition happens after the availability operation of our subpass dependency:

    7.1. Render Pass Creation
    Automatic layout transitions away from initialLayout happens-after the availability operations for all dependencies with a srcSubpass equal to VK_SUBPASS_EXTERNAL, where dstSubpass uses the attachment that will be transitioned.

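    Honestly, I am still missing the part of the specification that orders the semaphore signal against the availability operation, but I think it can be assumed.
    (The above works because an availability operation is part of a memory dependency operation: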

    An operation that performs a memory dependency generates:
    • An availability operation with source scope of all writes in the first access scope of the dependency and a destination scope of the device domain.

    Well, our first access scope is empty, but it is still an availability operation, right?)

There is also this statement:

For attachments however, subpass dependencies work more like a VkImageMemoryBarrier defined similarly to the VkMemoryBarrier above, the queue family indices set to VK_QUEUE_FAMILY_IGNORED, and layouts as follows:
• The equivalent to oldLayout is the attachment’s layout according to the subpass description for srcSubpass.
• The equivalent to newLayout is the attachment’s layout according to the subpass description for dstSubpass.

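...which opens up yet another area for analysis, but I already have a headache. I will gladly edit this further once I get some comments on the thoughts above.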

*All specification quotes are from "Vulkan® 1.2.132 - A Specification (with all registered Vulkan extensions)"

I made a short recap at krOoze/Hello_Triangle/doc. What is supposed to happen in this case is:

Particularly I can't find how to interpret this "dependency from that same stage to itself".

Now, let's get this out of the way first. This is what I like to call the cart-before-horse intuition of the synchronization system.

You do not "synchronize stages" or anything like that. That intuition will only confuse you. You synchronize scopes.

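People also confuse a pipeline with a flowchart. There is a huge difference in intuition. In a flowchart, you start at the beginning, go through all the stages in order, and then you are done, finished forever. That is not what a pipeline is. It never starts, and it never ends. A pipeline just is. It is like a board-game board. You feed commands through the pipeline, and they move through the stages like pegs on a board.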

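A synchronization command is something that introduces a dependency between two things: the source synchronization scope and the destination synchronization scope. It guarantees that the src scope happens-before the dst scope.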

A scope is some subset of queue operations, together with the pipeline stages they may currently be executing in.

So, with this better intuition,

    .srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    .dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,

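is a perfectly normal thing. It means that the commands in the source scope (for a barrier, the commands recorded earlier, or more formally, those "earlier in submission order") complete the COLOR_ATTACHMENT_OUTPUT stage before any command in the destination scope begins the COLOR_ATTACHMENT_OUTPUT stage. (In contrast, having no dependency means that any command can be in any stage of execution at any given time.)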

For example when (according to the specification) an availability operation does happen.

These are, in a way, inserted into the dependency that the barrier defines. Assuming your barrier includes a memory dependency:

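The availability operation (if any) happens after the source synchronization scope. Then the layout transition happens (if any). Then the visibility operation happens (if any). Only after that can the destination synchronization scope execute.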
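The ordering above can be written down as a small sketch. The enum and function names are hypothetical and only model the sequence of steps inside one dependency; they are not part of the Vulkan API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the steps inside one dependency; not Vulkan API. */
typedef enum {
    STEP_SRC_SCOPE,
    STEP_AVAILABILITY,
    STEP_LAYOUT_TRANSITION,
    STEP_VISIBILITY,
    STEP_DST_SCOPE
} DependencyStep;

/* Writes the steps in happens-before order into out (room for 5 entries)
   and returns how many were written. Optional steps are skipped. */
size_t dependency_order(bool has_availability, bool has_transition,
                        bool has_visibility, DependencyStep out[5])
{
    size_t n = 0;
    out[n++] = STEP_SRC_SCOPE;                 /* always first */
    if (has_availability) out[n++] = STEP_AVAILABILITY;
    if (has_transition)   out[n++] = STEP_LAYOUT_TRANSITION;
    if (has_visibility)   out[n++] = STEP_VISIBILITY;
    out[n++] = STEP_DST_SCOPE;                 /* always last */
    return n;
}
```

Whatever combination of optional steps is present, the layout transition always sits between the source scope and the destination scope.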

Could someone please provide me with relevant sections in the specification that combined guarantee (constitute a chain of reasoning)

I just want to pat you on the head right now for wanting authoritative information... :D

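So, you need to understand the formalism and the terminology. It is what all the synchronization primitives are described in terms of. It is only one page, but relatively hard to read. I tried to explain the important parts above. I won't quote it here; it is the chapter 6.1. Execution and Memory Dependencies.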

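Now, semaphore wait has its own chapter. It is important to note that it behaves slightly differently for vkQueueSubmit than for other commands (which is annoying). Anyway (6.4.2. Semaphore Waiting):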

The second synchronization scope includes every command submitted in the same batch. In the case of vkQueueSubmit, the second synchronization scope is limited to operations on the pipeline stages determined by the destination stage mask specified by the corresponding element of pWaitDstStageMask. Also, in the case of vkQueueSubmit, the second synchronization scope additionally includes all commands that occur later in submission order.

The second access scope includes all memory access performed by the device.

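A batch (for vkQueueSubmit) is a single VkSubmitInfo. Submission order also has its own chapter; basically it means "all other batches later in the submit array, and any future vkQueueSubmit on the same queue".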

So, this means: "if you wait on a semaphore, all commands in the VkSubmitInfo can reach the pWaitDstStageMask stage only after the semaphore is signaled".
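As a rough illustration of that rule, the sketch below models which stages of the batch's commands are held back by the wait. The stage list is a deliberately simplified, hypothetical one (real Vulkan has many more stages, and pWaitDstStageMask is a bit mask rather than a single stage); it relies only on the fact that the second synchronization scope covers the named stage and every logically later stage.

```c
#include <stdbool.h>

/* Simplified, hypothetical stage list in logical pipeline order.
   Real Vulkan has many more stages; this is only an illustration. */
typedef enum {
    STAGE_TOP_OF_PIPE,
    STAGE_VERTEX_SHADER,
    STAGE_FRAGMENT_SHADER,
    STAGE_COLOR_ATTACHMENT_OUTPUT,
    STAGE_BOTTOM_OF_PIPE
} Stage;

/* True if a command in the waiting batch must not execute `stage` until the
   semaphore is signaled: the wait covers the wait stage itself and every
   logically later stage, and leaves earlier stages free to run. */
bool stage_waits_for_semaphore(Stage wait_dst_stage, Stage stage)
{
    return stage >= wait_dst_stage;
}
```

With a wait stage of STAGE_COLOR_ATTACHMENT_OUTPUT, vertex work in this model may start before the acquired image is ready, while color attachment output may not.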

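Now it is important to understand what a Render Pass does. Besides recorded commands, it has other "synchronizables": automatic layout transitions, load operations, and store operations.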

Automatic layout transitions:

Automatic layout transitions away from initialLayout happens-after the availability operations for all dependencies with a srcSubpass equal to VK_SUBPASS_EXTERNAL, where dstSubpass uses the attachment that will be transitioned

Automatic layout transitions into the layout used in a subpass happen-before the visibility operations for all dependencies with that subpass as the dstSubpass.

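So, simply put, the layout transition is sneaked into the dependency defined by the VkSubpassDependency you listed. It happens-after .srcStageMask + .srcAccessMask, and it happens-before the .dstSubpass's .dstStageMask + .dstAccessMask.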

Load operations:

The load operation for each sample in an attachment happens-before any recorded command which accesses the sample in the first subpass where the attachment is used. [...] Load operations for attachments with a color format execute in the VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT pipeline stage.

VK_ATTACHMENT_LOAD_OP_LOAD [...] For attachments with a color format, this uses the access type VK_ACCESS_COLOR_ATTACHMENT_READ_BIT.

VK_ATTACHMENT_LOAD_OP_CLEAR (or VK_ATTACHMENT_LOAD_OP_DONT_CARE) [...] For attachments with a color format, this uses the access type VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT.

The load operation happens as part of the first subpass that uses the attachment (your .dstSubpass). The above unambiguously determines your .dstStageMask and .dstAccessMask.

Now it is our turn to choose pWaitDstStageMask, .srcStageMask, and .srcAccessMask. You listed pWaitDstStageMask = COLOR_ATTACHMENT_OUTPUT, .srcStageMask = COLOR_ATTACHMENT_OUTPUT, and .srcAccessMask = 0.

The semaphore wait operation must happen before the VkSubpassDependency. This is specified as a dependency chain:

An execution dependency chain is a sequence of execution dependencies that form a happens-before relation between the first dependency’s A' and the final dependency’s B'. For each consecutive pair of execution dependencies, a chain exists if the intersection of BS in the first dependency and AS in the second dependency is not an empty set.

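I.e. two subsequent synchronization primitives also synchronize with each other, which makes the relation transitive. Our A' here is the semaphore signal, and our B' is the dst scope of the VkSubpassDependency. Our BS here is the semaphore's dst scope, i.e. pWaitDstStageMask. And our AS is the src scope of our VkSubpassDependency.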

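So the intersection of our pWaitDstStageMask and .srcStageMask is still COLOR_ATTACHMENT_OUTPUT. Thus a dependency chain is formed, guaranteeing that the semaphore signal happens-before the COLOR_ATTACHMENT_OUTPUT stage of the commands in subpass 0 of the render pass.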
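The intersection test can be sketched in a few lines. This is a simplified model under stated assumptions: the stage bits below are hypothetical stand-ins ordered from logically earliest to latest (they are not the real Vulkan values), and the scopes are expanded the way the specification describes, with a destination mask covering logically later stages and a source mask covering logically earlier ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stage bits, lowest bit = logically earliest stage.
   These are NOT the real Vulkan enum values. */
#define STAGE_TOP_OF_PIPE             0x01u
#define STAGE_VERTEX_SHADER           0x02u
#define STAGE_FRAGMENT_SHADER         0x04u
#define STAGE_COLOR_ATTACHMENT_OUTPUT 0x08u
#define STAGE_BOTTOM_OF_PIPE          0x10u
#define STAGE_ALL                     0x1Fu

/* BS: stages named in a destination mask plus all logically later stages. */
uint32_t second_scope(uint32_t dst_stage_mask)
{
    uint32_t scope = 0;
    for (uint32_t bit = STAGE_TOP_OF_PIPE; bit <= STAGE_BOTTOM_OF_PIPE; bit <<= 1)
        if (dst_stage_mask & (bit | (bit - 1u)))      /* a named stage at or before bit */
            scope |= bit;
    return scope;
}

/* AS: stages named in a source mask plus all logically earlier stages. */
uint32_t first_scope(uint32_t src_stage_mask)
{
    uint32_t scope = 0;
    for (uint32_t bit = STAGE_TOP_OF_PIPE; bit <= STAGE_BOTTOM_OF_PIPE; bit <<= 1)
        if (src_stage_mask & ~(bit - 1u) & STAGE_ALL) /* a named stage at or after bit */
            scope |= bit;
    return scope;
}

/* A chain exists if BS of the semaphore wait intersects AS of the subpass dependency. */
bool forms_dependency_chain(uint32_t wait_dst_stage_mask, uint32_t src_stage_mask)
{
    return (second_scope(wait_dst_stage_mask) & first_scope(src_stage_mask)) != 0;
}
```

In this model, waiting at STAGE_COLOR_ATTACHMENT_OUTPUT chains with a source mask of STAGE_COLOR_ATTACHMENT_OUTPUT, but not with a source mask of only STAGE_TOP_OF_PIPE, because the two scopes then share no stage.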

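Now, putting it all together: the semaphore signal from vkAcquireNextImage makes the swapchain image available from the presentation engine's reads. The semaphore wait in vkQueueSubmit makes the swapchain image visible to all commands in the batch, limited to COLOR_ATTACHMENT_OUTPUT. The VkSubpassDependency chains to that semaphore wait. The image is still visible, so no additional memory dependency is needed, hence our .srcAccessMask is 0. The layout transition writes the image and makes it (implicitly) available from the layout transition and visible to whatever .dst* was provided to the VkSubpassDependency.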

All specification quotes are from "Vulkan® 1.2.169 - A Specification (with all registered Vulkan extensions)".

Q:

Particularly I can't find how to interpret this "dependency from that same stage to itself".

A:

An example is provided in the note section of 7.4.2. Semaphore Waiting.

If an image layout transition needs to be performed on a presentable image before it is used in a framebuffer, that can be performed as the first operation submitted to the queue after acquiring the image, and should not prevent other work from overlapping with the presentation operation. For example, a VkImageMemoryBarrier could use:

  • srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT
  • srcAccessMask = 0
  • dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT
  • dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT.
  • oldLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR
  • newLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL

This applies because subpass dependencies work like a VkImageMemoryBarrier, according to the note section on VkSubpassDependency:

For non-attachment resources, the memory dependency expressed by subpass dependency is nearly identical to that of a VkMemoryBarrier ...

For attachments however, subpass dependencies work more like a VkImageMemoryBarrier defined similarly to the VkMemoryBarrier above ...

The further note on the example answers part of the question.

This barrier accomplishes a dependency chain between previous presentation operations and subsequent color attachment output operations, with the layout transition performed in between, and does not introduce a dependency between previous work and any vertex processing stages. More precisely, the semaphore signals after the presentation operation completes, the semaphore wait stalls the VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT stage, and there is a dependency from that same stage to itself with the layout transition performed in between.