RDMA read and write data placement/visibility semantics

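I am trying to get more details on RDMA read and write semantics (especially data placement semantics), and I would like to confirm my understanding with the experts here.

RDMA Read:

Would the data be available/seen in the local buffer once the RDMA Read completion is seen in the completion queue? Is the behavior the same if I am using GPU Direct DMA and the local address maps to GPU memory? Would the data be immediately available in GPU memory once the RDMA Read completion is seen in the completion queue? If it is not immediately available, what operation will ensure it?

RDMA Write with Immediate (or) RDMA Write + Send:

Can the remote host check for the presence of data in its memory after it has seen the Immediate data in the receive queue? And does the expectation/behavior change if the Write is to GPU memory (using GDR)?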

RDMA read. Would the data be available/seen in the local buffer, once the RDMA read completion is seen in the completion queue?

Yes.

Is the behavior the same, if I am using GPU Direct DMA and the local address maps to GPU memory?

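Not necessarily. It is possible that the NIC has sent the data towards the GPU but the GPU has not received it yet, while the RDMA Read completion has already arrived at the CPU. The root cause is PCIe semantics, which allow writes to different destinations (CPU memory vs. GPU memory) to be reordered.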

If it is not immediately available, what operation will ensure it?

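To make sure the data has arrived at the GPU, one may set a flag in CPU memory after the RDMA completion and poll on this flag from GPU code. This works because the PCIe read issued by the GPU will "push" the NIC's DMA writes ahead of it (according to PCIe ordering semantics).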
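The flag handshake can be sketched in CUDA as follows. This is only an illustration under stated assumptions, not a definitive implementation: it assumes a 64-bit platform with unified virtual addressing (so pinned host memory can be mapped into the GPU), the names `wait_then_consume` and `run_flag_handshake_demo` are made up for this example, and a plain `cudaMemcpy` stands in for the RNIC's DMA write so the sketch runs without RDMA hardware. In a real GPUDirect RDMA setup the CPU would set the flag only after polling the RDMA completion from the completion queue.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: bail out with -1 on any CUDA error (no cleanup, sketch only).
#define CUDA_TRY(call) do { if ((call) != cudaSuccess) return -1; } while (0)

// GPU side: spin on a flag that lives in pinned HOST memory, then read the
// payload from DEVICE memory. Each poll of the flag is a read across PCIe.
__global__ void wait_then_consume(volatile unsigned int *ready_flag,
                                  const volatile int *gpu_buf,
                                  int *out)
{
    while (*ready_flag == 0) {
        // busy-wait until the CPU publishes the flag
    }
    *out = gpu_buf[0];  // volatile load, so the value is fetched after the flag
}

// Returns the value the kernel observed, or -1 on a CUDA error.
int run_flag_handshake_demo()
{
    unsigned int *h_flag = nullptr;  // CPU view of the flag
    unsigned int *d_flag = nullptr;  // GPU view of the same flag
    int *gpu_buf = nullptr;          // stands in for the GDR-registered buffer
    int *d_out = nullptr;
    const int payload = 42;
    int result = -1;

    // Flag in pinned, GPU-mapped host memory (assumes UVA is available).
    CUDA_TRY(cudaHostAlloc((void **)&h_flag, sizeof(*h_flag), cudaHostAllocMapped));
    *h_flag = 0;
    CUDA_TRY(cudaHostGetDevicePointer((void **)&d_flag, h_flag, 0));

    CUDA_TRY(cudaMalloc((void **)&gpu_buf, sizeof(int)));
    CUDA_TRY(cudaMalloc((void **)&d_out, sizeof(int)));

    // ASSUMPTION: this copy stands in for the RNIC writing into GPU memory.
    CUDA_TRY(cudaMemcpy(gpu_buf, &payload, sizeof(int), cudaMemcpyHostToDevice));

    // The kernel launch is asynchronous; the kernel waits for the flag.
    wait_then_consume<<<1, 1>>>(d_flag, gpu_buf, d_out);
    CUDA_TRY(cudaGetLastError());

    // A real application would poll the CQ for the RDMA completion here,
    // and only then publish the flag.
    *(volatile unsigned int *)h_flag = 1;

    CUDA_TRY(cudaDeviceSynchronize());
    CUDA_TRY(cudaMemcpy(&result, d_out, sizeof(int), cudaMemcpyDeviceToHost));

    cudaFree(gpu_buf);
    cudaFree(d_out);
    cudaFreeHost(h_flag);
    return result;
}
```

Calling `run_flag_handshake_demo()` from `main` should return the payload value. The design point is that the flag lives in host memory rather than in device memory, so the GPU's poll is a PCIe read, which is what the ordering argument above relies on.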

RDMA Write with Immediate (or) RDMA Write + Send: Can the remote host check for the presence of data in its memory after it has seen the Immediate data in the receive queue? And is the expectation/behavior going to change if the Write is to GPU memory (using GDR)?

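Yes, this works, but GDR suffers from the same issue as above: writes to GPU memory can arrive out of order relative to writes to CPU memory, again because of PCIe ordering semantics. The RNIC cannot control PCIe, so it cannot enforce the "desired" semantics in either case.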

rdma

gpudirect