io_uring user_data field is always zero

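I'm looking at io_uring (https://kernel.dk/io_uring.pdf) to see whether it can be used for asynchronous file I/O for logging. This is a simple program that opens a file, stats the file, and then reads the first 4k from the file. The program runs to completion successfully when the file exists and is readable. But the user_data field in the completion queue entry is always zero. The documentation for io_uring says: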

user_data is common across op-codes, and is untouched by the kernel. It's simply copied to the completion event, cqe, when a completion event is posted for this request.

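Since completions are not ordered, the user_data field is needed to match completions with submissions. If the field is always zero then how can it be used?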

#include <iostream>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <liburing.h>
#include <stdlib.h>

int main() {
  struct io_uring ring;
  // see man io_uring_setup for what this does
  auto ret = io_uring_queue_init(64, &ring, 0);

  if (ret) {
    perror("Failed initialize uring.");
    exit(1);
  }

  std::cout << "I/O uring initialized successfully. " << std::endl;

  auto directory_fd = open("/tmp", O_RDONLY);
  if (directory_fd < 0) {
    perror("Failed to open current directory.");
    exit(1);
  }

  struct io_uring_sqe *submission_queue_entry = io_uring_get_sqe(&ring);
  submission_queue_entry->user_data = 100;
  io_uring_prep_openat(submission_queue_entry, directory_fd, "stuff", O_RDONLY, 0);


  submission_queue_entry = io_uring_get_sqe(&ring);
  submission_queue_entry->user_data = 1000;
  struct statx statx_info;
  io_uring_prep_statx(submission_queue_entry, directory_fd, "stuff", 0, STATX_SIZE, &statx_info);

  //TODO: what does this actually return?
  auto submit_error = io_uring_submit(&ring);
  if (submit_error != 2) {
    std::cerr << strerror(submit_error) << std::endl;
    exit(2);
  }

  int file_fd = -1;
  uint32_t responses = 0;
  while (responses != 2) {
    struct io_uring_cqe *completion_queue_entry = 0;
    auto wait_return = io_uring_wait_cqe(&ring, &completion_queue_entry);
    if (wait_return) {
      std::cerr << "Completion queue wait error. " << std::endl;
      exit(2);
    }

    std::cout << "user data " << completion_queue_entry->user_data << " entry ptr " << completion_queue_entry << " ret " << completion_queue_entry->res << std::endl;
    std::cout << "size " << statx_info.stx_size << std::endl;
    io_uring_cqe_seen(&ring, completion_queue_entry);
    if (completion_queue_entry->res > 0) {
      file_fd = completion_queue_entry->res;
    }
    responses++;
  }


  submission_queue_entry = io_uring_get_sqe(&ring);
  submission_queue_entry->user_data = 66666;
  char buf[1024 * 4];
  io_uring_prep_read(submission_queue_entry, file_fd, buf,  1024 * 4,  0);
  io_uring_submit(&ring);
  struct io_uring_cqe* read_entry = 0;
  auto read_wait_rv = io_uring_wait_cqe(&ring, &read_entry);
  if (read_wait_rv) {
    std::cerr << "Error waiting for read to complete." << std::endl;
    exit(2);
  }
  std::cout << "Read user data " << read_entry->user_data << " completed with " << read_entry->res << std::endl;
  if (read_entry->res < 0) {
    std::cout << "Read error " << strerror(-read_entry->res) << std::endl;
  }
}

Output

I/O uring initialized successfully.
user data 0 entry ptr 0x7f4e3158c140 ret 5
size 1048576
user data 0 entry ptr 0x7f4e3158c150 ret 0
size 1048576
Read user data 0 completed with 4096

What happens if you try setting user_data after calling io_uring_prep_openat()/io_uring_prep_statx()?

I ask because a Google search for io_uring_prep_statx suggests it comes from the liburing library.

Searching the liburing source for io_uring_prep_openat leads us to a definition of io_uring_prep_openat() in liburing.h:

static inline void io_uring_prep_openat(struct io_uring_sqe *sqe, int dfd,
                    const char *path, int flags, mode_t mode)
{
    io_uring_prep_rw(IORING_OP_OPENAT, sqe, dfd, path, mode, 0);
    sqe->open_flags = flags;
}

Searching the liburing source for io_uring_prep_statx leads to a definition of io_uring_prep_statx():

static inline void io_uring_prep_statx(struct io_uring_sqe *sqe, int dfd,
                const char *path, int flags, unsigned mask,
                struct statx *statxbuf)
{
    io_uring_prep_rw(IORING_OP_STATX, sqe, dfd, path, mask,
                (__u64) (unsigned long) statxbuf);
    sqe->statx_flags = flags;
}

Chasing the calls leads us to the definition of io_uring_prep_rw:

static inline void io_uring_prep_rw(int op, struct io_uring_sqe *sqe, int fd,
                    const void *addr, unsigned len,
                    __u64 offset)
{
    sqe->opcode = op;
    sqe->flags = 0;
    sqe->ioprio = 0;
    sqe->fd = fd;
    sqe->off = offset;
    sqe->addr = (unsigned long) addr;
    sqe->len = len;
    sqe->rw_flags = 0;
    sqe->user_data = 0;
    sqe->__pad2[0] = sqe->__pad2[1] = sqe->__pad2[2] = 0;
}
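
To make the effect of the last assignment concrete, here is a minimal sketch that models it without liburing. ToySqe and toy_prep are hypothetical stand-ins written for this illustration (they are not liburing names); toy_prep only imitates the "reset user_data to 0" side effect of the io_uring_prep_rw() shown above.

```cpp
#include <cstdint>

// Hypothetical stand-in for struct io_uring_sqe; only the fields needed here.
struct ToySqe {
    std::uint8_t  opcode = 0;
    std::int32_t  fd = -1;
    std::uint64_t user_data = 0;
};

// Imitates the quoted io_uring_prep_rw(): fills in the request and,
// as a side effect, resets user_data to 0.
inline void toy_prep(ToySqe& sqe, std::uint8_t opcode, std::int32_t fd) {
    sqe.opcode = opcode;
    sqe.fd = fd;
    sqe.user_data = 0;
}

// Order used in the question: tag first, prepare second.
// The prepare step overwrites the tag.
inline std::uint64_t tag_before_prep(std::uint64_t tag) {
    ToySqe sqe;
    sqe.user_data = tag;
    toy_prep(sqe, 1, 3);
    return sqe.user_data;
}

// Suggested order: prepare first, tag second. The tag survives.
inline std::uint64_t tag_after_prep(std::uint64_t tag) {
    ToySqe sqe;
    toy_prep(sqe, 1, 3);
    sqe.user_data = tag;
    return sqe.user_data;
}
```

With this model, tag_before_prep(100) yields 0 and tag_after_prep(100) yields 100, which matches the "user data 0" lines in the question's output.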

PS: I notice you have a comment that says

  //TODO: what does this actually return?
  auto submit_error = io_uring_submit(&ring);

Well, if we search the liburing repo for "int io_uring_submit" we come across the following in src/queue.c:

/*
 * Submit sqes acquired from io_uring_get_sqe() to the kernel.
 *
 * Returns number of sqes submitted
 */
int io_uring_submit(struct io_uring *ring)

This ultimately chains down to the io_uring_enter() syscall (raw man page), so you can read that for more details.
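
As a rough sketch of how that return value could be checked: the helper below assumes the usual liburing convention that a non-negative result is the number of sqes submitted and a negative result is a negated errno value. describe_submit_result is a hypothetical name made up for this example, not part of liburing.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: interprets the value returned by io_uring_submit().
// Assumption: ret >= 0 is the number of sqes submitted, ret < 0 is -errno.
inline std::string describe_submit_result(int ret, int expected) {
    if (ret < 0) {
        // Negate before calling strerror(), which expects a positive errno.
        return std::string("submit failed: ") + std::strerror(-ret);
    }
    if (ret != expected) {
        return "short submit: " + std::to_string(ret) + " of " +
               std::to_string(expected);
    }
    return "ok";
}
```

In the question's program this would be called as describe_submit_result(io_uring_submit(&ring), 2). Note that passing a positive count straight to strerror(), as the question's code does, prints the message for an unrelated errno.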

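Update: the questioner said that moving the assignment solved their problem, so I spent some time thinking about the text they quoted. On further reading I noticed a subtlety (emphasis added):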

user_data is common across op-codes, and is untouched by the kernel. It's simply copied to the completion event, cqe, when a completion event is posted for this request.

There is a similar statement earlier in the document (again, emphasis added):

The cqe contains a user_data field. This field is carried from the initial request submission, and can contain any information that the the application needs to identify said request. One common use case is to have it be the pointer of the original request. The kernel will not touch this field, it's simply carried straight from submission to completion event.

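Those statements apply to the io_uring kernel system calls, but io_uring_prep_openat() / io_uring_prep_statx() are liburing functions. liburing is a userspace helper library, so the statements above about user_data do not have to hold for every liburing function.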

If the field is always zero then how can it be used?

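The field is being zeroed by certain liburing preparation helper functions. In this case it can only be set (and keep its new value) after those helper functions have been called. The io_uring kernel system calls behave as quoted.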
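
For the "pointer of the original request" use case mentioned in the quoted documentation, a minimal sketch of the round trip through the 64-bit user_data value might look like this. Request, to_user_data and from_user_data are hypothetical names for illustration only; liburing also ships io_uring_sqe_set_data() and io_uring_cqe_get_data() for the same purpose.

```cpp
#include <cstdint>

// Hypothetical per-request bookkeeping used to tell completions apart.
struct Request {
    int id;
    const char* what;
};

// Pack a pointer into the 64-bit value stored in sqe->user_data.
inline std::uint64_t to_user_data(Request* req) {
    return static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(req));
}

// Recover the pointer from cqe->user_data once the completion arrives.
inline Request* from_user_data(std::uint64_t user_data) {
    return reinterpret_cast<Request*>(static_cast<std::uintptr_t>(user_data));
}

// Intended call order (not compiled here, since it needs liburing):
//   io_uring_prep_openat(sqe, dfd, "stuff", O_RDONLY, 0);
//   sqe->user_data = to_user_data(&open_req);   // after the prep call
//   ...
//   Request* done = from_user_data(cqe->user_data);
```

The Request object has to stay alive until its completion has been consumed, since only its address travels through the ring.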