DPDK sample application aborts after "EAL: Couldn't get fd on hugepage file"

After cloning the dpdk git repository and building the helloworld application, I get the following error:

$ ./examples/helloworld/build/helloworld
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /run/user/1000/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: No available hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: Couldn't get fd on hugepage file
EAL: error allocating rte services array
EAL: FATAL: rte_service_init() failed
EAL: rte_service_init() failed
PANIC in main():
Cannot init EAL
5: [./examples/helloworld/build/helloworld(+0x11de) [0x56175faac1de]]
4: [/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7f31f60fe0b3]]
3: [./examples/helloworld/build/helloworld(+0x111c) [0x56175faac11c]]
2: [/lib/x86_64-linux-gnu/librte_eal.so.20.0(__rte_panic+0xc5) [0x7f31f62d537e]]
1: [/lib/x86_64-linux-gnu/librte_eal.so.20.0(rte_dump_stack+0x32) [0x7f31f62ecc52]]
Aborted (core dumped)

I checked hugepage support, and it seems fine:

$ cat /proc/meminfo | grep -i huge
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:     256
HugePages_Free:      255
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:          524288 kB

$ mount | grep huge
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)

$ cat /proc/filesystems | grep huge
nodev   hugetlbfs

$ cat /proc/sys/vm/nr_hugepages
256
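
The "No available hugepages reported in hugepages-1048576kB" line appears to mean only that no 1 GB pages are reserved; the 2 MB pool is the one that matters here. As a rough sketch, the per-size pools can be listed from sysfs. The function name report_hugepages is made up for illustration, and the default path assumes the usual /sys/kernel/mm/hugepages layout.

```shell
#!/bin/sh
# Hypothetical helper: print total and free hugepages for every page size
# found under the given sysfs directory (assumed default below).
report_hugepages() {
    base="${1:-/sys/kernel/mm/hugepages}"
    for d in "$base"/hugepages-*; do
        # Skip the unexpanded glob if no per-size directories exist.
        [ -d "$d" ] || continue
        printf '%s total=%s free=%s\n' \
            "$(basename "$d")" \
            "$(cat "$d/nr_hugepages")" \
            "$(cat "$d/free_hugepages")"
    done
}

report_hugepages
```

A size with total=0 is simply an empty pool, which on its own should not stop EAL from using another page size.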

I saw a workaround in a related question: running it with the --no-huge option works:

$ ./examples/helloworld/build/helloworld --no-huge
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Static memory layout is selected, amount of reserved memory can be adjusted with -m or --socket-mem
EAL: Multi-process socket /run/user/1000/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:02:01.0 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 8086:100f net_e1000_em
EAL: PCI device 0000:03:00.0 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 15ad:7b0 net_vmxnet3
hello from core 1
hello from core 2
hello from core 3
hello from core 0

But this is a limited solution.

TL;DR: use sudo.

Running with --log-level=eal,8, as suggested by @VipinVarghese, shows that this is a permissions issue:

$ ./examples/helloworld/build/helloworld --log-level=eal,8
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 0 on socket 0
EAL: Detected lcore 2 as core 0 on socket 0
EAL: Detected lcore 3 as core 0 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: open shared lib /usr/lib/x86_64-linux-gnu/dpdk/pmds-20.0/librte_pmd_qede.so
EAL: Registered [vdev] bus.
EAL: Registered [pci] bus.
EAL: Registered [eth] device class.
EAL: open shared lib /usr/lib/x86_64-linux-gnu/dpdk/pmds-20.0/librte_pmd_aesni_mb.so
...
EAL: Ask a virtual area of 0x61000 bytes
EAL: Virtual area found at 0xd00600000 (size = 0x61000)
EAL: Memseg list allocated: 0x800kB at socket 0
EAL: Ask a virtual area of 0x400000000 bytes
EAL: Virtual area found at 0xd00800000 (size = 0x400000000)
EAL: TSC frequency is ~2590000 KHz
EAL: Master lcore 0 is ready (tid=7fc11ed01d00;cpuset=[0])
EAL: lcore 2 is ready (tid=7fc116ffd700;cpuset=[2])
EAL: lcore 3 is ready (tid=7fc1167fc700;cpuset=[3])
EAL: lcore 1 is ready (tid=7fc1177fe700;cpuset=[1])
EAL: Trying to obtain current memory policy.
EAL: Setting policy MPOL_PREFERRED for socket 0
EAL: get_seg_fd(): open failed: Permission denied
EAL: Couldn't get fd on hugepage file
EAL: attempted to allocate 1 segments, but only 0 were allocated
EAL: Restoring previous memory policy: 0
EAL: error allocating rte services array
EAL: FATAL: rte_service_init() failed
EAL: rte_service_init() failed
PANIC in main():
Cannot init EAL
5: [./examples/helloworld/build/helloworld(+0x11de) [0x56459e5391de]]
4: [/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7fc11eed00b3]]
3: [./examples/helloworld/build/helloworld(+0x111c) [0x56459e53911c]]
2: [/lib/x86_64-linux-gnu/librte_eal.so.20.0(__rte_panic+0xc5) [0x7fc11f0a737e]]
1: [/lib/x86_64-linux-gnu/librte_eal.so.20.0(rte_dump_stack+0x32) [0x7fc11f0bec52]]
Aborted (core dumped)
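
The "get_seg_fd(): open failed: Permission denied" line suggests the process could not create its backing file inside the hugetlbfs mount. One quick way to confirm this, sketched below, is to test whether the current user may write to the mount point. The helper name can_use_hugepage_dir is hypothetical; /dev/hugepages is the mount point shown in the mount output earlier.

```shell
#!/bin/sh
# Hypothetical helper: report whether the current user can create files
# in the given hugetlbfs mount point.
can_use_hugepage_dir() {
    dir="$1"
    # Creating a file needs write and search (execute) permission on the directory.
    if [ -d "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]; then
        echo "writable: $dir"
        return 0
    fi
    echo "not writable by $(id -un): $dir"
    return 1
}

# Mount point taken from the question; "|| true" keeps the script's exit status clean.
can_use_hugepage_dir /dev/hugepages || true
```

If this reports "not writable", the failure above is expected for an unprivileged user, regardless of how many hugepages are free.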

I tried to fix the permissions issue (EAL: get_seg_fd(): open failed: Permission denied), but it only works when I run it as root:

$ sudo ./examples/helloworld/build/helloworld
EAL: Detected 4 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: No available hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:02:01.0 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 8086:100f net_e1000_em
EAL: PCI device 0000:03:00.0 on NUMA socket -1
EAL:   Invalid NUMA socket, default to 0
EAL:   probe driver: 15ad:7b0 net_vmxnet3
hello from core 1
hello from core 2
hello from core 3
hello from core 0

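It turns out this is the right way to do it, although the documentation seems to treat it as obvious. Section "6.2. Running a Sample Application" makes no mention of the required root privileges. An excerpt: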

Copy the DPDK application binary to your target, then run the application as follows (assuming the platform has four memory channels per processor socket, and that cores 0-3 are present and are to be used for running the application):

./dpdk-helloworld -l 0-3 -n 4

It is, however, mentioned later in the documentation, in "8.2. Running DPDK Applications Without Root Privileges", which carries an explicit note:

The instructions below will allow running DPDK as non-root with older Linux kernel versions. However, since version 4.0, the kernel does not allow unprivileged processes to read the physical address information from the pagemaps file, making it impossible for those processes to use HW devices which require physical addresses
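
For the hugepage part alone, one approach that may work is a hugetlbfs mount owned by the unprivileged user, handed to EAL through its --huge-dir option. The sketch below only prints the mount command instead of running it. The helper name hugetlbfs_mount_cmd, the mount point /mnt/huge-user, and the 0700 mode are assumptions for illustration, and, as the note above says, this does not help with devices that need physical addresses.

```shell
#!/bin/sh
# Hypothetical helper: build (but do not run) a mount command for a
# hugetlbfs instance owned by the given user and group.
hugetlbfs_mount_cmd() {
    mnt="$1"
    uid="$2"
    gid="$3"
    # uid=, gid=, mode= and pagesize= are hugetlbfs mount options;
    # 2M matches the Hugepagesize reported in /proc/meminfo above.
    printf 'mount -t hugetlbfs -o uid=%s,gid=%s,mode=0700,pagesize=2M nodev %s\n' \
        "$uid" "$gid" "$mnt"
}

# Print the command for the current user; /mnt/huge-user is an assumed path.
hugetlbfs_mount_cmd /mnt/huge-user "$(id -u)" "$(id -g)"
```

The printed command would have to be run once as root after creating the directory; the application could then be started with --huge-dir /mnt/huge-user. Whether that is sufficient depends on the DPDK version and on the devices being probed.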

There is also a hint in the FAQ:

  1. What does “EAL: map_all_hugepages(): open failed: Permission denied Cannot init memory” mean? This is most likely due to the test application not being run with sudo to promote the user to a superuser. Alternatively, applications can also be run as regular user. For more information, please refer to DPDK Getting Started Guide.

There is also an email that touches on this topic:

2016-07-07 16:47, Jez Higgins:
> Is it possible to get DPDK up and running as non-root - if so, can
> anyone guide me to what I'm missing? Or should I be giving this up as a
> bad job?

You can try the --no-huge option.
But most of drivers won't work without hugepage currently.
A rework of the memory allocation is needed to make it work better.

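That was four years ago. Perhaps there is already a solution that requires neither sudo nor --no-huge? If so, other answers are welcome. For now, this is what I am going with.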

@Nagev I would ask you to check the "dpdk as non root" Stack Overflow question from Nov 2020.

[EDIT-1] I noticed that the question above has been deleted, so access to the details is limited; updating the answer with a "how to run without sudo or root privilege" section.

Note: I have been running DPDK applications as a non-root user with 18.11.5 LTS and 19.11.3 LTS.