AF_XDP: `BPF_MAP_TYPE_XSKMAP` only has entries with `Operation not supported`

This is my complete XDP/BPF kernel code:

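#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/in.h>
#include <linux/ip.h>
#include <linux/udp.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

struct bpf_map_def SEC("maps") xsks_map = {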
    .type = BPF_MAP_TYPE_XSKMAP,
    .key_size = sizeof(int),
    .value_size = sizeof(int),
    .max_entries = 64,  /* Assume netdev has no more than 64 queues */
};

struct bpf_map_def SEC("maps") rx_queue_pckt_counter_map = {
    .type = BPF_MAP_TYPE_ARRAY,
    .key_size = sizeof(int),
    .value_size = sizeof(unsigned long),
    .max_entries = 48,
};

SEC("xdp_sock")
int xdp_sock_prog(struct xdp_md *ctx) {

    int index = ctx->rx_queue_index;

    void *data_end = (void *)(long)ctx->data_end;
    void *data = (void *)(long)ctx->data;

    void *pos = data;
    struct ethhdr *eth = (struct ethhdr*)(pos);

    /* eth + 1 points just past the Ethernet header: pointer arithmetic on a
     * struct pointer is scaled by the struct size, so adding sizeof() here
     * would skip far too many bytes. */
    if((void *)(eth + 1) <= data_end) {

        if(bpf_ntohs(eth->h_proto) == ETH_P_IP) {

            struct iphdr *iph = (struct iphdr*)(pos + sizeof(struct ethhdr));

            if((void *)(iph + 1) <= data_end) {

                if(iph->protocol == IPPROTO_UDP) {

                    const __u16 iph_sz_in_bytes = iph->ihl * 4;

                    /* Cast to void * so the header length is added in bytes. */
                    if((void *)iph + iph_sz_in_bytes <= data_end) {
                        struct udphdr *udh = (struct udphdr*)(pos + sizeof(struct ethhdr) + iph_sz_in_bytes);

                        if((void *)(udh + 1) <= data_end) {

                            void *rec = bpf_map_lookup_elem(&rx_queue_pckt_counter_map, &index);
                            if(rec) {
                                long *pckt_counter_val = (long*)(rec);
                                *pckt_counter_val += 1;
                            } else {
                                return XDP_PASS;
                            }

                            if (bpf_map_lookup_elem(&xsks_map, &index)) {

                                const int ret_val = bpf_redirect_map(&xsks_map, index, 0);
                                bpf_printk("RET-VAL: %d\n", ret_val);
                                return ret_val;
                            }
                        }
                    }
                }
            }
        }
    }

    return XDP_PASS;
}

char _license[] SEC("license") = "GPL";

I am trying to filter all IP/UDP packets and send them to user space. I also count the number of packets arriving on each RX queue (indicated by ctx->rx_queue_index).
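A filter like this has to confirm that each header lies fully inside the packet before reading it. The following is only a user-space sketch of that check over a plain byte buffer, counting every offset in bytes. The function name is_ipv4_udp and the constants are illustrative and are not part of the XDP program.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    ETH_HDR_LEN   = 14,     /* destination MAC + source MAC + EtherType */
    ETHERTYPE_IP4 = 0x0800, /* EtherType for IPv4 */
    IP4_MIN_HDR   = 20,     /* IPv4 header without options */
    IP_PROTO_UDP  = 17,
    UDP_HDR_LEN   = 8
};

/* Returns 1 if [data, data_end) holds a complete Ethernet + IPv4 + UDP
 * header chain, 0 otherwise. All offsets are counted in bytes. */
static int is_ipv4_udp(const uint8_t *data, const uint8_t *data_end)
{
    size_t len = (size_t)(data_end - data);

    if (len < ETH_HDR_LEN)
        return 0;
    /* EtherType is stored big-endian at bytes 12..13. */
    if (((data[12] << 8) | data[13]) != ETHERTYPE_IP4)
        return 0;

    const uint8_t *ip = data + ETH_HDR_LEN;
    if (len < ETH_HDR_LEN + IP4_MIN_HDR)
        return 0;
    if (ip[9] != IP_PROTO_UDP)               /* protocol field */
        return 0;

    size_t ihl = (size_t)(ip[0] & 0x0f) * 4; /* header length in bytes */
    if (ihl < IP4_MIN_HDR)
        return 0;
    if (len < ETH_HDR_LEN + ihl + UDP_HDR_LEN)
        return 0;

    return 1;
}
```

With a 20-byte IPv4 header, the shortest buffer this accepts is 14 + 20 + 8 = 42 bytes.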

My program compiles fine, but for some reason I do not receive any packets in my user-space program. I have already discussed this in another post of mine: AF_XDP: No packets from multicast although steered on RX-Queue 0

Beforehand, I ran sudo ethtool -N eth20 flow-type udp4 action 0 to steer all packets to RX-Queue 0.

I can list all currently active BPF maps with:

$ sudo bpftool map list
32: lpm_trie  flags 0x1
        key 8B  value 8B  max_entries 1  memlock 4096B
33: lpm_trie  flags 0x1
        key 20B  value 8B  max_entries 1  memlock 4096B
34: lpm_trie  flags 0x1
        key 8B  value 8B  max_entries 1  memlock 4096B
35: lpm_trie  flags 0x1
        key 20B  value 8B  max_entries 1  memlock 4096B
36: lpm_trie  flags 0x1
        key 8B  value 8B  max_entries 1  memlock 4096B
37: lpm_trie  flags 0x1
        key 20B  value 8B  max_entries 1  memlock 4096B
125: array  name rx_queue_pckt_c  flags 0x0
        key 4B  value 8B  max_entries 48  memlock 4096B
126: xskmap  name xsks_map  flags 0x0
        key 4B  value 4B  max_entries 64  memlock 4096B

But I think only 125 and 126 are related to my program.

The queue steering works, because with sudo bpftool map dump id 125 I get:

key: 00 00 00 00  value: 99 1a cc 04 00 00 00 00
key: 01 00 00 00  value: 00 00 00 00 00 00 00 00
key: 02 00 00 00  value: 00 00 00 00 00 00 00 00
key: 03 00 00 00  value: 00 00 00 00 00 00 00 00
key: 04 00 00 00  value: 00 00 00 00 00 00 00 00
key: 05 00 00 00  value: 00 00 00 00 00 00 00 00
key: 06 00 00 00  value: 00 00 00 00 00 00 00 00
key: 07 00 00 00  value: 00 00 00 00 00 00 00 00
key: 08 00 00 00  value: 00 00 00 00 00 00 00 00
key: 09 00 00 00  value: 00 00 00 00 00 00 00 00
key: 0a 00 00 00  value: 00 00 00 00 00 00 00 00
key: 0b 00 00 00  value: 00 00 00 00 00 00 00 00
key: 0c 00 00 00  value: 00 00 00 00 00 00 00 00
key: 0d 00 00 00  value: 00 00 00 00 00 00 00 00
key: 0e 00 00 00  value: 00 00 00 00 00 00 00 00
key: 0f 00 00 00  value: 00 00 00 00 00 00 00 00
key: 10 00 00 00  value: 00 00 00 00 00 00 00 00
key: 11 00 00 00  value: 00 00 00 00 00 00 00 00
key: 12 00 00 00  value: 00 00 00 00 00 00 00 00
key: 13 00 00 00  value: 00 00 00 00 00 00 00 00
key: 14 00 00 00  value: 00 00 00 00 00 00 00 00
key: 15 00 00 00  value: 00 00 00 00 00 00 00 00
key: 16 00 00 00  value: 00 00 00 00 00 00 00 00
key: 17 00 00 00  value: 00 00 00 00 00 00 00 00
key: 18 00 00 00  value: 00 00 00 00 00 00 00 00
key: 19 00 00 00  value: 00 00 00 00 00 00 00 00
key: 1a 00 00 00  value: 00 00 00 00 00 00 00 00
key: 1b 00 00 00  value: 00 00 00 00 00 00 00 00
key: 1c 00 00 00  value: 00 00 00 00 00 00 00 00
key: 1d 00 00 00  value: 00 00 00 00 00 00 00 00
key: 1e 00 00 00  value: 00 00 00 00 00 00 00 00
key: 1f 00 00 00  value: 00 00 00 00 00 00 00 00
key: 20 00 00 00  value: 00 00 00 00 00 00 00 00
key: 21 00 00 00  value: 00 00 00 00 00 00 00 00
key: 22 00 00 00  value: 00 00 00 00 00 00 00 00
key: 23 00 00 00  value: 00 00 00 00 00 00 00 00
key: 24 00 00 00  value: 00 00 00 00 00 00 00 00
key: 25 00 00 00  value: 00 00 00 00 00 00 00 00
key: 26 00 00 00  value: 00 00 00 00 00 00 00 00
key: 27 00 00 00  value: 00 00 00 00 00 00 00 00
key: 28 00 00 00  value: 00 00 00 00 00 00 00 00
key: 29 00 00 00  value: 00 00 00 00 00 00 00 00
key: 2a 00 00 00  value: 00 00 00 00 00 00 00 00
key: 2b 00 00 00  value: 00 00 00 00 00 00 00 00
key: 2c 00 00 00  value: 00 00 00 00 00 00 00 00
key: 2d 00 00 00  value: 00 00 00 00 00 00 00 00
key: 2e 00 00 00  value: 00 00 00 00 00 00 00 00
key: 2f 00 00 00  value: 00 00 00 00 00 00 00 00
Found 48 elements

As you can see, only the counter for RX-Queue 0 is greater than 0.
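To read the counter as a number, the dumped bytes have to be assembled least-significant byte first. This is a small sketch that assumes the dump comes from a little-endian host such as x86-64; decode_le is a hypothetical helper name, not something bpftool provides.

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble an unsigned integer from `len` bytes stored least-significant
 * byte first, which is the order the dump shows on a little-endian host.
 * At most 8 bytes are consumed. */
static uint64_t decode_le(const uint8_t *bytes, size_t len)
{
    uint64_t value = 0;
    for (size_t i = 0; i < len && i < 8; i++)
        value |= (uint64_t)bytes[i] << (8 * i);
    return value;
}
```

Under that assumption, the value bytes 99 1a cc 04 00 00 00 00 for key 0 decode to 0x04cc1a99, which is 80485017 packets, and the last key 2f 00 00 00 is index 47.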

However, if I look at the BPF_MAP_TYPE_XSKMAP (used to transfer packets to user space), I get:

$ sudo bpftool map dump id 126
key:
00 00 00 00
value:
Operation not supported
key:
01 00 00 00
value:
Operation not supported
key:
02 00 00 00
value:
Operation not supported
key:
03 00 00 00
value:
Operation not supported
key:
04 00 00 00
value:
Operation not supported
...
key:
3e 00 00 00
value:
Operation not supported
key:
3f 00 00 00
value:
Operation not supported
Found 0 elements

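Does the message Operation not supported indicate why I do not receive any packets in my user-space program? Or is it simply not possible to read the values at runtime? Seeing Found 0 elements also strikes me as odd.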

Any idea what is going wrong here?

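This is simply because maps of type BPF_MAP_TYPE_XSKMAP do not support lookups from user space (you would get a kernel-space address, which makes little sense from a user-space point of view and might be a security concern).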

So because the attempted lookup returns -EOPNOTSUPP, bpftool is unable to show the values. It could error out and print nothing, but instead we made it print the keys it finds, and print the error messages we get for the values.

As for Found 0 elements, the count only covers elements that bpftool could retrieve without error, so it makes sense that it stays at zero in this case.
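The behaviour described above can be modelled roughly as follows. This is a simplified sketch, not bpftool's actual source: the callback type lookup_fn, the two stand-in lookup functions, and dump_map are all hypothetical names, and the real tool walks the keys through the BPF syscall rather than a plain loop.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical lookup callback: returns 0 on success, -errno on failure. */
typedef int (*lookup_fn)(unsigned int key, unsigned int *value);

/* Stand-in for a map type that rejects user-space lookups. */
static int lookup_unsupported(unsigned int key, unsigned int *value)
{
    (void)key;
    (void)value;
    return -EOPNOTSUPP;
}

/* Stand-in for a map type that allows lookups, such as an array map. */
static int lookup_ok(unsigned int key, unsigned int *value)
{
    *value = key;
    return 0;
}

/* Walk keys 0..max_entries-1, print each key, and print either the value
 * or the error text. Only successful lookups are counted. */
static unsigned int dump_map(unsigned int max_entries, lookup_fn lookup)
{
    unsigned int found = 0;

    for (unsigned int key = 0; key < max_entries; key++) {
        unsigned int value = 0;
        int err = lookup(key, &value);

        printf("key: %u\n", key);
        if (err) {
            printf("value:\n%s\n", strerror(-err));
            continue;
        }
        printf("value: %u\n", value);
        found++;
    }
    printf("Found %u elements\n", found);
    return found;
}
```

Calling dump_map(64, lookup_unsupported) prints all 64 keys, each followed by the error text, and ends with Found 0 elements, because no lookup succeeded.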

So nothing seems to be wrong in your case, and I do not think this output is related to your missing-packets issue.