DPDK for general purpose workload

I have deployed OpenStack on my compute nodes and configured OVS-DPDK for high-performance networking. My workloads are general-purpose ones, such as running haproxy, mysql, apache, and XMPP.

When I run load tests I find the performance mediocre, and I notice packet drops after a 200 kpps packet rate. I have heard that DPDK can handle millions of packets, but in my case that is clearly not happening. In the guest I am using virtio-net, which processes packets in the kernel, so I believe my bottleneck is the guest VM.

I do not run any DPDK application inside the guests, such as testpmd. Does that mean OVS+DPDK is of no use for my cloud? How can I take advantage of OVS+DPDK with general-purpose workloads?
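A common way to let a kernel-virtio guest keep up with an OVS-DPDK host is to give it dedicated, hugepage-backed vCPUs and enable virtio multiqueue end to end. A minimal sketch, assuming a flavor named m1.dpdk, an image named guest-image, eth0 in the guest, and 4 vCPUs (all names and counts are placeholders):

    # controller side: dedicated vCPUs, hugepage backing, multiqueue vifs (names are examples)
    openstack flavor set m1.dpdk --property hw:cpu_policy=dedicated --property hw:mem_page_size=large
    openstack image set guest-image --property hw_vif_multiqueue_enabled=true

    # guest side: spread RX/TX processing over the exposed virtio queue pairs
    ethtool -L eth0 combined 4

hw:mem_page_size=large is generally required anyway so that vhost-user ports can share guest memory with OVS-DPDK, and hw_vif_multiqueue_enabled gives the guest one virtio queue pair per vCPU.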

Update

We have our own load-testing tool that generates audio RTP traffic, plain UDP with 150-byte packets, and at 200 kpps we noticed the audio quality degrading and becoming choppy. In short, the DPDK host hits high PMD CPU usage and the load test shows poor audio quality. When I ran the same test against an SRIOV-based VM, the performance was really very good.
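For reference, roughly the same offered load can be approximated with iperf3 in UDP mode, since 200 kpps of 150-byte packets is about 240 Mbit/s; the IP address below is a placeholder, and the in-house RTP tool is of course not identical to iperf3 traffic:

    # receiver
    iperf3 -s
    # sender: 150-byte UDP datagrams at ~240 Mbit/s ≈ 200 kpps
    iperf3 -c 10.0.0.10 -u -l 150 -b 240M -t 60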

$ ovs-vswitchd -V
ovs-vswitchd (Open vSwitch) 2.13.3
DPDK 19.11.7

Intel NIC X550T

# ethtool -i ext0
driver: ixgbe
version: 5.1.0-k
firmware-version: 0x80000d63, 18.8.9
expansion-rom-version:
bus-info: 0000:3b:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

In the output below, what are these queue-id: 0 to 8, and why is only the first queue in use while the others are never used and stay at zero? What does this mean?

ovs-appctl dpif-netdev/pmd-rxq-show
pmd thread numa_id 0 core_id 2:
  isolated : false
  port: vhu1c3bf17a-01    queue-id:  0 (enabled)   pmd usage:  0 %
  port: vhu1c3bf17a-01    queue-id:  1 (enabled)   pmd usage:  0 %
  port: vhu6b7daba9-1a    queue-id:  2 (disabled)  pmd usage:  0 %
  port: vhu6b7daba9-1a    queue-id:  3 (disabled)  pmd usage:  0 %
pmd thread numa_id 1 core_id 3:
  isolated : false
pmd thread numa_id 0 core_id 22:
  isolated : false
  port: vhu1c3bf17a-01    queue-id:  3 (enabled)   pmd usage:  0 %
  port: vhu1c3bf17a-01    queue-id:  6 (enabled)   pmd usage:  0 %
  port: vhu6b7daba9-1a    queue-id:  0 (enabled)   pmd usage: 54 %
  port: vhu6b7daba9-1a    queue-id:  5 (disabled)  pmd usage:  0 %
pmd thread numa_id 1 core_id 23:
  isolated : false
  port: dpdk1             queue-id:  0 (enabled)   pmd usage:  3 %
pmd thread numa_id 0 core_id 26:
  isolated : false
  port: vhu1c3bf17a-01    queue-id:  2 (enabled)   pmd usage:  0 %
  port: vhu1c3bf17a-01    queue-id:  7 (enabled)   pmd usage:  0 %
  port: vhu6b7daba9-1a    queue-id:  1 (disabled)  pmd usage:  0 %
  port: vhu6b7daba9-1a    queue-id:  4 (disabled)  pmd usage:  0 %
pmd thread numa_id 1 core_id 27:
  isolated : false
pmd thread numa_id 0 core_id 46:
  isolated : false
  port: dpdk0             queue-id:  0 (enabled)   pmd usage:  27 %
  port: vhu1c3bf17a-01    queue-id:  4 (enabled)   pmd usage:  0 %
  port: vhu1c3bf17a-01    queue-id:  5 (enabled)   pmd usage:  0 %
  port: vhu6b7daba9-1a    queue-id:  6 (disabled)  pmd usage:  0 %
  port: vhu6b7daba9-1a    queue-id:  7 (disabled)  pmd usage:  0 %
pmd thread numa_id 1 core_id 47:
  isolated : false
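For reference, the queue-to-PMD mapping shown above can be steered explicitly per port, or rebalanced across the existing PMD threads; a sketch where the core IDs are examples only:

    # pin dpdk0 rxq 0 to core 2 and the busy vhost rxq 0 to core 22 (example core ids)
    ovs-vsctl set Interface dpdk0 other_config:pmd-rxq-affinity="0:2"
    ovs-vsctl set Interface vhu6b7daba9-1a other_config:pmd-rxq-affinity="0:22"
    # or let OVS redistribute the queues over the current PMD set based on measured load
    ovs-appctl dpif-netdev/pmd-rxq-rebalance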


$ ovs-appctl dpif-netdev/pmd-stats-clear && sleep 10 && ovs-appctl dpif-netdev/pmd-stats-show | grep "processing cycles:"
  processing cycles: 1697952 (0.01%)
  processing cycles: 12726856558 (74.96%)
  processing cycles: 4259431602 (19.40%)
  processing cycles: 512666 (0.00%)
  processing cycles: 6324848608 (37.81%)

Do these processing cycles mean my PMDs are under pressure, even though I am only pushing about 200 kpps?
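One way to see whether that busy PMD is actually dropping traffic is to inspect the statistics of the guest-facing port it is polling (port name taken from the rxq listing above):

    # look for any drop counters on the busy vhost-user port
    ovs-vsctl get Interface vhu6b7daba9-1a statistics | tr ',' '\n' | grep -i drop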

Here are my dpdk0 and dpdk1 port statistics:

sudo ovs-vsctl get Interface dpdk0 statistics
{flow_director_filter_add_errors=153605,
flow_director_filter_remove_errors=30829, mac_local_errors=0,
mac_remote_errors=0, ovs_rx_qos_drops=0, ovs_tx_failure_drops=0,
ovs_tx_invalid_hwol_drops=0, ovs_tx_mtu_exceeded_drops=0,
ovs_tx_qos_drops=0, rx_128_to_255_packets=64338613,
rx_1_to_64_packets=367, rx_256_to_511_packets=116298,
rx_512_to_1023_packets=31264, rx_65_to_127_packets=6990079,
rx_broadcast_packets=0, rx_bytes=12124930385, rx_crc_errors=0,
rx_dropped=0, rx_errors=12, rx_fcoe_crc_errors=0, rx_fcoe_dropped=12,
rx_fcoe_mbuf_allocation_errors=0, rx_fragment_errors=367,
rx_illegal_byte_errors=0, rx_jabber_errors=0, rx_length_errors=0,
rx_mac_short_packet_dropped=128, rx_management_dropped=35741,
rx_management_packets=31264, rx_mbuf_allocation_errors=0,
rx_missed_errors=0, rx_oversize_errors=0, rx_packets=71512362,
rx_priority0_dropped=0, rx_priority0_mbuf_allocation_errors=1096,
rx_priority1_dropped=0, rx_priority1_mbuf_allocation_errors=0,
rx_priority2_dropped=0, rx_priority2_mbuf_allocation_errors=0,
rx_priority3_dropped=0, rx_priority3_mbuf_allocation_errors=0,
rx_priority4_dropped=0, rx_priority4_mbuf_allocation_errors=0,
rx_priority5_dropped=0, rx_priority5_mbuf_allocation_errors=0,
rx_priority6_dropped=0, rx_priority6_mbuf_allocation_errors=0,
rx_priority7_dropped=0, rx_priority7_mbuf_allocation_errors=0,
rx_undersize_errors=6990079, tx_128_to_255_packets=64273778,
tx_1_to_64_packets=128, tx_256_to_511_packets=43670294,
tx_512_to_1023_packets=153605, tx_65_to_127_packets=881272,
tx_broadcast_packets=10, tx_bytes=25935295292, tx_dropped=0,
tx_errors=0, tx_management_packets=0, tx_multicast_packets=153,
tx_packets=109009906}

dpdk1 statistics:

sudo ovs-vsctl get Interface dpdk1 statistics
{flow_director_filter_add_errors=126793,
flow_director_filter_remove_errors=37969, mac_local_errors=0,
mac_remote_errors=0, ovs_rx_qos_drops=0, ovs_tx_failure_drops=0,
ovs_tx_invalid_hwol_drops=0, ovs_tx_mtu_exceeded_drops=0,
ovs_tx_qos_drops=0, rx_128_to_255_packets=64435459,
rx_1_to_64_packets=107843, rx_256_to_511_packets=230,
rx_512_to_1023_packets=13, rx_65_to_127_packets=7049788,
rx_broadcast_packets=199058, rx_bytes=12024342488, rx_crc_errors=0,
rx_dropped=0, rx_errors=11, rx_fcoe_crc_errors=0, rx_fcoe_dropped=11,
rx_fcoe_mbuf_allocation_errors=0, rx_fragment_errors=107843,
rx_illegal_byte_errors=0, rx_jabber_errors=0, rx_length_errors=0,
rx_mac_short_packet_dropped=1906, rx_management_dropped=0,
rx_management_packets=13, rx_mbuf_allocation_errors=0,
rx_missed_errors=0, rx_oversize_errors=0, rx_packets=71593333,
rx_priority0_dropped=0, rx_priority0_mbuf_allocation_errors=1131,
rx_priority1_dropped=0, rx_priority1_mbuf_allocation_errors=0,
rx_priority2_dropped=0, rx_priority2_mbuf_allocation_errors=0,
rx_priority3_dropped=0, rx_priority3_mbuf_allocation_errors=0,
rx_priority4_dropped=0, rx_priority4_mbuf_allocation_errors=0,
rx_priority5_dropped=0, rx_priority5_mbuf_allocation_errors=0,
rx_priority6_dropped=0, rx_priority6_mbuf_allocation_errors=0,
rx_priority7_dropped=0, rx_priority7_mbuf_allocation_errors=0,
rx_undersize_errors=7049788, tx_128_to_255_packets=102664472,
tx_1_to_64_packets=1906, tx_256_to_511_packets=68008814,
tx_512_to_1023_packets=126793, tx_65_to_127_packets=1412435,
tx_broadcast_packets=1464, tx_bytes=40693963125, tx_dropped=0,
tx_errors=0, tx_management_packets=199058, tx_multicast_packets=146,
tx_packets=172252389}

Update - 2

DPDK interfaces

  # dpdk-devbind.py -s
    
    Network devices using DPDK-compatible driver
    ============================================
    0000:3b:00.1 'Ethernet Controller 10G X550T 1563' drv=vfio-pci unused=ixgbe
    0000:af:00.1 'Ethernet Controller 10G X550T 1563' drv=vfio-pci unused=ixgbe
    
    Network devices using kernel driver
    ===================================
    0000:04:00.0 'NetXtreme BCM5720 2-port Gigabit Ethernet PCIe 165f' if=eno1 drv=tg3 unused=vfio-pci
    0000:04:00.1 'NetXtreme BCM5720 2-port Gigabit Ethernet PCIe 165f' if=eno2 drv=tg3 unused=vfio-pci
    0000:3b:00.0 'Ethernet Controller 10G X550T 1563' if=int0 drv=ixgbe unused=vfio-pci
    0000:af:00.0 'Ethernet Controller 10G X550T 1563' if=int1 drv=ixgbe unused=vfio-pci

OVS

# ovs-vsctl show
595103ef-55a1-4f71-b299-a14942965e75
    Manager "ptcp:6640:127.0.0.1"
        is_connected: true
    Bridge br-tun
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: netdev
        Port br-tun
            Interface br-tun
                type: internal
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port vxlan-0a48042b
            Interface vxlan-0a48042b
                type: vxlan
                options: {df_default="true", egress_pkt_mark="0", in_key=flow, local_ip="10.72.4.44", out_key=flow, remote_ip="10.72.4.43"}
        Port vxlan-0a480429
            Interface vxlan-0a480429
                type: vxlan
                options: {df_default="true", egress_pkt_mark="0", in_key=flow, local_ip="10.72.4.44", out_key=flow, remote_ip="10.72.4.41"}
        Port vxlan-0a48041f
            Interface vxlan-0a48041f
                type: vxlan
                options: {df_default="true", egress_pkt_mark="0", in_key=flow, local_ip="10.72.4.44", out_key=flow, remote_ip="10.72.4.31"}
        Port vxlan-0a48042a
            Interface vxlan-0a48042a
                type: vxlan
                options: {df_default="true", egress_pkt_mark="0", in_key=flow, local_ip="10.72.4.44", out_key=flow, remote_ip="10.72.4.42"}
    Bridge br-vlan
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: netdev
        Port br-vlan
            Interface br-vlan
                type: internal
        Port dpdkbond
            Interface dpdk1
                type: dpdk
                options: {dpdk-devargs="0000:af:00.1", n_txq_desc="2048"}
            Interface dpdk0
                type: dpdk
                options: {dpdk-devargs="0000:3b:00.1", n_txq_desc="2048"}
        Port phy-br-vlan
            Interface phy-br-vlan
                type: patch
                options: {peer=int-br-vlan}
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        datapath_type: netdev
        Port vhu87cf49d2-5b
            tag: 7
            Interface vhu87cf49d2-5b
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/lib/vhost_socket/vhu87cf49d2-5b"}
        Port vhub607c1fa-ec
            tag: 7
            Interface vhub607c1fa-ec
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/lib/vhost_socket/vhub607c1fa-ec"}
        Port vhu9a035444-83
            tag: 8
            Interface vhu9a035444-83
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/lib/vhost_socket/vhu9a035444-83"}
        Port br-int
            Interface br-int
                type: internal
        Port int-br-vlan
            Interface int-br-vlan
                type: patch
                options: {peer=phy-br-vlan}
        Port vhue00471df-d8
            tag: 8
            Interface vhue00471df-d8
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/lib/vhost_socket/vhue00471df-d8"}
        Port vhu683fdd35-91
            tag: 7
            Interface vhu683fdd35-91
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/lib/vhost_socket/vhu683fdd35-91"}
        Port vhuf04fb2ec-ec
            tag: 8
            Interface vhuf04fb2ec-ec
                type: dpdkvhostuserclient
                options: {vhost-server-path="/var/lib/vhost_socket/vhuf04fb2ec-ec"}
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
    ovs_version: "2.13.3"

I created the guest VMs with OpenStack, and I can see they are attached through vhost-user sockets (for example: /var/lib/vhost_socket/vhuf04fb2ec-ec).
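If needed, the vhost-user backing and the number of negotiated queues can be cross-checked in the libvirt domain XML; <instance-name> below is a placeholder for the Nova instance's libvirt name:

    # confirm the vif is a vhost-user interface and how many queues it exposes
    virsh dumpxml <instance-name> | grep -B2 -A6 vhostuser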

When I run load tests I find the performance mediocre, and I notice packet drops after a 200 kpps packet rate. In short, the DPDK host hits high PMD CPU usage and the load test shows poor audio quality. When I run the same test over SR-IOV, the performance is really very good.

[Answer] Based on the live debugging carried out so far, this observation is not correct. Here is why:

  1. The launched qemu was not pinned to specific cores.
  2. Comparing PCIe passthrough (VF) against vhost-client is not an apples-to-apples comparison.
  3. With the OpenStack setup, a packet crosses at least 3 bridges before it reaches the VM.
  4. The OVS threads were not pinned, so all PMD threads ended up running on the same core at every bridge stage, causing latency and drops (a pinning sketch follows this list).
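A minimal pinning sketch for points 1 and 4, assuming the masks, core IDs, and domain name below are placeholders to be adapted to the actual CPU layout:

    # give OVS-DPDK dedicated PMD cores and keep the other OVS/DPDK threads elsewhere
    ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xC       # cores 2-3 (example)
    ovs-vsctl set Open_vSwitch . other_config:dpdk-lcore-mask=0x2    # core 1 (example)
    # pin the guest vCPUs to cores that are not used by the PMDs
    virsh vcpupin <instance-name> 0 4
    virsh vcpupin <instance-name> 1 5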

To make a fair comparison with the SR-IOV approach, the following changes were made:

  External Port <==> DPDK Port0 (L2fwd) DPDK net_vhost <--> QEMU (virtio-pci)
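This reduced path can be reproduced with testpmd acting as the forwarder between the physical port and a vhost-user vdev; a sketch assuming DPDK 19.11 option syntax, with the core list, socket path, and queue count as placeholders:

    # io-forward between the physical port and a vhost-user backend
    # (QEMU points its virtio-net chardev at /tmp/vhost0)
    testpmd -l 1-3 -n 4 --socket-mem 1024,0 \
        -w 0000:3b:00.1 \
        --vdev 'net_vhost0,iface=/tmp/vhost0,queues=1' \
        -- -i --forward-mode=io
    # then at the testpmd prompt: start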

The numbers achieved with iperf3 (bidirectional) were around 10 Gbps.
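For reference, a comparable bidirectional TCP run (the guest IP is a placeholder; --bidir needs iperf3 3.7 or newer, otherwise two opposite-direction client sessions do the same job):

    # inside the guest
    iperf3 -s
    # on the peer host
    iperf3 -c <guest-ip> -P 4 --bidir -t 30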

Note: it was requested to run trex or pktgen to try for Mpps rates; at least 8 Mpps is expected with the current setup.

Hence this is not a DPDK, virtio-client, qemu-kvm, or SRIOV issue, but rather a configuration or platform-setup problem.