Why does the ioctl command report "KVM doesn't support IOMMU"?

My Linux kernel was built with IOMMU support enabled, and the dmesg log confirms it:

[root@dhb5 vm]# dmesg | grep -e DMAR -e IOMMU
[    0.000000] ACPI: DMAR 000000007bfbb000 00368 (v01 HP     03010201 00000002 HPAG 00020000)
[    0.000000] DMAR: IOMMU enabled
[    0.512361] DMAR: Host address width 44
[    0.516675] DMAR: DRHD base: 0x00000093ff8000 flags: 0x0
[    0.522664] DMAR: IOMMU 0: reg_base_addr 93ff8000 ver 1:0 cap d2078c106f0466 ecap f020de
[    0.531762] DMAR: DRHD base: 0x00000097ff8000 flags: 0x0
[    0.537744] DMAR: IOMMU 1: reg_base_addr 97ff8000 ver 1:0 cap d2078c106f0466 ecap f020de
[    0.546837] DMAR: DRHD base: 0x0000009bff8000 flags: 0x0
[    0.552829] DMAR: IOMMU 2: reg_base_addr 9bff8000 ver 1:0 cap d2078c106f0466 ecap f020de
[    0.561922] DMAR: DRHD base: 0x0000009fff8000 flags: 0x0
[    0.567904] DMAR: IOMMU 3: reg_base_addr 9fff8000 ver 1:0 cap d2078c106f0466 ecap f020de
[    0.576994] DMAR: RMRR base: 0x00000079911000 end: 0x00000079913fff
[    0.584038] DMAR: RMRR base: 0x0000007990e000 end: 0x00000079910fff
[    0.591079] DMAR: ATSR flags: 0x0
[    0.594805] DMAR: ATSR flags: 0x0
[    0.598530] DMAR: ATSR flags: 0x0
[    0.602255] DMAR: ATSR flags: 0x0
[    0.605982] DMAR: RHSA base: 0x00000093ff8000 proximity domain: 0x0
[    0.613024] DMAR: RHSA base: 0x00000097ff8000 proximity domain: 0x1
[    0.620067] DMAR: RHSA base: 0x0000009bff8000 proximity domain: 0x2
[    0.627110] DMAR: RHSA base: 0x0000009fff8000 proximity domain: 0x3
[    0.634163] DMAR-IR: IOAPIC id 13 under DRHD base  0x9fff8000 IOMMU 3
[    0.641411] DMAR-IR: IOAPIC id 11 under DRHD base  0x9bff8000 IOMMU 2
[    0.648647] DMAR-IR: IOAPIC id 12 under DRHD base  0x9bff8000 IOMMU 2
[    0.655887] DMAR-IR: IOAPIC id 10 under DRHD base  0x97ff8000 IOMMU 1
[    0.663126] DMAR-IR: IOAPIC id 8 under DRHD base  0x93ff8000 IOMMU 0
[    0.670264] DMAR-IR: IOAPIC id 9 under DRHD base  0x93ff8000 IOMMU 0
[    0.677404] DMAR-IR: HPET id 0 under DRHD base 0x93ff8000
[    0.683483] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[    0.696950] DMAR-IR: Enabled IRQ remapping in x2apic mode
[    5.505856] DMAR: dmar3: Using Queued invalidation
[    5.511765] DMAR: dmar2: Using Queued invalidation
[    5.517667] DMAR: dmar1: Using Queued invalidation
[    5.523223] DMAR: dmar0: Using Queued invalidation
[    5.528914] DMAR: Setting RMRR:
[    5.532484] DMAR: Setting identity map for device 0000:20:1d.0 [0x7990e000 - 0x79910fff]
[    5.541659] DMAR: Setting identity map for device 0000:00:1d.0 [0x79911000 - 0x79913fff]
[    5.550776] DMAR: Prepare 0-16MiB unity mapping for LPC
[    5.556672] DMAR: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
[    5.564908] DMAR: PCI-DMA: Intel(R) Virtualization Technology for Directed I/O

But when I run the following test code:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <unistd.h>
#include <linux/kvm.h>

#define KVM_FILE "/dev/kvm"

int main(void)
{
        int dev;
        int ret;

        dev = open(KVM_FILE,O_RDWR|O_NDELAY);
        ret = ioctl(dev,KVM_CHECK_EXTENSION,KVM_CAP_IOMMU);
        if(ret != 0)
                printf("----KVM supports IOMMU (i.e. Intel VT-d or AMD IOMMU).----\n");
        else
                printf("----KVM doesn't support IOMMU (i.e. Intel VT-d or AMD IOMMU).----\n");

        return 0;
}

the output is:

----KVM doesn't support IOMMU (i.e. Intel VT-d or AMD IOMMU).----

I'm a bit confused: the dmesg log shows that the kernel supports the IOMMU, so why does the ioctl report that KVM does not support it?

Because you are not checking the ioctl return value correctly. man ioctl should help:

Usually, on success zero is returned. A few ioctl() requests use the return value as an output parameter and return a nonnegative value on success. On error, -1 is returned, and errno is set appropriately.

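As nos pointed out, in the specific case of the KVM_CHECK_EXTENSION ioctl, the return value is positive if the capability is supported, 0 if it is not supported, and -1 if an error occurred.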

So the code should be:

if(ret > 0)
    printf("----KVM supports IOMMU (i.e. Intel VT-d or AMD IOMMU).----\n");
else
    printf("----KVM doesn't support IOMMU (i.e. Intel VT-d or AMD IOMMU).----\n");

You will find more details in the kernel documentation (section 4.4): https://www.kernel.org/doc/Documentation/virtual/kvm/api.txt
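To make the three outcomes explicit (positive, 0, -1), the check could be split out roughly as follows. This is only a sketch: describe_cap_result and check_kvm_cap are hypothetical helper names, not part of any KVM API, and it also checks the result of open(), which the test program in the question does not do.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/* Hypothetical helper: turn a KVM_CHECK_EXTENSION result into a label.
 * Positive means supported, 0 means not supported, negative means error. */
const char *describe_cap_result(int ret)
{
        if (ret > 0)
                return "supported";
        if (ret == 0)
                return "not supported";
        return "error";
}

/* Hypothetical helper: query one capability on the given KVM device node.
 * Returns the ioctl result, or -1 if the device node cannot be opened. */
int check_kvm_cap(const char *path, long cap)
{
        int fd;
        int ret;

        fd = open(path, O_RDWR);
        if (fd < 0) {
                fprintf(stderr, "open %s: %s\n", path, strerror(errno));
                return -1;
        }

        ret = ioctl(fd, KVM_CHECK_EXTENSION, cap);
        if (ret < 0)
                fprintf(stderr, "KVM_CHECK_EXTENSION: %s\n", strerror(errno));

        close(fd);
        return ret;
}
```

From main() this could be called as describe_cap_result(check_kvm_cap("/dev/kvm", KVM_CAP_IOMMU)) and the returned label printed. Separating "error" from "not supported" matters because a failed open() or ioctl() says nothing about whether the capability exists.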

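I found the root cause: the kernel has to be built with the KVM_DEVICE_ASSIGNMENT option (CONFIG_KVM_DEVICE_ASSIGNMENT) enabled, because the kvm_vm_ioctl_check_extension function contains: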

#ifdef CONFIG_KVM_DEVICE_ASSIGNMENT
    case KVM_CAP_IOMMU:
        r = iommu_present(&pci_bus_type);
        break;
#endif

Without this option, the KVM_CAP_IOMMU case is not compiled in, so the function simply returns 0.
Please refer to this discussion.