Hugepagesize is not increasing to 1G in VM

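I am running a CentOS VM on an ESXi server. I want to increase the huge page size to 1G.

I followed this link: http://dpdk-guide.gitlab.io/dpdk-guide/setup/hugepages.html

I ran this small script to check whether the 1 GB size is supported: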

[root@localhost ~]# if grep pdpe1gb /proc/cpuinfo >/dev/null 2>&1; then echo "1GB supported."; fi
1GB supported.
[root@localhost ~]# 
  1. I added default_hugepagesz=1GB hugepagesz=1G hugepages=4 to /etc/default/grub.
  2. I ran grub2-mkconfig -o /boot/grub2/grub.cfg
  3. I rebooted the VM.
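
One thing worth ruling out after steps like these is that the parameters never reached the running kernel. The following is a minimal sketch, not part of the original question: `check_cmdline` is a hypothetical helper name, and it only inspects the text of the kernel command line (by default `/proc/cmdline`).

```shell
#!/bin/sh
# check_cmdline is a hypothetical helper name, not part of any tool.
# It prints the value of each hugepage parameter found on a kernel command
# line, or "missing". With no argument it reads the running kernel's line.
check_cmdline() (
    set -f  # do not glob-expand words while splitting the command line
    cmdline=${1:-$(cat /proc/cmdline 2>/dev/null || true)}
    for param in default_hugepagesz hugepagesz hugepages; do
        found=no
        for word in $cmdline; do
            case "$word" in
                "$param"=*) found=yes; echo "$param: ${word#*=}" ;;
            esac
        done
        [ "$found" = yes ] || echo "$param: missing"
    done
)

check_cmdline
```

If a parameter is reported as missing, the regenerated grub.cfg is probably not the one being booted. On UEFI installs of CentOS 7, for example, the active file usually lives under /boot/efi/EFI/centos/ rather than /boot/grub2/.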

But I still see a huge page size of 2048 kB (2 MB).

[root@localhost ~]# cat /proc/meminfo | grep -i huge
AnonHugePages:      8192 kB
HugePages_Total:    1024
HugePages_Free:     1024
HugePages_Rsvd:        0
HugePages_Surp:        0
**Hugepagesize:       2048 kB**
[root@localhost ~]# 
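
Note that `/proc/meminfo` reports only the default huge page size. The kernel also exposes one directory per supported size under `/sys/kernel/mm/hugepages`, so a rough way to see whether a 1 GB pool exists at all is sketched below. `list_hugepage_pools` is a hypothetical helper name, and the optional argument exists only so the function can be pointed at another directory.

```shell
#!/bin/sh
# list_hugepage_pools is a hypothetical helper name.
# It prints one line per huge page size the kernel exposes in sysfs,
# with the total and free page counts for that pool.
list_hugepage_pools() {
    base=${1:-/sys/kernel/mm/hugepages}
    for dir in "$base"/hugepages-*kB; do
        [ -d "$dir" ] || continue      # glob matched nothing
        size=${dir##*/hugepages-}      # e.g. 2048kB or 1048576kB
        printf '%s total=%s free=%s\n' "$size" \
            "$(cat "$dir/nr_hugepages")" "$(cat "$dir/free_hugepages")"
    done
}

list_hugepage_pools
```

If no `1048576kB` line appears, the guest kernel did not set up a 1 GB pool at boot, which is consistent with the output above.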

VM details are below:

[root@localhost ~]# uname -a
Linux localhost.localdomain 3.10.0-514.10.2.el7.x86_64 #1 SMP Fri Mar 3 00:04:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost ~]#

[root@localhost ~]# cat /proc/cpuinfo  | grep -i flags
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq vmx ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt aes xsave avx hypervisor lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi ept vpid
flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc aperfmperf pni pclmulqdq vmx ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt aes xsave avx hypervisor lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi ept vpid
[root@localhost ~]# 

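The VM is assigned 8 GB of memory and 2 CPUs.

The CPU's 1 GB huge page support flag and guest OS support/enabling are not sufficient to make 1 GB huge pages work in a virtualized environment.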

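The idea of huge pages, both at the PMD level (2 MB, or 4 MB before PAE and x86_64) and at the PUD level (1 GB), is to create a mapping from an aligned virtual region of huge size onto a huge region of physical memory (as I understand it, that region should be aligned too). With the additional virtualization level of a hypervisor there are now three (or four) memory levels: the virtual memory of the application in the guest OS; the memory that the guest OS regards as physical (this is the memory managed by the virtualization solution: ESXi, Xen, KVM, ...); and the real physical memory. It is reasonable to assume that, to be useful, a huge page should have the same huge region size at all three levels (generating fewer TLB misses and using fewer page table structures to describe a lot of memory; grep "Need bigger than 4KB pages" in Dick Sites's "Datacenter Computers: modern challenges in CPU design", Google, Feb 2015).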
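
To make the "fewer page table structures" point concrete, here is a small arithmetic sketch. `pages_needed` is a hypothetical helper name; the 8 GB figure is the memory assigned to the VM in the question, and the three page sizes are the x86_64 sizes discussed above.

```shell
#!/bin/sh
# pages_needed is a hypothetical helper name.
# Given an amount of memory in kB, print how many pages of each x86_64
# page size (4 kB, 2 MB, 1 GB) are needed to map it, rounding up.
pages_needed() {
    mem_kb=$1
    for size_kb in 4 2048 1048576; do
        echo "${size_kb}kB pages: $(( (mem_kb + size_kb - 1) / size_kb ))"
    done
}

pages_needed $((8 * 1024 * 1024))   # 8 GB expressed in kB
```

For 8 GB this gives 2097152 pages of 4 kB, 4096 pages of 2 MB, or 8 pages of 1 GB, which is the kind of reduction in mapping entries that the huge page idea relies on.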

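So, to use huge pages of some level inside the guest OS, you should already have same-sized huge pages in physical memory (in your host OS) and in your virtualization solution. You cannot use huge pages effectively inside the guest when they are not available to your host OS and virtualization software. (Some, like qemu or bochs, may emulate them, but this will be anywhere from slow to very slow.) And when you want both 2 MB and 1 GB huge pages, your CPU, host OS, virtualization system and guest OS should all support them (and the host system should have enough aligned contiguous physical memory to allocate a 1 GB page; you probably cannot split such a page across several sockets on NUMA).

I don't know about ESXi, but here are some links: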

Procedure 8.2. Allocating 1 GB huge pages at boot time

  1. To allocate different sizes of huge pages at boot, use the following command, specifying the number of huge pages. This example allocates 4 1 GB huge pages and 1024 2 MB huge pages: 'default_hugepagesz=1G hugepagesz=1G hugepages=4 hugepagesz=2M hugepages=1024' Change this command line to specify a different number of huge pages to be allocated at boot.

Note The next two steps must also be completed the first time you allocate 1 GB huge pages at boot time.

  1. Mount the 2 MB and 1 GB huge pages on the host:

    # mkdir /dev/hugepages1G
    # mount -t hugetlbfs -o pagesize=1G none /dev/hugepages1G
    # mkdir /dev/hugepages2M
    # mount -t hugetlbfs -o pagesize=2M none /dev/hugepages2M

  2. Restart libvirtd to enable the use of 1 GB huge pages on guests:

    # service libvirtd restart

1 GB huge pages are now available for guests.

By increasing the page size, you reduce the page table and reduce the pressure on the TLB cache. ... vm.nr_hugepages = 256 ... Reboot the system (note: this is about physical reboot of host machine and host OS) ... Set up Libvirt to use Huge Pages KVM_HUGEPAGES=1 ... Setting up a guest to use Huge Pages

Lack of hypervisor support for large pages: Finally, hypervisor vendors can take a few production cycles before fully adopting large pages. For example, VMware’s ESX server currently has no support for 1GB large pages in the hypervisor, even though guests on x86-64 systems can use them.

We find that large pages are conflicted with lightweight memory management across a range of hypervisors (e.g., ESX, KVM) across architectures (e.g., ARM, x86-64) and container-based technologies.

VMware ESX Server 3.5 and VMware ESX Server 3i v3.5 introduce 2MB large page support to the virtualized environment. In earlier versions of ESX Server, guest operating system large pages were emulated using small pages. This meant that, even if the guest operating system was using large pages, it did not get the performance benefit of reducing TLB misses. The enhanced large page support in ESX Server 3.5 and ESX Server 3i v3.5 enables 32‐bit virtual machines in PAE mode and 64‐bit virtual machines to make use of large pages.

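Passing the host CPU through to the VM worked for me; it gave the VM the pdpe1gb CPU flag.

I use QEMU + libvirt, with 1G hugepagesz enabled on the host.

Maybe this is useful. Set the CPU feature in the XML that describes the VM as follows: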

  <cpu mode='custom' match='exact' check='partial'>
      <model fallback='allow'>Broadwell</model>
      <feature policy='force' name='pdpe1gb'/>
  </cpu>