Explanation of YARN's DRF

I am reading the 4th edition of "Hadoop: The Definitive Guide" and came across this explanation of YARN's DRF (in Chapter 4, under Dominant Resource Fairness):

Imagine a cluster with a total of 100 CPUs and 10 TB of memory. Application A requests containers of (2 CPUs, 300 GB), and application B requests containers of (6 CPUs, 100 GB). A’s request is (2%, 3%) of the cluster, so memory is dominant since its proportion (3%) is larger than CPU’s (2%). B’s request is (6%, 1%), so CPU is dominant. Since B’s container requests are twice as big in the dominant resource (6% versus 3%), it will be allocated half as many containers under fair sharing.

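I cannot understand what "it will be allocated half as many containers under fair sharing" means. I assume "it" here is Application B, and that Application B is allocated half as many containers as Application A. Is that right? Why is Application B allocated fewer containers even though it requires more resources?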

Any suggestions or pointers to explanatory documents would be greatly appreciated. Thank you in advance.

The Dominant Resource Calculator is based on the concept of Dominant Resource Fairness (DRF).

To understand DRF, you can refer to the paper here: https://people.eecs.berkeley.edu/~alig/papers/drf.pdf

In that paper, see Section 4.1, which gives an example.

DRF tries to equalise the dominant shares (A's share of memory = B's share of CPU).

Explanation

Total Resources Available: 100 CPUs, 10000 GB of memory

Requirements of Application A: 2 CPUs, 300 GB of memory

Requirements of Application B: 6 CPUs, 100 GB of memory

A's dominant resource is memory (2% of the CPUs vs 3% of the memory)

B's dominant resource is CPU (6% of the CPUs vs 1% of the memory)
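As a rough illustration of how the dominant resource is picked, here is a minimal Python sketch. The names `CLUSTER`, `REQUESTS` and `dominant_resource` are made up for this example and are not part of YARN; the numbers come from the book's example.

```python
from fractions import Fraction

# Cluster capacity from the book's example (10 TB taken as 10000 GB).
CLUSTER = {"cpu": 100, "memory_gb": 10_000}

# Per-container requests of each application, also from the example.
REQUESTS = {
    "A": {"cpu": 2, "memory_gb": 300},
    "B": {"cpu": 6, "memory_gb": 100},
}


def dominant_resource(request, cluster):
    """Return (resource name, share) for the resource with the largest cluster share."""
    shares = {
        name: Fraction(amount, cluster[name]) for name, amount in request.items()
    }
    name = max(shares, key=shares.get)
    return name, shares[name]


for app, request in REQUESTS.items():
    name, share = dominant_resource(request, CLUSTER)
    print(f"{app}: dominant resource = {name}, share per container = {float(share):.0%}")
```

Exact fractions are used instead of floats so that comparisons such as 3/100 vs 2/100 are not affected by rounding.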

Let's assume "A" is allocated x containers and "B" is allocated y containers.

  1. Resource requirement of A:

    2x CPUs + 300x GB Memory (2 CPUs and 300 GB Memory for each container)
    
  2. Resource requirement of B:

    6y CPUs + 100y GB Memory (6 CPUs and 100 GB Memory for each container)
    
  3. The total requirement must fit within the cluster:

    2x + 6y <= 100 CPUs
    
    300x + 100y <= 10000 GB Memory
    
  4. DRF will try to equalise the dominant needs of A and B.

    A's dominant need: 300x / 10000 GB (300x out of 10000 GB of total memory)
    
    B's dominant need: 6y / 100 CPUs (6y out of 100 CPUs)
    
    DRF will try to equalise: (300x / 10000) = (6y / 100)
    
    Solving the above equation gives: x = 2y
    

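If you substitute x = 2y into the inequalities in step 3, you get 10y <= 100 for CPU and 700y <= 10000 for memory. CPU is the tighter constraint, so the largest allocation is y = 10 and x = 20.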
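The substitution can be checked with a few lines of Python. This is only a sketch of the arithmetic in steps 3 and 4; the constant and function names are invented for the example.

```python
from fractions import Fraction

TOTAL_CPU = 100
TOTAL_MEMORY_GB = 10_000

# Step 4: equal dominant shares, 300x / 10000 == 6y / 100, so x / y == 2.
X_PER_Y = Fraction(6, TOTAL_CPU) / Fraction(300, TOTAL_MEMORY_GB)


def solve_allocation():
    """Return the largest whole (x, y) with x = X_PER_Y * y that fits step 3."""
    cpu_per_y = 2 * X_PER_Y + 6         # CPUs consumed per unit of y (10)
    memory_per_y = 300 * X_PER_Y + 100  # GB consumed per unit of y (700)
    # The tighter of the two constraints limits y.
    y = min(TOTAL_CPU // cpu_per_y, TOTAL_MEMORY_GB // memory_per_y)
    x = int(X_PER_Y * y)
    return x, y


x, y = solve_allocation()
print(f"x = {x}, y = {y}")
print(f"CPU used: {2 * x + 6 * y}, memory used: {300 * x + 100 * y} GB")
```

Memory alone would allow y = 14, but CPU caps it at y = 10, which is why 3000 GB of memory stays unallocated.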

This means:

  • Application A is allocated 20 containers: (40 CPUs, 6000 GB of Memory)
  • Application B is allocated 10 containers: (60 CPUs, 1000 GB of Memory)

You can see that:

Total allocated CPU is: 40 + 60 <= 100 CPUs available

Total allocated Memory is: 6000 + 1000 <= 10000 GB of memory available

So, the solution above explains the meaning of the sentence:

Since B’s container requests are twice as big in the dominant resource (6% 
versus 3%), it will be allocated half as many containers under fair sharing.
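The same 20/10 split can also be reached one container at a time. The sketch below is a simplified loop in the spirit of the allocation procedure described in the DRF paper: repeatedly give one container to the application with the lowest dominant share, and stop when that application's next container no longer fits. It is not YARN's scheduler code, and the names `CAPACITY`, `DEMANDS` and `drf_allocate` are hypothetical.

```python
from fractions import Fraction

CAPACITY = {"cpu": 100, "memory_gb": 10_000}
DEMANDS = {
    "A": {"cpu": 2, "memory_gb": 300},
    "B": {"cpu": 6, "memory_gb": 100},
}


def drf_allocate(capacity, demands):
    """Hand out containers one by one to the app with the lowest dominant share."""
    used = {resource: 0 for resource in capacity}
    containers = {app: 0 for app in demands}

    def dominant_share(app):
        # Largest fraction of any cluster resource currently held by this app.
        return max(
            Fraction(containers[app] * demands[app][resource], capacity[resource])
            for resource in capacity
        )

    while True:
        # Ties go to the first app in dictionary order (an arbitrary choice).
        app = min(demands, key=dominant_share)
        need = demands[app]
        # Simplification: stop as soon as the chosen app's next container does not fit.
        if any(used[r] + need[r] > capacity[r] for r in capacity):
            break
        for r in capacity:
            used[r] += need[r]
        containers[app] += 1

    return containers, used


containers, used = drf_allocate(CAPACITY, DEMANDS)
print(containers, used)
```

Because each of B's containers raises its dominant share by 6% while each of A's raises its own by only 3%, the loop hands A roughly two containers for every one it hands B.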