查找用于计算 docker 图像大小的源代码

Find the source code for computing size of a docker image

我听说这个数字不等于图像中所有层的大小相加。而且它也不是它占用的磁盘大小space。

现在我想通过源代码检查逻辑(在这个 repo 中:https://github.com/docker/docker-ce),因为眼见为实!但是在代码中导航了很多时间后,我发现我无法找到真正的 imag-size-computing 代码。

那么哪个function/file是用来执行大小逻辑的docker呢?

您可以在

中获得尺码
$ docker image ls
REPOSITORY  TAG                 IMAGE ID            CREATED             SIZE
nginx       1.12-alpine         24ed1c575f81        2 years ago        15.5MB

这个功能代码在这里 https://github.com/docker/docker-ce/blob/5a0987be93654b685927c2e5c2d18ac01022d20c/components/cli/cli/command/image/list.go

并从此代码获取大小 https://github.com/docker/docker-ce/blob/524986b1d978e1613bdc7b0448ba2cd16b3988b6/components/cli/cli/command/formatter/image.go

最后你需要 https://github.com/docker/docker-ce/blob/531930f3294c31db414f17f80fa8650d4ae66644/components/engine/daemon/images/images.go

在深入挖掘之前,您可能会发现了解 Linux 如何实现覆盖文件系统很有用。我在 intro presentation's build section. The demo notes 的第一个练习中包含了一些内容,其中包括我 运行 的每个命令,它让您了解层是如何合并的,以及当您 add/modify/delete从一层。


这取决于实现,具体取决于您的主机 OS 和所使用的图形驱动程序。我以 Linux OS 和 Overlay2 为例,因为这是最常见的用例。

首先看图layer storage size:

// GetContainerLayerSize returns the real size & virtual size of the container.
func (i *ImageService) GetContainerLayerSize(containerID string) (int64, int64) {
    var (
        sizeRw, sizeRootfs int64
        err                error
    )

    // Safe to index by runtime.GOOS as Unix hosts don't support multiple
    // container operating systems.
    rwlayer, err := i.layerStores[runtime.GOOS].GetRWLayer(containerID)
    if err != nil {
        logrus.Errorf("Failed to compute size of container rootfs %v: %v", containerID, err)
        return sizeRw, sizeRootfs
    }
    defer i.layerStores[runtime.GOOS].ReleaseRWLayer(rwlayer)

    sizeRw, err = rwlayer.Size()
    if err != nil {
        logrus.Errorf("Driver %s couldn't return diff size of container %s: %s",
            i.layerStores[runtime.GOOS].DriverName(), containerID, err)
        // FIXME: GetSize should return an error. Not changing it now in case
        // there is a side-effect.
        sizeRw = -1
    }

    if parent := rwlayer.Parent(); parent != nil {
        sizeRootfs, err = parent.Size()
        if err != nil {
            sizeRootfs = -1
        } else if sizeRw != -1 {
            sizeRootfs += sizeRw
        }
    }
    return sizeRw, sizeRootfs
}

其中有一个对 layerStores 的调用,它本身就是 a mapping to layer.Store:

// ImageServiceConfig is the configuration used to create a new ImageService
type ImageServiceConfig struct {
    ContainerStore            containerStore
    DistributionMetadataStore metadata.Store
    EventsService             *daemonevents.Events
    ImageStore                image.Store
    LayerStores               map[string]layer.Store
    MaxConcurrentDownloads    int
    MaxConcurrentUploads      int
    MaxDownloadAttempts       int
    ReferenceStore            dockerreference.Store
    RegistryService           registry.Service
    TrustKey                  libtrust.PrivateKey
}

深入研究 GetRWLayerlayer.Store 实现,有 following definition:

func (ls *layerStore) GetRWLayer(id string) (RWLayer, error) {
    ls.locker.Lock(id)
    defer ls.locker.Unlock(id)

    ls.mountL.Lock()
    mount := ls.mounts[id]
    ls.mountL.Unlock()
    if mount == nil {
        return nil, ErrMountDoesNotExist
    }

    return mount.getReference(), nil
}

接下来要找到挂载引用的 Size 实现,this function 进入特定的图形驱动程序:

func (ml *mountedLayer) Size() (int64, error) {
    return ml.layerStore.driver.DiffSize(ml.mountID, ml.cacheParent())
}

查看 overlay2 图形驱动程序以找到 DiffSize function:

func (d *Driver) DiffSize(id, parent string) (size int64, err error) {
    if useNaiveDiff(d.home) || !d.isParent(id, parent) {
        return d.naiveDiff.DiffSize(id, parent)
    }
    return directory.Size(context.TODO(), d.getDiffPath(id))
}

那是调用 naiveDiff 其中 implements Size in the graphDriver package:

func (gdw *NaiveDiffDriver) DiffSize(id, parent string) (size int64, err error) {
    driver := gdw.ProtoDriver

    changes, err := gdw.Changes(id, parent)
    if err != nil {
        return
    }

    layerFs, err := driver.Get(id, "")
    if err != nil {
        return
    }
    defer driver.Put(id)

    return archive.ChangesSize(layerFs.Path(), changes), nil
}

跟随archive.ChangeSize我们可以看到这个implementation:

// ChangesSize calculates the size in bytes of the provided changes, based on newDir.
func ChangesSize(newDir string, changes []Change) int64 {
    var (
        size int64
        sf   = make(map[uint64]struct{})
    )
    for _, change := range changes {
        if change.Kind == ChangeModify || change.Kind == ChangeAdd {
            file := filepath.Join(newDir, change.Path)
            fileInfo, err := os.Lstat(file)
            if err != nil {
                logrus.Errorf("Can not stat %q: %s", file, err)
                continue
            }

            if fileInfo != nil && !fileInfo.IsDir() {
                if hasHardlinks(fileInfo) {
                    inode := getIno(fileInfo)
                    if _, ok := sf[inode]; !ok {
                        size += fileInfo.Size()
                        sf[inode] = struct{}{}
                    }
                } else {
                    size += fileInfo.Size()
                }
            }
        }
    }
    return size
}

此时我们使用 os.Lstat 到 return 一个结构,该结构在每个条目上包含 Size,这是对每个目录的添加或修改。请注意,这是代码采用的几种可能路径之一,但我相信它是这种情况下更常见的路径之一。