How are Docker image names parsed?
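When doing a docker push or pulling an image, how does Docker determine whether the image name contains a registry server, or whether it is a path/username on the default registry (e.g. Docker Hub)?
I see the following in the 1.1 image specification: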
Tag
A tag serves to map a descriptive, user-given name to any single image
ID. Tag values are limited to the set of characters [a-zA-Z_0-9].
Repository
A collection of tags grouped under a common prefix (the name component
before :). For example, in an image tagged with the name my-app:3.1.4,
my-app is the Repository component of the name. A repository name is
made up of slash-separated name components, optionally prefixed by a
DNS hostname. The hostname must follow comply with standard DNS rules,
but may not contain _ characters. If a hostname is present, it may
optionally be followed by a port number in the format :8080. Name
components may contain lowercase characters, digits, and separators. A
separator is defined as a period, one or two underscores, or one or
more dashes. A name component may not start or end with a separator.
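For the DNS hostname, does it need to be fully qualified with dots, or is "my-local-server" a valid registry hostname? For the name components, I see periods as valid, which implies "team.user/appserver" is a valid image name. If the registry server is running on port 80, so that no port number is needed on the hostname in the image name, there seems to be an ambiguity between the hostname and the path on the registry server. I'm curious how Docker resolves that ambiguity.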
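TL;DR: The text before the first / is treated as a hostname only if it contains a . (DNS separator) or a : (port separator), or is the value "localhost". Otherwise the code assumes you want the default registry, Docker Hub.
After digging through the code, I came across distribution/distribution/reference/reference.go, which contains the following: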
// Grammar
//
// reference := name [ ":" tag ] [ "@" digest ]
// name := [hostname '/'] component ['/' component]*
// hostname := hostcomponent ['.' hostcomponent]* [':' port-number]
// hostcomponent := /([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9-]*[a-zA-Z0-9])/
// port-number := /[0-9]+/
// component := alpha-numeric [separator alpha-numeric]*
// alpha-numeric := /[a-z0-9]+/
// separator := /[_.]|__|[-]*/
//
// tag := /[\w][\w.-]{0,127}/
//
// digest := digest-algorithm ":" digest-hex
// digest-algorithm := digest-algorithm-component [ digest-algorithm-separator digest-algorithm-component ]
// digest-algorithm-separator := /[+.-_]/
// digest-algorithm-component := /[A-Za-z][A-Za-z0-9]*/
// digest-hex := /[0-9a-fA-F]{32,}/ ; At least 128 bit digest value
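To make the component and tag rules in this grammar concrete, here is a minimal Python sketch that transcribes two of the productions into regular expressions. It is an illustration, not the library's code: the helper names is_valid_component and is_valid_tag are hypothetical, and the separator uses "-+" rather than the grammar's "[-]*" (an assumption made here because an empty separator causes heavy backtracking in Python's regex engine; "-+" also matches the spec's wording "one or more dashes").

```python
import re

# Productions transcribed from the grammar comment in reference.go.
ALPHA_NUMERIC = r"[a-z0-9]+"
# Assumption: "-+" instead of "[-]*", see the note above the block.
SEPARATOR = r"(?:[._]|__|-+)"
COMPONENT_RE = re.compile(rf"{ALPHA_NUMERIC}(?:{SEPARATOR}{ALPHA_NUMERIC})*")
# re.ASCII restricts \w to [a-zA-Z0-9_], which is what Go's \w means.
TAG_RE = re.compile(r"\w[\w.-]{0,127}", re.ASCII)


def is_valid_component(text: str) -> bool:
    """Return True if text is one valid name component (hypothetical helper)."""
    return COMPONENT_RE.fullmatch(text) is not None


def is_valid_tag(text: str) -> bool:
    """Return True if text is a valid tag of 1 to 128 characters (hypothetical helper)."""
    return TAG_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["my-app", "team.user", "a__b", "a___b", "-bad", "MyApp"]:
        print(candidate, is_valid_component(candidate))
    for candidate in ["3.1.4", "v0.8.0", ".hidden", "x" * 129]:
        print(candidate[:16], is_valid_tag(candidate))
```

Note that "team.user" passes as a name component, which is exactly why the grammar alone cannot tell a hostname from the first path component.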
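The actual implementation of that grammar is via a regex in distribution/distribution/reference/regexp.go.
But with some digging and poking, I found that there is another check beyond that regex (it comes into play, for example, when you don't include a . or a :). I tracked down the actual split of the name to the following in distribution/distribution/reference/normalize.go: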
// splitDockerDomain splits a repository name to domain and remotename string.
// If no valid domain is found, the default domain is used. Repository name
// needs to be already validated before.
func splitDockerDomain(name string) (domain, remainder string) {
	i := strings.IndexRune(name, '/')
	if i == -1 || (!strings.ContainsAny(name[:i], ".:") && name[:i] != "localhost") {
		domain, remainder = defaultDomain, name
	} else {
		domain, remainder = name[:i], name[i+1:]
	}
	if domain == legacyDefaultDomain {
		domain = defaultDomain
	}
	if domain == defaultDomain && !strings.ContainsRune(remainder, '/') {
		remainder = officialRepoName + "/" + remainder
	}
	return
}
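The important part of that for me is the check in the first if statement for a ., a :, or the hostname localhost before the first /. With it, the hostname is split out from before the first /; without it, the entire name is passed to the default registry hostname.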
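To see the effect on the names from the question, here is a rough Python port of splitDockerDomain. It is a sketch for experimentation, not the library's API. The three constants are not shown in the excerpt above; the values used here ("docker.io", "index.docker.io", "library") are what I recall from normalize.go and should be treated as assumptions.

```python
from typing import Tuple

# Assumed values of the Go constants referenced by splitDockerDomain.
DEFAULT_DOMAIN = "docker.io"
LEGACY_DEFAULT_DOMAIN = "index.docker.io"
OFFICIAL_REPO_NAME = "library"


def split_docker_domain(name: str) -> Tuple[str, str]:
    """Split a repository name (without tag) into (domain, remainder)."""
    slash = name.find("/")
    head = name[:slash] if slash != -1 else ""
    # The head is a registry host only if it contains "." or ":"
    # or is exactly "localhost"; otherwise the default registry is used.
    if slash == -1 or (not any(c in head for c in ".:") and head != "localhost"):
        domain, remainder = DEFAULT_DOMAIN, name
    else:
        domain, remainder = head, name[slash + 1:]
    if domain == LEGACY_DEFAULT_DOMAIN:
        domain = DEFAULT_DOMAIN
    # Single-component names on the default registry get the official prefix.
    if domain == DEFAULT_DOMAIN and "/" not in remainder:
        remainder = OFFICIAL_REPO_NAME + "/" + remainder
    return domain, remainder


if __name__ == "__main__":
    for name in ["busybox", "team.user/appserver", "my-local-server/appserver",
                 "localhost/appserver", "localhost:5000/helloworld"]:
        print(name, "->", split_docker_domain(name))
```

Under these assumptions, "team.user/appserver" is read as host "team.user" with repository "appserver", while "my-local-server/appserver" is read as a repository on the default registry. That is how the ambiguity in the question gets resolved.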
The image spec at https://github.com/moby/moby/blob/master/image/spec/v1.1.md has now been updated to say that tags are limited to 128 characters.
The PR thread is here: https://github.com/docker/distribution/issues/2248
Here is some Ruby code: https://github.com/cyber-dojo/runner/blob/e98bc280c5349cb2919acecb0dfbfefa1ac4e5c3/src/docker/image_name.rb
Note: many URL parsing libraries cannot parse docker image references/tags unless they are in standardized URL format.
Example Ansible snippet:
- debug: #(FAILS)
    msg: "{{ 'docker.io/alpine' | urlsplit() }}"
  # ^-- This will fail, because the image reference isn't in standard URL format
  #     If you can convert the docker image reference to standard URL format
  #     Then most URL parsing libraries will work correctly
- debug: #(WORKS)
    msg: "{{ ('https://' + 'docker.io/alpine') | urlsplit() }}"
  # ^-- Example: This becomes standard URL syntax, so it parses correctly
- debug: #(FAILS)
    msg: "{{ ('http://' + 'busybox:1.34.1-glibc') | urlsplit('path') }}"
  # ^-- Unfortunately, this trick won't work to turn 100% of images into
  #     Standard URL format for parsing. (This example fails as well)
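Based on BMitch's answer, I realized that simple if-statement logic can convert arbitrary docker image references/tags into standardized URL format, which allows most libraries to parse them.
The algorithm in human language: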
1. Look for / in $TAG
2. If / is not found
   Then return ("https://docker.io/" + $TAG)
3. If / is found, split $TAG into 2 parts at the first /
   and test whether the text left of / contains "." or ":", or is exactly "localhost"
4. If (the text left of the 1st / contains "." or ":", or is exactly "localhost")
   Then return ("https://" + $TAG)
5. If (the text left of the 1st / contains neither "." nor ":", and is not "localhost")
   Then return ("https://docker.io/" + $TAG)
(This logic converts docker tags into standardized URL format
so they can be processed by URL parsing libraries.)
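The steps above can also be sketched in Python, feeding the result to the standard library's urllib.parse.urlsplit. The function name docker_ref_to_url is hypothetical, and like the steps above it only prepends a scheme and default host; it does not add the "library/" prefix that Docker itself uses for official images.

```python
from urllib.parse import urlsplit


def docker_ref_to_url(ref: str) -> str:
    """Turn a docker image reference into a URL-shaped string (hypothetical helper)."""
    head, slash, _ = ref.partition("/")
    # A registry host is present only if there is a "/" and the text before it
    # contains "." or ":", or is exactly "localhost".
    has_host = bool(slash) and ("." in head or ":" in head or head == "localhost")
    return "https://" + ref if has_host else "https://docker.io/" + ref


if __name__ == "__main__":
    refs = [
        "docker.io/alpine",
        "busybox:1.34.1-glibc",
        "rancher/system-upgrade-controller:v0.8.0",
        "localhost:5000/helloworld:latest",
        "quay.io/go/go/gadget:arms",
    ]
    for ref in refs:
        parts = urlsplit(docker_ref_to_url(ref))
        print(f"{ref} -> host={parts.netloc} path={parts.path}")
```

Once the reference is in this shape, the registry host comes back in netloc and the repository (with its tag) in path, so the tag no longer gets mistaken for a port.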
The algorithm in Bash:
vi docker_tag_to_standardized_url_format.sh
(Copy and paste the following)
#!/bin/bash
# This standardizes the naming of docker images, for example:
#   busybox                    -> https://docker.io/busybox
#   myregistry.tld/myimage:tag -> https://myregistry.tld/myimage:tag
INPUT=$(cat -)
OUTPUT=""
if echo "$INPUT" | grep -q "/"; then
    # The text before the first / is a registry host only if it contains "." or ":"
    # or is exactly "localhost". In a regex . is a wildcard, so \. matches a literal dot.
    if echo "$INPUT" | cut -d "/" -f1 | grep -Eq '\.|:|^localhost$'; then
        OUTPUT="https://$INPUT"
    else
        OUTPUT="https://docker.io/$INPUT"
    fi
else
    OUTPUT="https://docker.io/$INPUT"
fi
echo "$OUTPUT"
Make it executable:
chmod +x ./docker_tag_to_standardized_url_format.sh
Usage example:
# Test data, to verify against edge cases
A=docker.io/alpine
B=docker.io/rancher/system-upgrade-controller:v0.8.0
C=busybox:1.34.1-glibc
D=busybox
E=rancher/system-upgrade-controller:v0.8.0
F=localhost:5000/helloworld:latest
G=quay.io/go/go/gadget:arms
####################################
echo $A | ./docker_tag_to_standardized_url_format.sh
echo $B | ./docker_tag_to_standardized_url_format.sh
echo $C | ./docker_tag_to_standardized_url_format.sh
echo $D | ./docker_tag_to_standardized_url_format.sh
echo $E | ./docker_tag_to_standardized_url_format.sh
echo $F | ./docker_tag_to_standardized_url_format.sh
echo $G | ./docker_tag_to_standardized_url_format.sh