当 运行 Gitlab CI Kubernetes 上的运行器时 pods 的待处理状态
Pending status of pods when running Gitlab CI runner on Kubernetes
我目前正在尝试为 Gitlab 使用 Kubernetes 集群 CI。
在遵循不太好的文档 (https://docs.gitlab.com/runner/install/kubernetes.html) 时,我所做的是使用 Gitlab CI 部分中的令牌手动注册一个运行器,这样我就可以获得另一个令牌并在我用于的 ConfigMap 中使用它部署。
-ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
name: gitlab-runner
namespace: gitlab
data:
config.toml: |
concurrent = 4
[[runners]]
name = "Kubernetes Runner"
url = "https://url/ci"
token = "TOKEN"
executor = "kubernetes"
[runners.kubernetes]
namespace = "gitlab"
-部署
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: gitlab-runner
namespace: gitlab
spec:
replicas: 4
selector:
matchLabels:
name: gitlab-runner
template:
metadata:
labels:
name: gitlab-runner
spec:
containers:
- args:
- run
image: gitlab/gitlab-runner:latest
imagePullPolicy: Always
name: gitlab-runner
volumeMounts:
- mountPath: /etc/gitlab-runner
name: config
restartPolicy: Always
volumes:
- configMap:
name: gitlab-runner
name: config
有了这两个,我就可以在 Gitlab Runner 部分看到 runner,但是每当我开始工作时,新创建的 pods 都处于待定状态。
我想修复它,但我只知道节点和 pods 收到这些事件:
-Pods:
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
35s 4s 7 {default-scheduler } Warning FailedScheduling No nodes are available that match all of the following predicates:: MatchNodeSelector (2).
-节点:
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
4d 31s 6887 {kubelet gitlab-ci-hc6k3ffax54o-master-0} Warning FailedNodeAllocatableEnforcement Failed to update Node Allocatable Limits "": failed to set supported cgroup subsystems for cgroup : Failed to set config for supported subsystems : failed to write 3783761920 to memory.limit_in_bytes: write /rootfs/sys/fs/cgroup/memory/memory.limit_in_bytes: invalid argument
知道为什么会这样吗?
编辑:kubectl describe 添加:
Name: runner-45384765-project-1570-concurrent-00mb7r
Namespace: gitlab
Node: /
Labels: <none>
Status: Pending
IP:
Controllers: <none>
Containers:
build:
Image: blablabla:latest
Port:
Command:
sh
-c
if [ -x /usr/local/bin/bash ]; then
exec /usr/local/bin/bash
elif [ -x /usr/bin/bash ]; then
exec /usr/bin/bash
elif [ -x /bin/bash ]; then
exec /bin/bash
elif [ -x /usr/local/bin/sh ]; then
exec /usr/local/bin/sh
elif [ -x /usr/bin/sh ]; then
exec /usr/bin/sh
elif [ -x /bin/sh ]; then
exec /bin/sh
else
echo shell not found
exit 1
fi
Volume Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-1qm5n (ro)
/vcs from repo (rw)
Environment Variables:
CI_PROJECT_DIR: blablabla
CI_SERVER: yes
CI_SERVER_TLS_CA_FILE: -----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
CI: true
GITLAB_CI: true
CI_SERVER_NAME: GitLab
CI_SERVER_VERSION: 9.5.5-ee
CI_SERVER_REVISION: cfe2d5c
CI_JOB_ID: 5625
CI_JOB_NAME: pylint
CI_JOB_STAGE: build
CI_COMMIT_SHA: ece31293f8eeb3a36a8585b79d4d21e0ebe8008f
CI_COMMIT_REF_NAME: master
CI_COMMIT_REF_SLUG: master
CI_REGISTRY_USER: gitlab-ci-token
CI_BUILD_ID: 5625
CI_BUILD_REF: ece31293f8eeb3a36a8585b79d4d21e0ebe8008f
CI_BUILD_BEFORE_SHA: ece31293f8eeb3a36a8585b79d4d21e0ebe8008f
CI_BUILD_REF_NAME: master
CI_BUILD_REF_SLUG: master
CI_BUILD_NAME: pylint
CI_BUILD_STAGE: build
CI_PROJECT_ID: 1570
CI_PROJECT_NAME: blablabla
CI_PROJECT_PATH: blablabla
CI_PROJECT_PATH_SLUG: blablabla
CI_PROJECT_NAMESPACE: vcs
CI_PROJECT_URL: https://blablabla
CI_PIPELINE_ID: 2574
CI_CONFIG_PATH: .gitlab-ci.yml
CI_PIPELINE_SOURCE: push
CI_RUNNER_ID: 111
CI_RUNNER_DESCRIPTION: testing on kubernetes
CI_RUNNER_TAGS: docker-image-build
CI_REGISTRY: blablabla
CI_REGISTRY_IMAGE: blablabla
PYLINTHOME: ./pylint-home
GITLAB_USER_ID: 2277
GITLAB_USER_EMAIL: blablabla
helper:
Image: gitlab/gitlab-runner-helper:x86_64-a9a76a50
Port:
Command:
sh
-c
if [ -x /usr/local/bin/bash ]; then
exec /usr/local/bin/bash
elif [ -x /usr/bin/bash ]; then
exec /usr/bin/bash
elif [ -x /bin/bash ]; then
exec /bin/bash
elif [ -x /usr/local/bin/sh ]; then
exec /usr/local/bin/sh
elif [ -x /usr/bin/sh ]; then
exec /usr/bin/sh
elif [ -x /bin/sh ]; then
exec /bin/sh
else
echo shell not found
exit 1
fi
Volume Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-1qm5n (ro)
/vcs from repo (rw)
Environment Variables:
CI_PROJECT_DIR: blablabla
CI_SERVER: yes
CI_SERVER_TLS_CA_FILE: -----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
CI: true
GITLAB_CI: true
CI_SERVER_NAME: GitLab
CI_SERVER_VERSION: 9.5.5-ee
CI_SERVER_REVISION: cfe2d5c
CI_JOB_ID: 5625
CI_JOB_NAME: pylint
CI_JOB_STAGE: build
CI_COMMIT_SHA: ece31293f8eeb3a36a8585b79d4d21e0ebe8008f
CI_COMMIT_REF_NAME: master
CI_COMMIT_REF_SLUG: master
CI_REGISTRY_USER: gitlab-ci-token
CI_BUILD_ID: 5625
CI_BUILD_REF: ece31293f8eeb3a36a8585b79d4d21e0ebe8008f
CI_BUILD_BEFORE_SHA: ece31293f8eeb3a36a8585b79d4d21e0ebe8008f
CI_BUILD_REF_NAME: master
CI_BUILD_REF_SLUG: master
CI_BUILD_NAME: pylint
CI_BUILD_STAGE: build
CI_PROJECT_ID: 1570
CI_PROJECT_NAME: blablabla
CI_PROJECT_PATH: blablabla
CI_PROJECT_PATH_SLUG: blablabla
CI_PROJECT_NAMESPACE: vcs
CI_PROJECT_URL: blablabla
CI_PIPELINE_ID: 2574
CI_CONFIG_PATH: .gitlab-ci.yml
CI_PIPELINE_SOURCE: push
CI_RUNNER_ID: 111
CI_RUNNER_DESCRIPTION: testing on kubernetes
CI_RUNNER_TAGS: docker-image-build
CI_REGISTRY: blablabla
CI_REGISTRY_IMAGE: blablabla
PYLINTHOME: ./pylint-home
GITLAB_USER_ID: 2277
GITLAB_USER_EMAIL: blablabla
Conditions:
Type Status
PodScheduled False
Volumes:
repo:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-1qm5n:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-1qm5n
QoS Class: BestEffort
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
39s 8s 7 {default-scheduler } Warning FailedScheduling No nodes are available that match all of the following predicates:: MatchNodeSelector (2).
因为:
selector:
matchLabels:
name: gitlab-runner
没有 pod 能够检索具有该标签的作业。
去掉选择器就够了,没必要
@djuarez 只要部署选择器与模板部分中的 pods 标签相匹配,在这种情况下,我所看到的就是这种情况:
selector:
matchLabels:
name: gitlab-runner
template:
metadata:
labels:
name: gitlab-runner
应该没有问题;如果使用了正确的 API,在本例中 apiVersion: extensions/v1beta1
也是正确的。 describe
输出显示 MatchNodeSelector
,这与部署选择器无关。我的猜测是这里没有显示完整的部署配置,还有其他错误,例如尝试通过 nodeSeletor
将 pods 部署到特定节点,这些节点在 nodeSelector 条件中没有请求的标签。
我目前正在尝试为 Gitlab 使用 Kubernetes 集群 CI。 在遵循不太好的文档 (https://docs.gitlab.com/runner/install/kubernetes.html) 时,我所做的是使用 Gitlab CI 部分中的令牌手动注册一个运行器,这样我就可以获得另一个令牌并在我用于的 ConfigMap 中使用它部署。
-ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
name: gitlab-runner
namespace: gitlab
data:
config.toml: |
concurrent = 4
[[runners]]
name = "Kubernetes Runner"
url = "https://url/ci"
token = "TOKEN"
executor = "kubernetes"
[runners.kubernetes]
namespace = "gitlab"
-部署
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: gitlab-runner
namespace: gitlab
spec:
replicas: 4
selector:
matchLabels:
name: gitlab-runner
template:
metadata:
labels:
name: gitlab-runner
spec:
containers:
- args:
- run
image: gitlab/gitlab-runner:latest
imagePullPolicy: Always
name: gitlab-runner
volumeMounts:
- mountPath: /etc/gitlab-runner
name: config
restartPolicy: Always
volumes:
- configMap:
name: gitlab-runner
name: config
有了这两个,我就可以在 Gitlab Runner 部分看到 runner,但是每当我开始工作时,新创建的 pods 都处于待定状态。
我想修复它,但我只知道节点和 pods 收到这些事件:
-Pods:
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
35s 4s 7 {default-scheduler } Warning FailedScheduling No nodes are available that match all of the following predicates:: MatchNodeSelector (2).
-节点:
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
4d 31s 6887 {kubelet gitlab-ci-hc6k3ffax54o-master-0} Warning FailedNodeAllocatableEnforcement Failed to update Node Allocatable Limits "": failed to set supported cgroup subsystems for cgroup : Failed to set config for supported subsystems : failed to write 3783761920 to memory.limit_in_bytes: write /rootfs/sys/fs/cgroup/memory/memory.limit_in_bytes: invalid argument
知道为什么会这样吗?
编辑:kubectl describe 添加:
Name: runner-45384765-project-1570-concurrent-00mb7r
Namespace: gitlab
Node: /
Labels: <none>
Status: Pending
IP:
Controllers: <none>
Containers:
build:
Image: blablabla:latest
Port:
Command:
sh
-c
if [ -x /usr/local/bin/bash ]; then
exec /usr/local/bin/bash
elif [ -x /usr/bin/bash ]; then
exec /usr/bin/bash
elif [ -x /bin/bash ]; then
exec /bin/bash
elif [ -x /usr/local/bin/sh ]; then
exec /usr/local/bin/sh
elif [ -x /usr/bin/sh ]; then
exec /usr/bin/sh
elif [ -x /bin/sh ]; then
exec /bin/sh
else
echo shell not found
exit 1
fi
Volume Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-1qm5n (ro)
/vcs from repo (rw)
Environment Variables:
CI_PROJECT_DIR: blablabla
CI_SERVER: yes
CI_SERVER_TLS_CA_FILE: -----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
CI: true
GITLAB_CI: true
CI_SERVER_NAME: GitLab
CI_SERVER_VERSION: 9.5.5-ee
CI_SERVER_REVISION: cfe2d5c
CI_JOB_ID: 5625
CI_JOB_NAME: pylint
CI_JOB_STAGE: build
CI_COMMIT_SHA: ece31293f8eeb3a36a8585b79d4d21e0ebe8008f
CI_COMMIT_REF_NAME: master
CI_COMMIT_REF_SLUG: master
CI_REGISTRY_USER: gitlab-ci-token
CI_BUILD_ID: 5625
CI_BUILD_REF: ece31293f8eeb3a36a8585b79d4d21e0ebe8008f
CI_BUILD_BEFORE_SHA: ece31293f8eeb3a36a8585b79d4d21e0ebe8008f
CI_BUILD_REF_NAME: master
CI_BUILD_REF_SLUG: master
CI_BUILD_NAME: pylint
CI_BUILD_STAGE: build
CI_PROJECT_ID: 1570
CI_PROJECT_NAME: blablabla
CI_PROJECT_PATH: blablabla
CI_PROJECT_PATH_SLUG: blablabla
CI_PROJECT_NAMESPACE: vcs
CI_PROJECT_URL: https://blablabla
CI_PIPELINE_ID: 2574
CI_CONFIG_PATH: .gitlab-ci.yml
CI_PIPELINE_SOURCE: push
CI_RUNNER_ID: 111
CI_RUNNER_DESCRIPTION: testing on kubernetes
CI_RUNNER_TAGS: docker-image-build
CI_REGISTRY: blablabla
CI_REGISTRY_IMAGE: blablabla
PYLINTHOME: ./pylint-home
GITLAB_USER_ID: 2277
GITLAB_USER_EMAIL: blablabla
helper:
Image: gitlab/gitlab-runner-helper:x86_64-a9a76a50
Port:
Command:
sh
-c
if [ -x /usr/local/bin/bash ]; then
exec /usr/local/bin/bash
elif [ -x /usr/bin/bash ]; then
exec /usr/bin/bash
elif [ -x /bin/bash ]; then
exec /bin/bash
elif [ -x /usr/local/bin/sh ]; then
exec /usr/local/bin/sh
elif [ -x /usr/bin/sh ]; then
exec /usr/bin/sh
elif [ -x /bin/sh ]; then
exec /bin/sh
else
echo shell not found
exit 1
fi
Volume Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-1qm5n (ro)
/vcs from repo (rw)
Environment Variables:
CI_PROJECT_DIR: blablabla
CI_SERVER: yes
CI_SERVER_TLS_CA_FILE: -----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
blablabla
-----END CERTIFICATE-----
CI: true
GITLAB_CI: true
CI_SERVER_NAME: GitLab
CI_SERVER_VERSION: 9.5.5-ee
CI_SERVER_REVISION: cfe2d5c
CI_JOB_ID: 5625
CI_JOB_NAME: pylint
CI_JOB_STAGE: build
CI_COMMIT_SHA: ece31293f8eeb3a36a8585b79d4d21e0ebe8008f
CI_COMMIT_REF_NAME: master
CI_COMMIT_REF_SLUG: master
CI_REGISTRY_USER: gitlab-ci-token
CI_BUILD_ID: 5625
CI_BUILD_REF: ece31293f8eeb3a36a8585b79d4d21e0ebe8008f
CI_BUILD_BEFORE_SHA: ece31293f8eeb3a36a8585b79d4d21e0ebe8008f
CI_BUILD_REF_NAME: master
CI_BUILD_REF_SLUG: master
CI_BUILD_NAME: pylint
CI_BUILD_STAGE: build
CI_PROJECT_ID: 1570
CI_PROJECT_NAME: blablabla
CI_PROJECT_PATH: blablabla
CI_PROJECT_PATH_SLUG: blablabla
CI_PROJECT_NAMESPACE: vcs
CI_PROJECT_URL: blablabla
CI_PIPELINE_ID: 2574
CI_CONFIG_PATH: .gitlab-ci.yml
CI_PIPELINE_SOURCE: push
CI_RUNNER_ID: 111
CI_RUNNER_DESCRIPTION: testing on kubernetes
CI_RUNNER_TAGS: docker-image-build
CI_REGISTRY: blablabla
CI_REGISTRY_IMAGE: blablabla
PYLINTHOME: ./pylint-home
GITLAB_USER_ID: 2277
GITLAB_USER_EMAIL: blablabla
Conditions:
Type Status
PodScheduled False
Volumes:
repo:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-1qm5n:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-1qm5n
QoS Class: BestEffort
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
39s 8s 7 {default-scheduler } Warning FailedScheduling No nodes are available that match all of the following predicates:: MatchNodeSelector (2).
因为:
selector:
matchLabels:
name: gitlab-runner
没有 pod 能够检索具有该标签的作业。
去掉选择器就够了,没必要
@djuarez 只要部署选择器与模板部分中的 pods 标签相匹配,在这种情况下,我所看到的就是这种情况:
selector:
matchLabels:
name: gitlab-runner
template:
metadata:
labels:
name: gitlab-runner
应该没有问题;如果使用了正确的 API,在本例中 apiVersion: extensions/v1beta1
也是正确的。 describe
输出显示 MatchNodeSelector
,这与部署选择器无关。我的猜测是这里没有显示完整的部署配置,还有其他错误,例如尝试通过 nodeSeletor
将 pods 部署到特定节点,这些节点在 nodeSelector 条件中没有请求的标签。