kubectl status.phase=Running returns wrong results

When I run:

kubectl get pods --field-selector=status.phase=Running

I get:

NAME          READY   STATUS    RESTARTS   AGE
k8s-fbd7b     2/2     Running   0          5m5s
testm-45gfg   1/2     Error     0          22h

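I don't understand why this command returns a pod that is in the Error state. According to the Kubernetes API, there is no STATUS=Error.

How can I get only the pods that are in this Error state?

When I run: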

kubectl get pods --field-selector=status.phase=Failed

it tells me there are no pods in that phase.

You can simply use grep.

Grep the Error pods:
kubectl get pods --all-namespaces | grep Error

Delete all Error pods from the cluster:

kubectl delete pod `kubectl get pods --namespace <yournamespace> | awk '$3 == "Error" {print $1}'` --namespace <yournamespace>
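
A column-position filter like the one above depends on STATUS being the third column, which holds with --namespace but not with --all-namespaces, where a NAMESPACE column comes first. Here is a minimal sketch that finds the columns by header name instead. `pods_with_status` is a hypothetical helper name, not part of kubectl, and it assumes the default table output with its header row:

```shell
# Hypothetical helper: reads `kubectl get pods` table output (header included)
# on stdin and prints the NAME of every pod whose STATUS equals $1.
pods_with_status() {
  awk -v want="$1" '
    NR == 1 {
      # Locate the NAME and STATUS columns from the header row.
      for (i = 1; i <= NF; i++) {
        if ($i == "NAME")   name_col = i
        if ($i == "STATUS") status_col = i
      }
      next
    }
    name_col && status_col && $status_col == want { print $name_col }
  '
}
```

It could be used as `kubectl get pods --namespace <yournamespace> | pods_with_status Error`, which lets you review the list of names before passing them to `kubectl delete pod`.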

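Most Pod failures return an explicit error state that can be observed in the status field.

Error:

Your pod crashed: it was scheduled onto a node successfully but crashed afterwards. To debug it further, you can use various methods or commands, for example:

kubectl describe pod <pod-name> -n <namespace>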

https://kubernetes.io/docs/tasks/debug-application-cluster/debug-pod-replication-controller/#my-pod-is-crashing-or-otherwise-unhealthy

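The command kubectl get pods --field-selector=status.phase=Failed shows all Pods in the Failed phase.

Failed means that all containers in the Pod have terminated and at least one of them terminated in failure (see: Pod phase):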

Failed - All containers in the Pod have terminated, and at least one container has terminated in failure. That is, the container either exited with non-zero status or was terminated by the system.

In your example, both Pods are in the Running phase because at least one container in each Pod is still running:

Running - The Pod has been bound to a node, and all of the containers have been created. At least one container is still running, or is in the process of starting or restarting.
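
The two quoted definitions can be read as a small decision rule. The sketch below is only a simplified model of those two sentences, not how Kubernetes computes the phase. The function name `pod_phase` and the state labels `running`, `ok`, and `failed` are made up for illustration, and the other phases and the restart policy are ignored:

```shell
# Simplified model of the two quoted definitions only. Each argument is one
# container state: "running", "ok" (terminated, exit 0) or "failed"
# (terminated in failure). Anything not covered by the quotes prints "Other".
pod_phase() {
  failed=0
  for state in "$@"; do
    case "$state" in
      running) echo Running; return ;;   # one live container is enough
      failed)  failed=1 ;;
    esac
  done
  if [ "$failed" -eq 1 ]; then echo Failed; else echo Other; fi
}

pod_phase failed running   # prints "Running": one container is still up
pod_phase failed           # prints "Failed": every container has terminated
```

This is why a pod shown as 1/2 Error still matches status.phase=Running: the STATUS column reflects a container's state, while the phase describes the Pod as a whole.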

You can check the current phase of your Pods with the following command:

$ kubectl get pod -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}'

Let's see how this command works:

$ kubectl get pods
NAME    READY   STATUS   
app-1   1/2     Error   
app-2   0/1     Error   

$ kubectl get pod -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}'
app-1   Running
app-2   Failed

As you can see, only the app-2 Pod is in the Failed phase. The app-1 Pod still has one container running, so it is in the Running phase.
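
For a quick overview of how many Pods are in each phase, the name/phase lines printed by the jsonpath command can be tallied with awk. This is a sketch. `phase_summary` is a hypothetical helper name, and it assumes tab-separated input exactly as that command emits it:

```shell
# Hypothetical helper: reads "<name><TAB><phase>" lines on stdin and prints
# one "<phase> <count>" line per phase, sorted by phase name.
phase_summary() {
  awk -F '\t' '
    NF >= 2 { count[$2]++ }
    END     { for (p in count) print p, count[p] }
  ' | sort
}
```

Piping the jsonpath command from above into `phase_summary` would, for the two example Pods, report one Pod in Failed and one in Running.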

To list all pods with the Error status, you can simply use:

$ kubectl get pods -A | grep Error
default       app-1   1/2     Error     
default       app-2   0/1     Error

Additionally, it's worth mentioning that you can check the state of all containers in your Pods:

$ kubectl get pod -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].state}{"\n"}{end}'
app-1   {"terminated":{"containerID":"containerd://f208e2a1ff08c5ce2acf3a33da05603c1947107e398d2f5fbf6f35d8b273ac71","exitCode":2,"finishedAt":"2021-08-11T14:07:21Z","reason":"Error","startedAt":"2021-08-11T14:07:21Z"}} {"running":{"startedAt":"2021-08-11T14:07:21Z"}}
app-2   {"terminated":{"containerID":"containerd://7a66cbbf73985efaaf348ec2f7a14d8e5bf22f891bd655c4b64692005eb0439b","exitCode":2,"finishedAt":"2021-08-11T14:08:50Z","reason":"Error","startedAt":"2021-08-11T14:08:50Z"}}
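
The container states printed above can also be filtered directly, which catches Pods like app-1 whose phase is still Running. This is a minimal sketch. `pods_with_errored_container` is a hypothetical helper name, and it assumes the states are printed as compact JSON and separated from the Pod name by a tab, as in the output shown here (other kubectl versions may format the state differently):

```shell
# Hypothetical helper: reads "<name><TAB><container states>" lines on stdin
# and prints the names of pods where any container state has reason "Error".
pods_with_errored_container() {
  awk -F '\t' 'index($2, "\"reason\":\"Error\"") { print $1 }'
}
```

Fed with the output of the jsonpath command above, it would print both app-1 and app-2, since each has a container that terminated with reason Error.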

Here is an overkill attempt based on go-template:

kubectl  get pods -o go-template='{{range $index, $element := .items}}{{range .status.containerStatuses}}{{range .state }}{{if .reason }}{{if (eq  .reason "Error") }}{{$element.metadata.name}} {{$element.metadata.namespace}}{{"\n"}}{{end}}{{end}}{{end}}{{end}}{{end}}'
job1-stn45 default

My pod statuses:

k get pod
NAME                         READY   STATUS             RESTARTS   AGE
foo                          1/1     Running            1          2d11h
nginx-0                      1/1     Running            3          5d10h
nginx-2                      1/1     Running            3          5d10h
nginx-1                      1/1     Running            3          5d10h
job1-stn45                   0/1     Error              0          113m
update-test-27145740-82z7s   0/1     ImagePullBackOff   0          96m
update-test-27145500-7f2l9   0/1     ImagePullBackOff   0          5h36m