How to set up error reporting in Stackdriver from kubernetes pods?

Question

I'm a bit confused about how to set up error reporting in kubernetes so that errors are visible in Google Cloud Console / Stackdriver "Error Reporting".

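According to the documentation at https://cloud.google.com/error-reporting/docs/setting-up-on-compute-engine we need to enable fluentd's "forward input plugin" and then send exception data from our apps. I think this approach would have worked if we had set up fluentd ourselves, but it is already pre-installed on every node, in a pod that just runs the gcr.io/google_containers/fluentd-gcp docker image.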

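How do we enable forward input on those pods and make sure that the http port is available to every pod on the nodes? We also need to make sure this config is used by default when we add more nodes to our cluster.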

Any help would be appreciated. Maybe I'm looking at all this from the wrong angle?

Answer 1

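The basic idea is to start a separate pod that receives structured logs over TCP and forwards them to Cloud Logging, similar to a locally running fluentd agent. See below for the steps I used.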

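(Unfortunately, the logging support that is built into Docker and Kubernetes cannot be used: it just forwards individual lines of text from stdout/stderr as separate log entries, which prevents Error Reporting from seeing complete stack traces.)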

Create a docker image for a fluentd forwarder using a Dockerfile as follows:

FROM gcr.io/google_containers/fluentd-gcp:1.18

COPY fluentd-forwarder.conf /etc/google-fluentd/google-fluentd.conf

Where fluentd-forwarder.conf contains the following:

<source>
  type forward
  port 24224
</source>

<match **>
  type google_cloud
  buffer_chunk_limit 2M
  buffer_queue_limit 24
  flush_interval 5s
  max_retry_wait 30
  disable_retry_limit
</match>

Then build and push the image:

$ docker build -t gcr.io/###your project id###/fluentd-forwarder:v1 .
$ gcloud docker push gcr.io/###your project id###/fluentd-forwarder:v1

You need a replication controller (fluentd-forwarder-controller.yaml):

apiVersion: v1
kind: ReplicationController
metadata:
  name: fluentd-forwarder
spec:
  replicas: 1
  template:
    metadata:
      name: fluentd-forwarder
      labels:
        app: fluentd-forwarder
    spec:
      containers:
      - name: fluentd-forwarder
        image: gcr.io/###your project id###/fluentd-forwarder:v1
        env:
        - name: FLUENTD_ARGS
          value: -qq
        ports:
        - containerPort: 24224

You also need a service (fluentd-forwarder-service.yaml):

apiVersion: v1
kind: Service
metadata:
  name: fluentd-forwarder
spec:
  selector:
    app: fluentd-forwarder
  ports:
  - protocol: TCP
    port: 24224

Then create the replication controller and the service:

$ kubectl create -f fluentd-forwarder-controller.yaml
$ kubectl create -f fluentd-forwarder-service.yaml

Finally, in your application, instead of using 'localhost' and 24224 to connect to the fluentd agent as described at https://cloud.google.com/error-reporting/docs/setting-up-on-compute-engine, use the values of the environment variables FLUENTD_FORWARDER_SERVICE_HOST and FLUENTD_FORWARDER_SERVICE_PORT.
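
As a rough illustration of that last step, here is a minimal Python sketch of an application reading the two service variables and packing an exception into a single structured record. The function names, the `myapp` service name, the `errors` label, the localhost fallback, and the `message` / `serviceContext` payload shape are assumptions for illustration (check the payload against the Error Reporting documentation linked above). The actual send relies on the third-party fluent-logger package, imported lazily so the rest of the code runs without it.

```python
import os
import traceback


def forwarder_address(env=os.environ):
    """Return (host, port) of the fluentd-forwarder service.

    Kubernetes injects <SERVICE>_SERVICE_HOST / _PORT variables into pods
    started after the service exists. The fallbacks below are assumptions
    for running outside the cluster.
    """
    host = env.get("FLUENTD_FORWARDER_SERVICE_HOST", "localhost")
    port = int(env.get("FLUENTD_FORWARDER_SERVICE_PORT", "24224"))
    return host, port


def build_error_payload(exc, service="myapp"):
    """Pack an exception and its full stack trace into one structured record."""
    trace = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    # Assumed payload shape: the whole trace lives in a single "message" field.
    return {"message": trace, "serviceContext": {"service": service}}


def report_error(exc, service="myapp"):
    """Send one record to the forwarder (needs `pip install fluent-logger`)."""
    from fluent import sender  # third-party, imported lazily

    host, port = forwarder_address()
    logger = sender.FluentSender(service, host=host, port=port)
    try:
        # Emitted with tag "<service>.errors"; returns False if sending failed.
        return logger.emit("errors", build_error_payload(exc, service))
    finally:
        logger.close()
```

In an `except` block, calling `report_error(exc)` would then send the whole stack trace as one record rather than one log entry per line.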

Answer 2

Adding to Boris' answer: as long as errors are logged in the right format (see https://cloud.google.com/error-reporting/docs/troubleshooting) and Cloud Logging is enabled (you can see the errors in https://console.cloud.google.com/logs/viewer), the errors will make it to Error Reporting without any further setup.
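
To make "the right format" a little more concrete, here is a hedged sketch of a Python logging formatter that writes each record, stack trace included, as one JSON line. It assumes the cluster's logging agent parses single-line JSON from stdout into structured entries, which depends on the agent version and configuration. The field names (`severity`, `message`, `serviceContext`), the class name, and the logger name are illustrative assumptions, so verify them against the troubleshooting page above.

```python
import json
import logging
import sys


class JsonErrorFormatter(logging.Formatter):
    """Render each record as one JSON line, stack trace included."""

    def __init__(self, service="myapp"):
        super().__init__()
        self.service = service

    def format(self, record):
        message = record.getMessage()
        if record.exc_info:
            # Keep the whole trace inside one field so it stays one log entry.
            message += "\n" + self.formatException(record.exc_info)
        # json.dumps escapes the embedded newlines, so the output is one line.
        return json.dumps({
            "severity": record.levelname,
            "message": message,
            "serviceContext": {"service": self.service},
        })


def make_logger(stream=sys.stdout, service="myapp"):
    """Build a logger whose only handler writes JSON lines to `stream`."""
    logger = logging.getLogger("error-reporting-sketch")
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonErrorFormatter(service))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

Calling `make_logger().exception("request failed")` inside an `except` block would print a single line whose `message` field carries the full traceback.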

Answer 3

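Boris' answer is great, but it is a lot more complicated than it needs to be (there is no need to build a docker image). If you have kubectl configured on your local machine (or you can use Google Cloud Shell), copy and paste the following and it will install the forwarder in your cluster (I updated the version of fluentd-gcp from the answer above). My solution uses a ConfigMap to store the file, so it can be changed easily without rebuilding.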

cat << EOF | kubectl create -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-forwarder
data:
  google-fluentd.conf: |+
    <source>
      type forward
      port 24224
    </source>

    <match **>
      type google_cloud
      buffer_chunk_limit 2M
      buffer_queue_limit 24
      flush_interval 5s
      max_retry_wait 30
      disable_retry_limit
    </match>

---
apiVersion: v1
kind: ReplicationController
metadata:
  name: fluentd-forwarder
spec:
  replicas: 1
  template:
    metadata:
      name: fluentd-forwarder
      labels:
        app: fluentd-forwarder
    spec:
      containers:
      - name: fluentd-forwarder
        image: gcr.io/google_containers/fluentd-gcp:2.0.18
        env:
        - name: FLUENTD_ARGS
          value: -qq
        ports:
        - containerPort: 24224
        volumeMounts:
        - name: config-vol
          mountPath: /etc/google-fluentd
      volumes:
        - name: config-vol
          configMap:
            name: fluentd-forwarder
---
apiVersion: v1
kind: Service
metadata:
  name: fluentd-forwarder
spec:
  selector:
    app: fluentd-forwarder
  ports:
  - protocol: TCP
    port: 24224
EOF
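
Once the forwarder is installed, it can help to confirm that an application pod is able to reach it before wiring up real error reporting. The following is a minimal sketch using only the Python standard library. The `can_connect` helper is a hypothetical name, and it only checks that the TCP port accepts connections, not that fluentd accepts or forwards records.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

From inside a pod, the host and port would come from the FLUENTD_FORWARDER_SERVICE_HOST and FLUENTD_FORWARDER_SERVICE_PORT environment variables mentioned in Answer 1, with the port converted to an integer.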

google-compute-engine

fluentd

gcloud

kubernetes

stackdriver