How to implement logging with Fluentd/Elasticsearch/Kibana on Kubernetes 1.1?

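I'm trying to implement logging on my AWS/CoreOS/Kubernetes (1.1) setup, which I set up without kube-up. So far, I've installed Fluentd as a static pod on all nodes, along with the Fluentd-Elasticsearch addon replication controllers and services. However, it isn't working yet. Specifically, Kibana crashes like this: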

ELASTICSEARCH_URL=http://elasticsearch-logging.kube-system:9200
{"@timestamp":"2016-03-14T22:54:04.906Z","level":"error","message":"Service Unavailable","node_env":"production","error":{"message":"Service Unavailable","name":"Error","stack":"Error: Service Unavailable\n  at respond (/kibana-4.0.2-linux-x64/src/node_modules/elasticsearch/src/lib/transport.js:235:15)\n  at checkRespForFailure (/kibana-4.0.2-linux-x64/src/node_modules/elasticsearch/src/lib/transport.js:203:7)\n  at HttpConnector.<anonymous> (/kibana-4.0.2-linux-x64/src/node_modules/elasticsearch/src/lib/connectors/http.js:156:7)\n  at IncomingMessage.bound (/kibana-4.0.2-linux-x64/src/node_modules/elasticsearch/node_modules/lodash-node/modern/internals/baseBind.js:56:17)\n  at IncomingMessage.emit (events.js:117:20)\n  at _stream_readable.js:944:16\n  at process._tickCallback (node.js:442:13)\n"}}
{"@timestamp":"2016-03-14T22:54:04.908Z","level":"fatal","message":"Service Unavailable","node_env":"production","error":{"message":"Service Unavailable","name":"Error","stack":"Error: Service Unavailable\n  at respond (/kibana-4.0.2-linux-x64/src/node_modules/elasticsearch/src/lib/transport.js:235:15)\n  at checkRespForFailure (/kibana-4.0.2-linux-x64/src/node_modules/elasticsearch/src/lib/transport.js:203:7)\n  at HttpConnector.<anonymous> (/kibana-4.0.2-linux-x64/src/node_modules/elasticsearch/src/lib/connectors/http.js:156:7)\n  at IncomingMessage.bound (/kibana-4.0.2-linux-x64/src/node_modules/elasticsearch/node_modules/lodash-node/modern/internals/baseBind.js:56:17)\n  at IncomingMessage.emit (events.js:117:20)\n  at _stream_readable.js:944:16\n  at process._tickCallback (node.js:442:13)\n"}}

What should I do?

FWIW, Elasticsearch is reachable at http://elasticsearch-logging.kube-system:9200/, although it returns status 503. As far as I can tell, that might be the problem.

# curl http://elasticsearch-logging.kube-system:9200/
{
  "status" : 503,
  "name" : "Puppet Master",
  "cluster_name" : "kubernetes-logging",
  "version" : {
    "number" : "1.5.2",
    "build_hash" : "62ff9868b4c8a0c45860bebb259e21980778ab1c",
    "build_timestamp" : "2015-04-27T09:21:06Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}

fluentd-es.yaml

apiVersion: v1
kind: Pod
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
spec:
  containers:
  - name: fluentd-elasticsearch
    image: gcr.io/google_containers/fluentd-elasticsearch:1.13
    resources:
      limits:
        cpu: 100m
    args:
    - -q
    volumeMounts:
    - name: varlog
      mountPath: /var/log
    - name: varlibdockercontainers
      mountPath: /var/lib/docker/containers
      readOnly: true
  terminationGracePeriodSeconds: 30
  volumes:
  - name: varlog
    hostPath:
      path: /var/log
  - name: varlibdockercontainers
    hostPath:
      path: /var/lib/docker/containers

es-controller.yaml

apiVersion: v1
kind: ReplicationController
metadata:
  name: elasticsearch-logging-v1
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 2
  selector:
    k8s-app: elasticsearch-logging
    version: v1
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
      - image: gcr.io/google_containers/elasticsearch:1.7
        name: elasticsearch-logging
        resources:
          limits:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - name: es-persistent-storage
          mountPath: /data
      volumes:
      - name: es-persistent-storage
        emptyDir: {}

es-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "Elasticsearch"
spec:
  ports:
  - port: 9200
    protocol: TCP
    targetPort: db
  selector:
    k8s-app: elasticsearch-logging

kibana-controller.yaml

apiVersion: v1
kind: ReplicationController
metadata:
  name: kibana-logging-v1
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    version: v1
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 1
  selector:
    k8s-app: kibana-logging
    version: v1
  template:
    metadata:
      labels:
        k8s-app: kibana-logging
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
      - name: kibana-logging
        image: gcr.io/google_containers/kibana:1.3
        resources:
          limits:
            cpu: 100m
        env:
          - name: "ELASTICSEARCH_URL"
            value: "http://elasticsearch-logging.kube-system:9200"
        ports:
        - containerPort: 5601
          name: ui
          protocol: TCP

kibana-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "Kibana"
spec:
  ports:
  - port: 5601
    protocol: TCP
    targetPort: ui
  selector:
    k8s-app: kibana-logging

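The 503 from Elasticsearch is probably the problem. It should return 200 if everything is fine. Your first step should be to look at the Elasticsearch logs. You can do that with the kubectl logs POD command (add --namespace=kube-system, since your pods run in that namespace). Your es-controller and es-service YAMLs look correct.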

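One thing to note is that the fluentd-elasticsearch container you are using ships a configuration suited to systems that use syslog, whereas CoreOS uses systemd/journald. That configuration will probably still give you logs from Docker containers as long as you use the default json-file log driver, but it will not give you system logs. To get system logs, you have to use something else, such as https://github.com/reevoo/fluent-plugin-systemd or https://github.com/mheese/journalbeat.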
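For context on why the container logs still come through: with the json-file driver, Docker writes each log line as one JSON object with `log`, `stream` and `time` keys to a `<container-id>-json.log` file under /var/lib/docker/containers, which is the directory your fluentd pod mounts. The sketch below parses one such line; the sample line and the function name `parse_docker_json_line` are made up for illustration, and this is not how fluentd itself is implemented.

```python
import json


def parse_docker_json_line(line):
    """Split one json-file log line into (time, stream, message)."""
    record = json.loads(line)
    # Docker keeps the trailing newline of the original output in "log".
    return record["time"], record["stream"], record["log"].rstrip("\n")


# Hypothetical sample line in the json-file format.
sample = '{"log":"hello from the container\\n","stream":"stdout","time":"2016-03-14T22:54:04.906Z"}'

print(parse_docker_json_line(sample))
```

journald entries are not written to these files, which is why system logs need a separate input such as the plugins linked above.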