Configure Prometheus to collect custom metrics from a dockerized Node.js pod

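I have set up prom-client (an unofficial client library for Prometheus) to collect the custom metrics I need. Following this EKS setup guide, I deployed the Prometheus server from Helm. Now I am trying to edit the default ConfigMap so that it collects my application's metrics, but I get this error: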

parsing YAML file /etc/config/prometheus.yml: yaml: unmarshal errors:\n line 22: field cluster_ip not found in type kubernetes.plain\n line 25: cannot unmarshal !!str `default` into []string

Here is the prometheus.yaml ConfigMap file I put together based on the documentation:

apiVersion: v1
data:
  alerting_rules.yml: |
    {}
  alerts: |
    {}
  prometheus.yml: |
    global:
      evaluation_interval: 1m
      scrape_interval: 1m
      scrape_timeout: 10s
    rule_files:
    - /etc/config/recording_rules.yml
    - /etc/config/alerting_rules.yml
    - /etc/config/rules
    - /etc/config/alerts
    scrape_configs:
    ...DEFAULT CONFIGS...
    - job_name: my_metrics
      scrape_interval: 5m
      scrape_timeout: 10s
      honor_labels: true
      metrics_path: /api/metrics
      kubernetes_sd_configs:
        - role: service
          cluster_ip: 10.100.200.92
          namespaces:
            names:
              default
  recording_rules.yml: |
    {}
  rules: |
    {}
kind: ConfigMap
metadata:
  creationTimestamp: "2020-06-08T09:26:38Z"
  labels:
    app: prometheus
    chart: prometheus-11.3.0
    component: server
    heritage: Helm
    release: prometheus
  name: prometheus-server
  namespace: prometheus
  uid: 8fadb17a-f5c5-4f9d-a931-fa1f77684847

The clusterIP here is the IP assigned to the Service that exposes my deployment.

My deployment.yaml file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      name: myapp
  template:
    metadata:
      labels:
        name: myapp
    spec:
      containers:
        - image: IMAGE_URL:BUILD_NUMBER
          name: myapp
          resources:
              limits:
                cpu: "1000m"
                memory: "2400Mi"
              requests:
                cpu: "500m"
                memory: "2000Mi"
          imagePullPolicy: IfNotPresent
          ports:
              - containerPort: 5000
                name: myapp

My service.yaml file, which exposes the deployment:

apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    deploy: staging
    name: myapp
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 5000
      protocol: TCP

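Please let me know if there is a different or more efficient way to collect metrics from my application. Thanks.

This is what I use to enable Prometheus scraping inside the cluster.

In the scrape configuration, I have this snippet: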

      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - action: labeldrop
            regex: '(kubernetes_pod|app_kubernetes_io_instance|app_kubernetes_io_name|instance)'

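This is taken directly from the default values of the Prometheus Helm chart: https://github.com/helm/charts/blob/master/stable/prometheus/values.yaml#L1452

It tells Prometheus to scrape every pod that carries the annotation prometheus.io/scrape: "true". With two further annotations on the pod, you can configure the path and port to scrape: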

prometheus.io/path: "/metrics"
prometheus.io/port: "9090"
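
For the app in the question, which exposes container port 5000 and (judging by the metrics_path in the attempted job) serves metrics at /api/metrics, the values would presumably be as follows. This is only a sketch based on those two details; adjust it if your app listens elsewhere.

```yaml
# Pod annotations; the port and path are assumed from the question's
# containerPort (5000) and metrics_path (/api/metrics)
prometheus.io/scrape: "true"
prometheus.io/port: "5000"
prometheus.io/path: "/api/metrics"
```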

So you need to modify your deployment.yaml to specify these annotations on the pod template:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      name: myapp
  template:
    metadata:
      labels:
        name: myapp
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "<enter port of pod to scrape>"
        prometheus.io/path: "<enter path to scrape>"
    spec:
      containers:
        - image: IMAGE_URL:BUILD_NUMBER
...
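
As for the parse errors themselves: kubernetes_sd_configs has no cluster_ip field, and namespaces.names has to be a YAML list rather than a plain string. If you would rather keep a dedicated job than use annotations, something like the following should at least parse. It is a sketch, not tested against your cluster. I swapped role: service for role: endpoints so that each replica is scraped directly instead of going through the Service's load balancing. The keep rule assumes the Service is named myapp and that its selector actually matches your pods (the selector in the question includes deploy: staging, which the pod template shown does not set).

```yaml
- job_name: my_metrics
  scrape_interval: 5m
  scrape_timeout: 10s
  honor_labels: true
  metrics_path: /api/metrics
  kubernetes_sd_configs:
    # role: endpoints yields one target per pod behind each Service
    - role: endpoints
      namespaces:
        names:
          - default   # must be a list item, not a bare string
  relabel_configs:
    # Keep only targets that belong to the Service named myapp (assumed name)
    - source_labels: [__meta_kubernetes_service_name]
      action: keep
      regex: myapp
```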