Deployed Prometheus with Django and Kubernetes: how do I make it scrape the Django app?

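I have a Django project deployed in Kubernetes, and I am trying to deploy Prometheus as a monitoring tool. I have completed all the steps needed to include django_prometheus in the project, and locally I can go to localhost:9090 and play around with querying the metrics.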

I have also deployed Prometheus to my Kubernetes cluster, and by running kubectl port-forward ... on the Prometheus pod I can see some metrics for my Kubernetes resources.

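What confuses me is how to make the deployed Django app's metrics available on the Prometheus dashboard like the others. I deployed my app in the default namespace and Prometheus in a dedicated monitoring namespace. I am wondering what I am missing here. Do I need to expose ports 8000 to 8005 on the Service and Deployment, according to the number of workers or something like that?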

My Django app runs with supervisord and gunicorn, like so:

[program:gunicorn]
command=gunicorn --reload --timeout 200000 --workers=5 --limit-request-line 0 --limit-request-fields 32768 --limit-request-field_size 0 --chdir /code/ my_app.wsgi

This is my Django app's Service and Deployment:

apiVersion: v1
kind: Service
metadata:
  name: my_app
  namespace: default
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: my-app
  sessionAffinity: None
  type: ClusterIP

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: my-app
  name: my-app-deployment
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: my-app
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - image: ...
        imagePullPolicy: IfNotPresent
        name: my-app
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: regcred
      restartPolicy: Always
      schedulerName: default-scheduler
      terminationGracePeriodSeconds: 30

And this is the Prometheus ConfigMap:

apiVersion: v1
data:
  prometheus.rules: |-
    ... some rules
  prometheus.yml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    rule_files:
      - /etc/prometheus/prometheus.rules
    scrape_configs:
      - job_name: prometheus
        static_configs:
        - targets:
          - localhost:9090

      - job_name: my-app
        metrics_path: /metrics
        static_configs:
          - targets:
            - localhost:8000

      - job_name: 'node-exporter'
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
        - source_labels: [__meta_kubernetes_endpoints_name]
          regex: 'node-exporter'
          action: keep

kind: ConfigMap
metadata:
  labels:
    name: prometheus-config
  name: prometheus-config
  namespace: monitoring

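If Prometheus is installed on the same cluster as your app, you do not have to expose the service externally. Applications can reach each other across namespaces through Kubernetes DNS resolution, which follows this rule: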

SERVICENAME.NAMESPACE.svc.cluster.local

So one way is to change your Prometheus job target to something like this:

  - job_name: my-app
    metrics_path: /metrics
    static_configs:
      - targets:
          # <Service name>.<namespace>.svc.cluster.local:<Service port>
          - 'my_app.default.svc.cluster.local:80'
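
As a rough illustration of the naming rule, the target string can be assembled from the Service name, namespace, and Service port. This is only a sketch: the helper names are made up for this example, `cluster.local` is assumed to be the cluster domain (the usual default, but configurable), and the values passed in are the ones from the question's Service manifest.

```python
def service_dns_name(service, namespace, cluster_domain="cluster.local"):
    """Build the in-cluster DNS name SERVICENAME.NAMESPACE.svc.<cluster domain>."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def scrape_target(service, namespace, port):
    """Return a host:port string usable as a static_configs target."""
    return f"{service_dns_name(service, namespace)}:{port}"


# Service name, namespace, and Service port taken from the question's manifest.
print(scrape_target("my_app", "default", 80))
# prints: my_app.default.svc.cluster.local:80
```

The port here is the Service's `port`, not the container's, because the request goes through the ClusterIP Service.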

That is the "manual" way. A better approach is to use Prometheus's kubernetes_sd_config, which automatically discovers your services and tries to scrape them.

Reference: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config

There is no need to expose the app outside the cluster.

Using Kubernetes service discovery, add jobs that scrape Services, Pods, or both:

- job_name: 'kubernetes-service-endpoints'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: namespace
    regex: (.+)
  - regex: __meta_kubernetes_service_label_(.+)
    action: labelmap
  - regex: 'app_kubernetes_io_(.+)'
    action: labeldrop
  - regex: 'helm_sh_(.+)'
    action: labeldrop
- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: namespace
    regex: (.+)
  - source_labels: [__meta_kubernetes_pod_node_name]
    action: replace
    target_label: host
    regex: (.+)
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: pod
    regex: (.+)
  - regex: __meta_kubernetes_pod_label_(.+)
    action: labelmap
  - regex: 'app_kubernetes_io_(.+)'
    action: labeldrop
  - regex: 'helm_sh_(.+)'
    action: labeldrop
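
The `__address__` rule in both jobs is the least obvious part, so here is a small Python sketch of what it does. It is an approximation, not Prometheus code: the function name is invented for this example and the addresses are made-up sample values. It assumes the documented relabeling behaviour that source labels are joined with `;`, the regex is anchored at both ends, and a `replace` action leaves the label untouched when the regex does not match. The replacement is taken to be `$1:$2` (discovered host plus annotated port).

```python
import re

# Same pattern as in the relabel rule; fullmatch() mimics Prometheus anchoring.
ADDRESS_RULE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def rewrite_address(address, port_annotation):
    """Approximate the __address__ rule: host from discovery, port from annotation."""
    joined = f"{address};{port_annotation}"  # source labels joined with ';'
    match = ADDRESS_RULE.fullmatch(joined)
    if match is None:
        return address  # no match: the label keeps its discovered value
    return f"{match.group(1)}:{match.group(2)}"  # equivalent of $1:$2


print(rewrite_address("10.1.2.3:8000", "80"))  # prints: 10.1.2.3:80
print(rewrite_address("10.1.2.3", "9102"))     # prints: 10.1.2.3:9102
print(rewrite_address("10.1.2.3:8000", ""))    # prints: 10.1.2.3:8000
```

In other words, the `prometheus.io/port` annotation overrides whatever port service discovery found, and a target without that annotation keeps its discovered address.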

Then annotate the Service:

metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "80"
    prometheus.io/path: "/metrics"

and the Deployment (on the pod template, so the Pods carry the annotations):

spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "80"
        prometheus.io/path: "/metrics"
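
To see why the `prometheus.io/scrape` annotation matters, the `keep` rule at the top of each job can be approximated in a few lines of Python. This is a sketch under the same assumptions about relabeling (anchored, case-sensitive regex; a missing annotation yields an empty label value), and `is_kept` is a hypothetical helper, not part of any library.

```python
import re


def is_kept(annotations):
    """Approximate `action: keep` with `regex: true` on the scrape annotation."""
    # A missing annotation shows up as an empty label value during relabeling.
    value = annotations.get("prometheus.io/scrape", "")
    return re.fullmatch(r"true", value) is not None


print(is_kept({"prometheus.io/scrape": "true"}))  # prints: True
print(is_kept({"prometheus.io/scrape": "True"}))  # prints: False
print(is_kept({}))                                # prints: False
```

Any Service or Pod without the annotation, or with a value other than the exact string `true`, is dropped from the job, which is why the value is quoted in the manifests above.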