Deployed Prometheus with Django and Kubernetes: how do I make it scrape the Django app?
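I have a Django project deployed in Kubernetes, and I am trying to deploy Prometheus as a monitoring tool. I have completed all the steps needed to include django_prometheus in the project, and locally I can go to localhost:9090 and query the metrics.
I have also deployed Prometheus to my Kubernetes cluster. After running kubectl port-forward ... on the Prometheus pod, I can see some metrics for my Kubernetes resources.
What confuses me is how to make the deployed Django app's metrics available on the Prometheus dashboard like the others.
I deployed my app in the default namespace and Prometheus in a dedicated monitoring namespace. I am wondering what I am missing here. Do I need to expose ports 8000 through 8005 on the Service and Deployment, according to the number of workers or something like that?
My Django app runs with gunicorn under supervisord, like so: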
[program:gunicorn]
command=gunicorn --reload --timeout 200000 --workers=5 --limit-request-line 0 --limit-request-fields 32768 --limit-request-field_size 0 --chdir /code/ my_app.wsgi
The my_app Service:
apiVersion: v1
kind: Service
metadata:
  name: my_app
  namespace: default
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: my-app
  sessionAffinity: None
  type: ClusterIP
A trimmed version of deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: my-app
  name: my-app-deployment
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: my-app
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - image: ...
        imagePullPolicy: IfNotPresent
        name: my-app
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: regcred
      restartPolicy: Always
      schedulerName: default-scheduler
      terminationGracePeriodSeconds: 30
The Prometheus ConfigMap:
apiVersion: v1
data:
  prometheus.rules: |-
    ... some rules
  prometheus.yml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    rule_files:
      - /etc/prometheus/prometheus.rules
    scrape_configs:
      - job_name: prometheus
        static_configs:
          - targets:
            - localhost:9090
      - job_name: my-app
        metrics_path: /metrics
        static_configs:
          - targets:
            - localhost:8000
      - job_name: 'node-exporter'
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
          - source_labels: [__meta_kubernetes_endpoints_name]
            regex: 'node-exporter'
            action: keep
kind: ConfigMap
metadata:
  labels:
    name: prometheus-config
  name: prometheus-config
  namespace: monitoring
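If Prometheus is installed on the same cluster as your app, you do not have to expose the Service externally. Workloads in different namespaces can reach each other through Kubernetes DNS resolution, using names of this form: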
SERVICENAME.NAMESPACE.svc.cluster.local
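So one way is to change your Prometheus job target to something like this (the job name and port are only examples; use the port your Service actually exposes):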
- job_name: speedtest-ookla
  metrics_path: /metrics
  static_configs:
    - targets:
      - 'my_app.default.svc.cluster.local:9000'
This is the "manual" way. A better approach is to use Prometheus's kubernetes_sd_config, which automatically discovers your services and tries to scrape them.
Reference: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config
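As a small illustration of the DNS naming rule above, here is a minimal Python sketch that builds the static target string for a Service. The helper name service_target is hypothetical, and the cluster.local suffix is assumed to be the cluster's DNS domain (it is the usual default, but clusters can be configured differently).

```python
def service_target(service: str, namespace: str, port: int,
                   cluster_domain: str = "cluster.local") -> str:
    """Build a Prometheus static target for an in-cluster Service.

    Follows the SERVICENAME.NAMESPACE.svc.<cluster domain> pattern.
    `cluster_domain` is an assumption; check your cluster's DNS settings.
    """
    return f"{service}.{namespace}.svc.{cluster_domain}:{port}"


if __name__ == "__main__":
    # Service name, namespace, and port taken from the Service manifest
    # in the question (port 80 is the Service port, not the gunicorn port).
    print(service_target("my_app", "default", 80))
```

The port passed in should be the Service port that Prometheus connects to, which may differ from the port gunicorn binds inside the container.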
There is no need to expose the app outside the cluster.
Using Kubernetes service discovery, add jobs that scrape Services, Pods, or both:
- job_name: 'kubernetes-service-endpoints'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  # Keep only endpoints whose Service is annotated prometheus.io/scrape: "true"
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  # Rewrite the scrape address to use the port from prometheus.io/port
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  # Use prometheus.io/path as the metrics path when it is set
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: namespace
    regex: (.+)
  - regex: __meta_kubernetes_service_label_(.+)
    action: labelmap
  - regex: 'app_kubernetes_io_(.+)'
    action: labeldrop
  - regex: 'helm_sh_(.+)'
    action: labeldrop
- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  # Keep only Pods annotated prometheus.io/scrape: "true"
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  # Rewrite the scrape address to use the port from prometheus.io/port
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  # Use prometheus.io/path as the metrics path when it is set
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: namespace
    regex: (.+)
  - source_labels: [__meta_kubernetes_pod_node_name]
    action: replace
    target_label: host
    regex: (.+)
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: pod
    regex: (.+)
  - regex: __meta_kubernetes_pod_label_(.+)
    action: labelmap
  - regex: 'app_kubernetes_io_(.+)'
    action: labeldrop
  - regex: 'helm_sh_(.+)'
    action: labeldrop
Then annotate the Service:
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "80"
    prometheus.io/path: "/metrics"
and the Deployment (on the Pod template, so that the annotations end up on the Pods):
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "80"
        prometheus.io/path: "/metrics"
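To see how the prometheus.io/port annotation ends up in the scrape address, the following Python sketch imitates the __address__ relabel rule from the jobs above. It assumes the rule's replacement is $1:$2, which is the usual form of this rule. The function name relabel_address is hypothetical; this only approximates Prometheus's relabeling with Python's re module and is not Prometheus code.

```python
import re

# Same pattern as the __address__ relabel rule. Prometheus anchors relabel
# regexes at both ends, which re.fullmatch mimics here.
ADDRESS_RULE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def relabel_address(address: str, port_annotation: str) -> str:
    """Return the scrape address after applying the port annotation.

    Prometheus joins the source label values with ';' before matching.
    When the regex does not match (for example, the annotation is missing),
    the original address is left unchanged.
    """
    joined = f"{address};{port_annotation}"
    match = ADDRESS_RULE.fullmatch(joined)
    if match is None:
        return address
    # Equivalent of `replacement: $1:$2` (host from group 1, port from group 2)
    return f"{match.group(1)}:{match.group(2)}"


if __name__ == "__main__":
    # A discovered address with some other port, and the annotation value "80"
    print(relabel_address("10.0.0.5:8080", "80"))
    # No annotation: the discovered address is kept as-is
    print(relabel_address("10.0.0.5:8080", ""))
```

The IP addresses here are made up for illustration. The point is that the annotated port replaces whatever port service discovery reported, so the annotation must name the port on which /metrics is actually served.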