Monitoring Kubernetes with Prometheus Operator

Original article: http://www.unmin.club/2020/07/prometheus-operator/

1. Introduction to Prometheus Operator

Prometheus Operator is designed to simplify deploying, managing, and running Prometheus and Alertmanager clusters on Kubernetes. It provides simple definitions for monitoring Kubernetes resources and for managing Prometheus instances, and it ships with a set of built-in alerting rules and dashboards.

Prometheus Operator architecture diagram


Prometheus Operator components:

  • Operator: the controller; it deploys and manages Prometheus Server according to custom resources;
  • Prometheus Server: the Prometheus server cluster deployed according to the Prometheus custom resource; these custom resources can be seen as the StatefulSets used to manage the Prometheus Server cluster;
  • Prometheus: declares the desired state of a Prometheus resource object; the Operator ensures the running object always stays consistent with the definition;
  • ServiceMonitor: declares the services to monitor; it is an abstraction of exporters that selects Service Endpoints by labels, letting Prometheus Server fetch metrics from the selected Services.
  • Service: the service to be monitored; put simply, the object Prometheus monitors.

2. Installing Prometheus Operator

It can also be installed with Helm; here we install it manually.

Download the kube-prometheus project to the local server:

$ git clone https://github.com/coreos/kube-prometheus.git
$ cd kube-prometheus/manifests

Install the CRDs and Operator objects in the setup directory:

$ kubectl apply -f setup/
namespace/monitoring created
customresourcedefinition.apiextensions.k8s.io/alertmanagers.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/podmonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/probes.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheuses.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/prometheusrules.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/servicemonitors.monitoring.coreos.com created
customresourcedefinition.apiextensions.k8s.io/thanosrulers.monitoring.coreos.com created
clusterrole.rbac.authorization.k8s.io/prometheus-operator created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-operator created
deployment.apps/prometheus-operator created
service/prometheus-operator created
serviceaccount/prometheus-operator created
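
Before applying the remaining manifests, it can help to wait until the new CRDs are registered; kubectl wait works against any of the CRDs created above:

$ kubectl wait --for=condition=Established crd/servicemonitors.monitoring.coreos.com --timeout=60s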

Create the resources in the manifests directory:

$ kubectl apply -f .
alertmanager.monitoring.coreos.com/main created
secret/alertmanager-main created
service/alertmanager-main created
serviceaccount/alertmanager-main created
.............
$ kubectl get pods -n monitoring
NAME                                   READY   STATUS    RESTARTS   AGE
alertmanager-main-0                    1/2     Running   0          2d17h
alertmanager-main-1                    1/2     Running   0          2d17h
alertmanager-main-2                    1/2     Running   0          2d17h
grafana-86445dccbb-m7kzg               1/1     Running   0          2d17h
kube-state-metrics-5b67d79459-zf27k    3/3     Running   0          2d17h
node-exporter-blx8m                    2/2     Running   0          2d17h
node-exporter-zpns2                    2/2     Running   0          2d17h
node-exporter-zrd6g                    2/2     Running   0          2d17h
prometheus-adapter-66b855f564-mf9mc    1/1     Running   0          2d17h
prometheus-k8s-0                       3/3     Running   1          2d17h
prometheus-k8s-1                       3/3     Running   1          2d17h
prometheus-operator-78fcb48ccf-sgklz   2/2     Running   0          2d17h

3. Accessing the Components via Ingress

Since the default Services for these components are of type ClusterIP, they cannot be reached from outside the cluster. One option is to change the Service type to NodePort with kubectl edit xxx to expose them externally.
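
For example, a minimal sketch that switches the Grafana Service to NodePort without opening an editor:

$ kubectl -n monitoring patch svc grafana -p '{"spec": {"type": "NodePort"}}'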

Here we instead use an Ingress to create domain names for Prometheus, Alertmanager, and Grafana.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  namespace: monitoring
  name: prometheus-ingress
spec:
  rules:
  - host: k8s.grafana.com
    http:
      paths:
      - backend:
          serviceName: grafana
          servicePort: 3000
  - host: k8s.prometheus.com
    http:
      paths:
      - backend:
          serviceName: prometheus-k8s
          servicePort: 9090
  - host: k8s.alertmanager.com
    http:
      paths:
      - backend:
          serviceName: alertmanager-main
          servicePort: 9093

Create the Ingress object:

$ kubectl create -f ingress.yaml
ingress.extensions/prometheus-ingress created
$ kubectl get ingress -A
NAMESPACE    NAME                 CLASS    HOSTS                                                     ADDRESS   PORTS     AGE
default      my-nginx             <none>   nginx.ingress.com                                                   80, 443   2d21h                        
monitoring   prometheus-ingress   <none>   k8s.grafana.com,k8s.prometheus.com,k8s.alertmanager.com             80        4s

As you can see, the domain names have been created. After adding entries resolving them to the Ingress host's IP in the local hosts file, the components can be accessed by domain name.
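
For example, the hosts entry might look like this (the IP here is an example; use your own Ingress node's address):

192.168.16.173  k8s.grafana.com k8s.prometheus.com k8s.alertmanager.com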


4. Adding Monitoring Targets

Above, kube-prometheus already monitors a number of system components by default. We also need to add monitoring for our own components based on actual business needs. The steps for adding a custom monitoring target are:

  1. Create a ServiceMonitor object, which tells Prometheus to add a scrape target
  2. Associate the ServiceMonitor object with a Service object that exposes the metrics endpoint
  3. Make sure the Service object can correctly serve the metrics data

For example, to monitor the etcd service, first create a ServiceMonitor object:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: etcd-k8s
  namespace: monitoring
  labels:
    k8s-app: etcd-k8s
spec:
  jobLabel: k8s-app
  endpoints:
  - port: port
    interval: 15s
  selector:
    matchLabels:
      k8s-app: etcd
  namespaceSelector:
    matchNames:
    - kube-system

This definition matches Services in the kube-system namespace that carry the k8s-app=etcd label; jobLabel specifies the label whose value is used as the job name.

Create this ServiceMonitor object:

$ kubectl apply -f prometheus-serviceMonitorEtcd.yaml
servicemonitor.monitoring.coreos.com "etcd-k8s" created

Then create a Service object for etcd:

apiVersion: v1
kind: Service
metadata:
  name: etcd-k8s
  namespace: kube-system
  labels:
    k8s-app: etcd
spec:
  type: ClusterIP
  clusterIP: None  # must be set to None (headless Service)
  ports:
  - name: port
    port: 2381
---
apiVersion: v1
kind: Endpoints
metadata:
  name: etcd-k8s
  namespace: kube-system
  labels:
    k8s-app: etcd
subsets:
- addresses:
  - ip: 192.168.16.173  # the etcd node address; for a cluster, add more entries below
    nodeName: etc-master
  ports:
  - name: port
    port: 2381

The file above registers the external etcd service with the cluster through an Endpoints object, then creates a headless Service for it.

Create this Service object:

$ kubectl apply -f etcd-service.yaml
service/etcd-k8s configured
endpoints/etcd-k8s configured
$ kubectl get svc -n kube-system -l k8s-app=etcd
NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
etcd-k8s   ClusterIP   None         <none>        2381/TCP   1d
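
You can also confirm that the manually created Endpoints object points at the etcd node:

$ kubectl get endpoints etcd-k8s -n kube-system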

After creation, check the targets page in Prometheus. You will see that connections on port 2381 are refused. This is because the etcd metrics endpoint was set to --listen-metrics-urls=http://127.0.0.1:2381 when the etcd service was created. We just need to edit the etcd.yaml file under /etc/kubernetes/manifests/ and change listen-metrics-urls to the node IP:

- --listen-metrics-urls=http://192.168.16.173:2381

After the change, etcd restarts automatically. Then check Prometheus again to confirm the target is healthy.
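
A quick way to verify that the metrics endpoint is now reachable (the IP is the etcd node address used above):

$ curl -s http://192.168.16.173:2381/metrics | head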

Then import dashboard 3070 in Grafana to get the etcd monitoring graphs (the default Grafana credentials are admin/admin).


5. Custom Alerting Rules

1. Specifying the Alertmanager address

When we deployed Prometheus by hand before, we only had to edit the prometheus.yaml configuration file to point Prometheus at Alertmanager. How is this done for a Prometheus deployed through the Operator? We can first look at the Configuration page in the Prometheus web UI.

You can see that the alertmanagers configuration is obtained through Kubernetes service discovery with role: endpoints, matching the Service named alertmanager-main on the port named web. This is how Prometheus is pointed at Alertmanager.
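
This discovery configuration is generated from the alerting section of the Prometheus custom resource (the same block appears in prometheus-prometheus.yaml later in this article):

alerting:
  alertmanagers:
  - name: alertmanager-main
    namespace: monitoring
    port: web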

2. Adding alerting rules

In the configuration above you can see the rule file path: /etc/prometheus/rules/prometheus-k8s-rulefiles-0/*.yaml. We can go into the Prometheus container to take a look:

$ kubectl exec -it prometheus-k8s-0 /bin/sh -n monitoring
Defaulting container name to prometheus.
Use 'kubectl describe pod/prometheus-k8s-0 -n monitoring' to see all of the containers in this pod.
/prometheus $ ls /etc/prometheus/rules/prometheus-k8s-rulefiles-0/
monitoring-prometheus-k8s-rules.yaml
/prometheus $ cat /etc/prometheus/rules/prometheus-k8s-rulefiles-0/monitoring-prometheus-k8s-rules.yaml
groups:
- name: k8s.rules
  rules:
  - expr: |
      sum(rate(container_cpu_usage_seconds_total{job="kubelet", image!="", container_name!=""}[5m])) by (namespace)
    record: namespace:container_cpu_usage_seconds_total:sum_rate
......

This file is in fact the content of a PrometheusRule object we created earlier:

$ cat manifests/prometheus-rules.yaml | head -10
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: prometheus-k8s-rules
  namespace: monitoring
spec:
  groups:

The Prometheus custom resource has a very important property, ruleSelector, a filter used to match rule objects: it selects PrometheusRule resources that carry the prometheus=k8s and role=alert-rules labels.

ruleSelector:
  matchLabels:
    prometheus: k8s
    role: alert-rules

So to add a rule, we only need to create a PrometheusRule object carrying the prometheus=k8s and role=alert-rules labels. For example, let's write an availability alert rule for the etcd service we just added.

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    prometheus: k8s
    role: alert-rules
  name: etcd-rules
  namespace: monitoring
spec:
  groups:
  - name: etcd
    rules:
    - alert: EtcdClusterUnavailable
      annotations:
        summary: etcd cluster small
        description: If one more etcd peer goes down the cluster will be unavailable
      expr: |
        count(up{job="etcd"} == 0) > (count(up{job="etcd"}) / 2 - 1)
      for: 3m
      labels:
        severity: critical

Create this PrometheusRule object, then check whether the rule file shows up inside the Prometheus container:

$ kubectl  create -f etcd-rules.yaml 
prometheusrule.monitoring.coreos.com/etcd-rules created
$ kubectl exec -it prometheus-k8s-0 /bin/sh -n monitoring
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl kubectl exec [POD] -- [COMMAND] instead.
Defaulting container name to prometheus.
Use 'kubectl describe pod/prometheus-k8s-0 -n monitoring' to see all of the containers in this pod.
/prometheus $ ls  /etc/prometheus/rules/prometheus-k8s-rulefiles-0/
monitoring-etcd-rules.yaml            monitoring-prometheus-k8s-rules.yaml

Then check again in the Prometheus web UI.
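
The new rule can also be confirmed through the Prometheus HTTP API (a sketch, assuming the Ingress host from earlier resolves):

$ curl -s http://k8s.prometheus.com/api/v1/rules | grep EtcdClusterUnavailable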

6. Configuring WeChat Alerts

Now we have alerting rules, but no alert channel is configured yet, so next we modify Alertmanager's configuration. First, view Alertmanager's configuration information under the status path of its web page.


This configuration also comes from the alertmanager-secret file we created earlier.

$ cat manifests/alertmanager-secret.yaml
apiVersion: v1
data: {}
kind: Secret
metadata:
  name: alertmanager-main
  namespace: monitoring
stringData:
  alertmanager.yaml: |-
    "global":
      "resolve_timeout": "5m"
    "inhibit_rules":
    - "equal":
      - "namespace"
      - "alertname"
      "source_match":
        "severity": "critical"
      "target_match_re":
        "severity": "warning|info"
    - "equal":
      - "namespace"
      - "alertname"
      "source_match":
        "severity": "warning"
      "target_match_re":
        "severity": "info"
    "receivers":
    - "name": "Default"
    - "name": "Watchdog"
    - "name": "Critical"
    "route":
      "group_by":
      - "namespace"
      "group_interval": "5m"
      "group_wait": "30s"
      "receiver": "Default"
      "repeat_interval": "12h"
      "routes":
      - "match":
          "alertname": "Watchdog"
        "receiver": "Watchdog"
      - "match":
          "severity": "critical"
        "receiver": "Critical"
type: Opaque

We can then edit this YAML file to specify our alert channel; for example, here we send critical-level alerts to WeChat.

apiVersion: v1
data: {}
kind: Secret
metadata:
  name: alertmanager-main
  namespace: monitoring
stringData:
  alertmanager.yaml: |-
    "global":
      "resolve_timeout": "5m"
    "inhibit_rules":
    - "equal":
      - "namespace"
      - "alertname"
      "source_match":
        "severity": "critical"
      "target_match_re":
        "severity": "warning|info"
    - "equal":
      - "namespace"
      - "alertname"
      "source_match":
        "severity": "warning"
      "target_match_re":
        "severity": "info"
    "receivers":
    - "name": "Default"
    - "name": "Watchdog"
    - "name": "Critical"
      "wechat_configs":  #添加微信的認(rèn)證
       - "corp_id": 'ww314010b4720f24'
         "to_party": '1'
         "agent_id": '1000002'
         "api_secret": '9nmYzEg8X860ZBIoOkToCbh_oNc'
         "send_resolved": true
    "route":
      "group_by":
      - "namespace"
      "group_interval": "5m"
      "group_wait": "30s"
      "receiver": "Default"
      "repeat_interval": "12h"
      "routes":
      - "match":
          "alertname": "Watchdog"
        "receiver": "Watchdog"
      - "match":
          "severity": "critical"
        "receiver": "Critical"
type: Opaque

Then force an update of the alertmanager-main Secret:

$ kubectl delete -f alertmanager-secret.yaml
secret "alertmanager-main" deleted
$ kubectl apply -f alertmanager-secret.yaml
secret/alertmanager-main created
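
Alternatively (a sketch, assuming the Alertmanager config has been extracted to a local alertmanager.yaml), the Secret can be replaced in place without deleting it first; --dry-run=client requires kubectl 1.18+:

$ kubectl -n monitoring create secret generic alertmanager-main \
    --from-file=alertmanager.yaml --dry-run=client -o yaml | kubectl apply -f -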

Then check on the Alertmanager web page whether the new configuration has been loaded.


If a critical-level alert fires, the alert message will be delivered to WeChat.


7. Auto-Discovery Configuration

As the number of Services and Pods in the cluster grows, manually creating a ServiceMonitor for every single service becomes cumbersome. To solve this, Prometheus Operator supports additional scrape configurations: we can add an extra configuration that performs service discovery and monitors services automatically.

Create a new prometheus-additional.yaml:

- job_name: 'kubernetes-endpoints'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
    action: replace
    target_label: __scheme__
    regex: (https?)
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    action: replace
    target_label: __address__
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
  - action: labelmap
    regex: __meta_kubernetes_service_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_service_name]
    action: replace
    target_label: kubernetes_name
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name

Create a corresponding Secret object from this file:

$ kubectl create secret generic additional-configs --from-file=prometheus-additional.yaml -n monitoring
secret "additional-configs" created

Then we need to reference this extra configuration in the Prometheus resource definition via the additionalScrapeConfigs property:

$ cat prometheus-prometheus.yaml
.................
  serviceAccountName: prometheus-k8s
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  version: v2.20.0
  additionalScrapeConfigs:
    name: additional-configs
    key: prometheus-additional.yaml

Once added, update the Prometheus custom resource:

$ kubectl apply -f prometheus-prometheus.yaml
prometheus.monitoring.coreos.com/k8s configured

Then we can check in the Prometheus web UI whether this configuration has been loaded.

However, there is no corresponding scrape job on the targets page. Check the Prometheus Pod's logs:

$ kubectl logs -f prometheus-k8s-0 prometheus -n monitoring
.............
level=error ts=2020-10-27T06:01:30.129Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:361: Failed to list *v1.Endpoints: endpoints is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"endpoints\" in API group \"\" at the cluster scope"
level=error ts=2020-10-27T06:01:39.194Z caller=klog.go:94 component=k8s_client_runtime func=ErrorDepth msg="/app/discovery/kubernetes/kubernetes.go:362: Failed to list *v1.Service: services is forbidden: User \"system:serviceaccount:monitoring:prometheus-k8s\" cannot list resource \"services\" in API group \"\" at the cluster scope"

This error occurs because Prometheus is bound to a ServiceAccount named prometheus-k8s, and that ServiceAccount is bound to a ClusterRole, also named prometheus-k8s. Check that ClusterRole's permissions:

$ cat prometheus-clusterRole.yaml 
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-k8s
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get

As you can see, this ClusterRole has no list permission on Services or Pods; adding the corresponding permissions should fix it.

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-k8s
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - services
  - endpoints
  - pods
  - nodes/proxy
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  - nodes/metrics
  verbs:
  - get
- nonResourceURLs:
  - /metrics
  verbs:
  - get

Rebuild all of the Prometheus objects:

$ cd manifests
$ mkdir prometheus
$ mv prometheus-* prometheus
$ cd prometheus
$ kubectl delete -f .
$ kubectl apply  -f .

After the rebuild, the kubernetes-endpoints scrape job appears on the targets page.

You can see that the kube-dns Service has been scraped. This is because the Service carries the prometheus.io/scrape=true annotation. Take a look at the kube-dns Service details:

$ kubectl describe svc kube-dns -n kube-system
Name:              kube-dns
Namespace:         kube-system
Labels:            k8s-app=kube-dns
                   kubernetes.io/cluster-service=true
                   kubernetes.io/name=KubeDNS
Annotations:       prometheus.io/port: 9153
                   prometheus.io/scrape: true 
Selector:          k8s-app=kube-dns
Type:              ClusterIP
IP:                10.96.0.10
Port:              dns  53/UDP
TargetPort:        53/UDP
Endpoints:         10.244.0.2:53,10.244.0.3:53
Port:              dns-tcp  53/TCP
TargetPort:        53/TCP
Endpoints:         10.244.0.2:53,10.244.0.3:53
Port:              metrics  9153/TCP
TargetPort:        9153/TCP
Endpoints:         10.244.0.2:9153,10.244.0.3:9153
Session Affinity:  None
Events:            <none>

So when creating a Service, we need to add the prometheus.io/scrape=true annotation for it to be picked up by Prometheus's service discovery. It is added like this:

apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  creationTimestamp: "2020-10-26T08:39:47Z"
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
............

Note: even if an application carries this annotation and Prometheus can discover the target, the application must actually expose a metrics endpoint. The prometheus.io/port: "9153" annotation specifies the metrics port; if Prometheus cannot scrape any metrics, it will consider the service DOWN.
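
A quick in-cluster check that a service actually serves metrics (a sketch; the busybox image is an example, and 10.96.0.10:9153 is the kube-dns metrics address shown above):

$ kubectl run metrics-test -it --rm --restart=Never --image=busybox -- \
    wget -qO- http://10.96.0.10:9153/metrics | head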

8. Persisting Data with NFS

When we rebuilt the Prometheus Pods above, we found that all of the previous data was lost. This is because the Prometheus created through the prometheus CR does not persist its data:

$ kubectl get pod prometheus-k8s-0 -n monitoring -o yaml
......
    volumeMounts:
    - mountPath: /etc/prometheus/config_out
      name: config-out
      readOnly: true
    - mountPath: /prometheus
      name: prometheus-k8s-db
......
  volumes:
......
  - emptyDir: {}
    name: prometheus-k8s-db
......

As you can see, the Prometheus data directory /prometheus is actually mounted via emptyDir, and emptyDir is designed so that the data is deleted along with the Pod. So we need to persist Prometheus's data.

Prometheus is deployed through a StatefulSet controller, so we can persist data through a StorageClass; here we use the NFS StorageClass set up earlier.
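
First confirm that the StorageClass exists (nfs-data-db is the name referenced below):

$ kubectl get storageclass nfs-data-db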

Add the storage property to the Prometheus CR:

$ cat prometheus-prometheus.yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  labels:
    prometheus: k8s
  name: k8s
  namespace: monitoring
spec:
  alerting:
    alertmanagers:
    - name: alertmanager-main
      namespace: monitoring
      port: web
  storage:         # persistent storage
    volumeClaimTemplate:
      spec:
        storageClassName: nfs-data-db
        resources:
          requests:
            storage: 10Gi
  image: quay.io/prometheus/prometheus:v2.20.0
  nodeSelector:
    kubernetes.io/os: linux
  podMonitorNamespaceSelector: {}
  podMonitorSelector: {}
  probeNamespaceSelector: {}
  probeSelector: {}
  replicas: 2
  resources:
    requests:
      memory: 400Mi
  ruleSelector:
    matchLabels:
      prometheus: k8s
      role: alert-rules
  securityContext:
    fsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-k8s
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  version: v2.20.0
  additionalScrapeConfigs:
    name: additional-configs
    key: prometheus-additional.yaml

Update the Prometheus custom resource:

$ kubectl apply -f prometheus-prometheus.yaml
prometheus.monitoring.coreos.com/k8s configured

Check the PVC status:

$ kubectl get pvc -n monitoring
NAME                                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
prometheus-k8s-db-prometheus-k8s-0   Bound    pvc-8e73e8b2-1e98-452c-8aaa-a9ba694fe234   10Gi       RWO            nfs-data-db    2m
prometheus-k8s-db-prometheus-k8s-1   Bound    pvc-fe817cdd-812f-489e-b82c-d5de7f0dbf93   10Gi       RWO            nfs-data-db    2m
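
Now a Pod rebuild no longer loses data: the StatefulSet reattaches the same PVC to the replacement Pod. A quick check:

$ kubectl delete pod prometheus-k8s-0 -n monitoring
$ kubectl get pvc -n monitoring   # the PVCs stay Bound and are reused by the new Pod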