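Introduction to Prometheus

Prometheus is an open-source system monitoring and alerting toolkit. It has joined the CNCF, becoming the second project hosted there after Kubernetes (k8s). Kubernetes clusters are commonly paired with Prometheus for monitoring. Prometheus collects data through a wide range of exporters and also accepts reported data through the Pushgateway. Its performance is sufficient for clusters on the order of tens of thousands of machines.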

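How Prometheus works

Prometheus is written in Go, obtains monitoring data by pulling it, and provides a multi-dimensional data model with a flexible query interface. Monitoring targets can be configured in static files or discovered automatically through Kubernetes, Consul, DNS, and other service-discovery mechanisms. On the collection side, Go's concurrency model lets a single Prometheus instance scrape monitoring data from hundreds of nodes. On the storage side, continued optimization of the local time-series database is said to let a single instance ingest on the order of ten million samples per second. When large volumes of historical data must be kept, remote storage is also supported.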

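Features of Prometheus

Multi-dimensional data model:
Every time series is uniquely identified by its metric name and a set of key-value pairs called labels. The metric name specifies the general feature of the system being measured (for example, http_requests_total: the total number of HTTP requests received). Labels enable Prometheus's dimensional data model: for the same metric name, each combination of label values identifies a particular dimensional instance of that metric (for example, all HTTP requests that used the method POST to the /api/tracks handler). The query language filters and aggregates based on these metric names and labels. Changing any label value, including adding or removing a label, creates a new time series
Flexible query language (PromQL): collected metrics can be added, multiplied, joined, and otherwise combined
Can be deployed directly on a single local node, with no dependency on distributed storage
Collects time-series data over HTTP using a pull model
Time-series data can be pushed to an intermediate gateway (the Pushgateway), from which it reaches the Prometheus server
Target services (targets) are found through service discovery or static configuration
Works with several visualization front ends, such as Grafana
Efficient storage: each sample takes roughly 3.5 bytes. A commonly quoted estimate is about 200 GB of disk for 3 million time series at a 30 s interval with 60 days of retention, although real usage varies with compression and series churn
High-availability options: off-site backup of data, federation, running multiple Prometheus instances, and reporting data through the Pushgateway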
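
As a rough check on the storage figure in the list above, the capacity rule of thumb from the Prometheus documentation (retention time in seconds × ingested samples per second × bytes per sample) can be evaluated directly. The sketch below is a back-of-the-envelope calculation only; the function name is illustrative, and the inputs are simply the numbers quoted above.

```python
def estimate_disk_bytes(series: int, scrape_interval_s: float,
                        retention_days: float, bytes_per_sample: float) -> float:
    """Rough sizing: retention_seconds * samples_per_second * bytes_per_sample."""
    samples_per_second = series / scrape_interval_s
    retention_seconds = retention_days * 24 * 60 * 60
    return retention_seconds * samples_per_second * bytes_per_sample


if __name__ == "__main__":
    size = estimate_disk_bytes(series=3_000_000, scrape_interval_s=30,
                               retention_days=60, bytes_per_sample=3.5)
    print(f"{size / 1e12:.2f} TB")  # prints "1.81 TB"
```

With these inputs the formula gives about 1.8 TB rather than 200 GB, so the quoted disk figure presumably assumes fewer bytes per sample or fewer active series. Either way, it is worth redoing this arithmetic with measured values before sizing a disk.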

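What is a sample

Sample: each point in a time series is called a sample. A sample consists of three parts:

Metric: the metric name plus the label set (labelset) that describes the sample's characteristics
Timestamp: a timestamp with millisecond precision
Value: a float64 value representing the sample's current value

Notation: a time series with a given metric name and label set is written as: <metric name>{<label name>=<label value>, ...}

For example, the time series with metric name api_http_requests_total and labels method="POST" and handler="/messages" is written as: api_http_requests_total{method="POST", handler="/messages"}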
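
To make the three parts of a sample concrete, here is a minimal sketch of a sample as a small Python data class. The class and field names are assumptions chosen for illustration and do not correspond to any Prometheus API; the rendering method simply follows the notation described above.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """One point in a time series: metric identity, timestamp, and value."""
    name: str                                   # metric name
    labels: dict = field(default_factory=dict)  # label name -> label value
    timestamp_ms: int = 0                       # millisecond-precision timestamp
    value: float = 0.0                          # Python floats are 64-bit

    def series_id(self) -> str:
        """Render the series as <metric name>{<label name>="<label value>", ...}."""
        if not self.labels:
            return self.name
        pairs = ", ".join(f'{k}="{v}"' for k, v in self.labels.items())
        return f"{self.name}{{{pairs}}}"


if __name__ == "__main__":
    s = Sample("api_http_requests_total",
               {"method": "POST", "handler": "/messages"},
               timestamp_ms=1637843000000, value=1027.0)
    print(s.series_id())
```

Two samples whose series_id() strings are equal belong to the same time series; changing any label value produces a different identifier and therefore a different series.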

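Advantages of Prometheus:

Provides a multi-dimensional data model and flexible querying. Attaching multiple labels (tags) to a metric lets monitoring data be combined along any dimension. Queries are written in PromQL, and an HTTP query API is also available, which makes it easy to display data in GUI components such as Grafana
Supports local storage on the server node without depending on external storage. The built-in time-series database is described as handling writes on the order of ten million samples per second. For retaining large amounts of historical data, Prometheus can integrate with third-party time-series databases such as OpenTSDB
Defines an open metrics exposition format and collects time-series data over HTTP using a pull model. Any endpoint that exposes data in this format can be scraped and aggregated by Prometheus. Pushing time series to an intermediate gateway is also supported, which makes it easier to cover different monitoring scenarios
Finds monitoring targets through static file configuration or dynamic discovery and scrapes them automatically. Prometheus currently supports Kubernetes, Consul, DNS, and other service-discovery mechanisms
Easy to maintain: it can be started directly from a binary, and container images are provided for containerized deployment
Supports sharded collection and federated deployment, which enables monitoring of large clusters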

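Prometheus architecture

The basic principle of Prometheus is to periodically scrape the state of monitored components over HTTP. Any component can be brought under Prometheus monitoring as long as it provides an HTTP endpoint that serves data in the format Prometheus defines.

The Prometheus server is responsible for scraping metrics from targets on a schedule, so every target must expose an HTTP endpoint for it to scrape. This approach, in which the monitoring system calls the monitored object to fetch data, is called pull. The pull model reflects Prometheus's distinctive design philosophy and sets it apart from the many monitoring systems that use push.

The advantages of pull are that upstream and horizontal monitoring can be set up automatically, less configuration is needed, and the system is easier to scale, more flexible, and easier to make highly available. Put simply, pull reduces coupling. In a push-based system, a failure to deliver data to the monitoring system can easily end up disrupting the monitored system itself. With pull, the monitored side does not need to know the monitoring system exists and stays fully independent of it, so collection is controlled entirely by the monitoring system.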
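
The claim that any component exposing an HTTP endpoint in the Prometheus format can be monitored can be illustrated with the Python standard library alone. The following is a minimal sketch, not how production exporters are usually built (they typically rely on an official client library). The metric name demo_http_requests_total, the counter dictionary, and port 8000 are assumptions made for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counter; a real application would update it as it works.
REQUESTS_TOTAL = {"GET": 0}


def render_metrics() -> str:
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP demo_http_requests_total Total HTTP requests handled.",
        "# TYPE demo_http_requests_total counter",
    ]
    for method, count in sorted(REQUESTS_TOTAL.items()):
        lines.append(f'demo_http_requests_total{{method="{method}"}} {count}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        REQUESTS_TOTAL["GET"] += 1  # count every GET, including scrapes
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def serve(port: int = 8000) -> None:
    """Block forever, serving /metrics on the given port (arbitrary default)."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling serve() and adding the host and port as a scrape target would be enough for Prometheus to start collecting the counter. The application never needs to know where the Prometheus server is, which is the decoupling described above.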

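Prometheus supports two ways of defining the targets it pulls from

Static configuration through configuration files, text files, and so on
Dynamic discovery through Zookeeper, Consul, Kubernetes, and similar systems. With Kubernetes discovery, for example, Prometheus uses the Kubernetes API to query and watch for changes in container information and updates its targets dynamically, so it notices when containers are created or deleted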
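
One low-effort way to experiment with changing targets, without running Kubernetes or Consul, is file-based discovery (file_sd_configs), in which Prometheus watches JSON or YAML files that list target groups. This mechanism is not covered in the text above, so treat the following as a hedged sketch: the function name, file path, and the env label are assumptions, while the JSON layout (a list of objects with targets and labels) follows the documented file_sd format.

```python
import json
from pathlib import Path


def write_file_sd(path: str, hosts: list, port: int, labels: dict) -> None:
    """Write one target group in the JSON layout read by file_sd_configs."""
    groups = [{
        "targets": [f"{host}:{port}" for host in hosts],
        "labels": labels,
    }]
    tmp = Path(path).with_suffix(".tmp")
    tmp.write_text(json.dumps(groups, indent=2))
    # Rename into place so a reader never sees a half-written file.
    tmp.replace(path)


if __name__ == "__main__":
    write_file_sd("linux.json", ["192.168.182.131"], 9100, {"env": "demo"})
```

A script like this could be rerun whenever the host inventory changes; Prometheus picks up the new file contents without a restart.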

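Prometheus components

Prometheus Server: scrapes and stores time-series data
Client Library: instruments application code. When Prometheus scrapes the instance's HTTP endpoint, the client library returns the current state of all tracked metrics to the Prometheus server
Exporters: Prometheus supports many exporters. An exporter collects metrics and exposes them for the Prometheus server to scrape. Any program that provides monitoring data to the Prometheus server can be called an exporter
Alertmanager: after receiving alerts from the Prometheus server, it deduplicates and groups them, routes them to the appropriate receiver, and sends the notification. Common receivers include email, WeChat, DingTalk, and Slack
Grafana: monitoring dashboards for visualizing the data
Pushgateway: target hosts report data to the Pushgateway, and the Prometheus server then pulls it from the Pushgateway in one place.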
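
To show what reporting to the Pushgateway looks like on the wire, the sketch below builds (but does not send) the HTTP request a batch job might issue. It assumes the Pushgateway's documented URL scheme of /metrics/job/<job> with optional label pairs, and that job and instance names contain no slashes. The gateway address, job name, and metric are hypothetical.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(gateway: str, job: str, metrics_text: str,
                       instance: str = "") -> urllib.request.Request:
    """Build a request that pushes text-format metrics to a Pushgateway."""
    url = f"{gateway.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    if instance:
        url += f"/instance/{quote(instance, safe='')}"
    return urllib.request.Request(
        url,
        data=metrics_text.encode("utf-8"),
        method="PUT",  # PUT replaces all metrics previously pushed for this group
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


if __name__ == "__main__":
    req = build_push_request("http://localhost:9091", "backup",
                             "backup_last_success_timestamp_seconds 1637843000\n",
                             instance="db1")
    print(req.get_method(), req.full_url)
    # urllib.request.urlopen(req) would actually send it.
```

The Prometheus server never talks to the batch job itself; it scrapes the Pushgateway like any other target.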

As the figure above shows, the Prometheus ecosystem consists mainly of the Prometheus server, exporters, the Pushgateway, Alertmanager, Grafana, and the web UI. The Prometheus server itself has three parts: Retrieval, Storage, and PromQL.

Retrieval scrapes metrics from active targets
Storage writes the collected data to disk
PromQL is the query language module that Prometheus provides

Prometheus workflow

The Prometheus server periodically pulls metrics from active (up) targets. Targets are made known to the server through statically configured jobs or service discovery, and pull is the default collection method. Data can also be reported through the Pushgateway, and exporters that ship with a component can collect that component's data
The Prometheus server saves the collected metrics to local disk or to a database
The collected metrics are stored as time series. Alerting rules are configured on the server, and alerts that fire are sent to Alertmanager
Alertmanager is configured with receivers and sends the alerts to email, WeChat, DingTalk, and so on
Prometheus's built-in web UI supports PromQL queries against the monitoring data
Grafana can use Prometheus as a data source and display the monitoring data graphically
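
The same PromQL queries that the web UI and Grafana run are available through the HTTP API at /api/v1/query. The sketch below builds an instant-query URL and parses an instant-vector response. The function names and the server address are assumptions, and the JSON in the usage example is a hand-written stand-in for a real response.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for an instant query against the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_vector(response_body: str) -> list:
    """Return (labels, value) pairs from an instant-vector response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each result carries its labels and a [timestamp, "value-as-string"] pair.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


if __name__ == "__main__":
    print(instant_query_url("http://localhost:9090", "up"))
    sample_body = json.dumps({
        "status": "success",
        "data": {"resultType": "vector", "result": [
            {"metric": {"__name__": "up", "job": "linux"},
             "value": [1637843000.0, "1"]},
        ]},
    })
    print(parse_vector(sample_body))
```

Values arrive as strings in the API response, which is why the parser converts them to floats.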

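Prometheus deployment modes

Basic high availability

The basic HA mode only ensures the availability of the Prometheus service. It does not solve data consistency between Prometheus servers or persistence (data cannot be recovered once lost), and it cannot scale dynamically. This deployment therefore suits small monitoring setups where the Prometheus server is rarely migrated and only short-term monitoring data needs to be kept.

Basic high availability + remote storage

On top of service availability, this mode also makes the data persistent, so recovery is quick when a Prometheus server crashes or loses data, and the server can be migrated easily. It suits cases where the monitoring scale is small but the data must be persisted and the Prometheus server must remain easy to migrate.

Basic HA + remote storage + federation

The performance bottleneck of Prometheus lies mainly in running a large number of scrape jobs. Federation lets users split different kinds of scrape jobs across separate child Prometheus servers, which gives functional partitioning. For example, one Prometheus server scrapes infrastructure metrics and another scrapes application metrics, and an upper-level Prometheus server then aggregates the data.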
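
In a federated setup, the upper-level server pulls selected series from each child through the child's /federate endpoint, passing one or more match[] selectors. The sketch below only builds such a URL so the mechanics are visible. The child hostname and the job names in the selectors are hypothetical.

```python
from urllib.parse import urlencode


def federate_url(child_base_url: str, matchers: list) -> str:
    """Build a /federate URL that selects series by one or more match[] selectors."""
    query = urlencode([("match[]", m) for m in matchers])
    return f"{child_base_url.rstrip('/')}/federate?{query}"


if __name__ == "__main__":
    print(federate_url("http://prometheus-infra:9090",
                       ['{job="node"}', '{job="kubelet"}']))
```

In practice the upper-level server is usually configured with a scrape job whose metrics_path is /federate and whose params carry the match[] selectors, rather than with hand-built URLs.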

Environment:

OS	Prometheus version
Red Hat 8.2	2.31.1

Deploying Prometheus

Prometheus official site

First, download the appropriate tarball from the Prometheus site

[root@localhost opt]# wget https://github.com/prometheus/prometheus/releases/download/v2.31.1/prometheus-2.31.1.linux-amd64.tar.gz

Extract the tarball

[root@localhost opt]# tar xf prometheus-2.31.1.linux-amd64.tar.gz

Create a symlink to simplify later steps

[root@localhost opt]# ln -s /opt/prometheus-2.31.1.linux-amd64 /opt/prometheus

Change into the extracted directory

[root@localhost opt]# cd /opt/prometheus
[root@localhost prometheus]# vim prometheus.yml

Edit the prometheus.yml configuration file to define what to monitor

# Here we monitor the state of another Linux host
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "linux"  # name of the service or host to monitor
    static_configs:
      - targets: ["192.168.182.131:9100"]  # IP address of the Linux host

Here we manage Prometheus with systemctl.
Create the service file

[root@localhost prometheus]# cat /usr/lib/systemd/system/prometheus.service 
[Unit]
Description=prometheus server daemon
After=network.target 

[Service]
Type=simple
ExecStart=/opt/prometheus/prometheus \
  --config.file=/opt/prometheus/prometheus.yml \
  --web.read-timeout=5m \
  --web.max-connections=512 \
  --storage.tsdb.retention.time=15d \
  --query.timeout=2m

ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure

[Install]
WantedBy=multi-user.target

Start Prometheus

// Reload systemd so the new unit file takes effect
[root@localhost prometheus]# systemctl daemon-reload
// Start the Prometheus service
[root@localhost prometheus]# systemctl start prometheus.service
// Enable it at boot
[root@localhost prometheus]# systemctl enable --now prometheus.service
Created symlink /etc/systemd/system/multi-user.target.wants/prometheus.service → /usr/lib/systemd/system/prometheus.service.
[root@localhost prometheus]# systemctl status prometheus.service 
● prometheus.service - prometheus server daemon
   Loaded: loaded (/usr/lib/systemd/system/prometheus.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2021-11-25 20:40:18 CST; 4min 23s ago
 Main PID: 1739 (prometheus)
    Tasks: 7 (limit: 23790)
   Memory: 107.3M

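Here we access the Prometheus web UI by IP address and port number

On the 192.168.182.142 host (the host being monitored; its address must match the target configured in prometheus.yml), download the exporter

[root@localhost opt]# wget https://github.com/prometheus/node_exporter/releases/download/v1.3.0/node_exporter-1.3.0.linux-amd64.tar.gz

Extract the exporter package

[root@localhost opt]# tar xf node_exporter-1.3.0.linux-amd64.tar.gz

Create a symlink

[root@localhost opt]# ln -s /opt/node_exporter-1.3.0.linux-amd64 /opt/exporter

Start the exporter

[root@localhost opt]# nohup /opt/exporter/node_exporter &

In the figure below, the State column shows UP for both targets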
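
The same UP/DOWN state shown on the Targets page can be read programmatically from the /api/v1/targets endpoint of the HTTP API. The sketch below parses such a response. The function names are assumptions, and the JSON in the usage example is a trimmed, hand-written stand-in for a real response rather than captured output.

```python
import json


def target_health(response_body: str) -> dict:
    """Map each active target's scrape URL to its health ("up", "down", or "unknown")."""
    payload = json.loads(response_body)
    return {t["scrapeUrl"]: t["health"] for t in payload["data"]["activeTargets"]}


def all_up(response_body: str) -> bool:
    """True only if there is at least one active target and every one is up."""
    health = target_health(response_body)
    return bool(health) and all(state == "up" for state in health.values())


if __name__ == "__main__":
    sample_body = json.dumps({"status": "success", "data": {"activeTargets": [
        {"scrapeUrl": "http://localhost:9090/metrics", "health": "up"},
        {"scrapeUrl": "http://192.168.182.131:9100/metrics", "health": "up"},
    ], "droppedTargets": []}})
    print(target_health(sample_body))
    print(all_up(sample_body))
```

Fetching the body from a running server would be a matter of calling urllib.request.urlopen on the server's /api/v1/targets URL and passing the decoded text to these functions.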

