Integrating Prometheus + Grafana into Spring Boot Microservices for Monitoring and Alerting

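Preface

Keywords: Prometheus; Grafana; Alertmanager; SpringBoot; SpringBoot Actuator; monitoring; alerting;

In the previous article, Spring Boot Actuator Module Explained: Health Checks, Measurements, Metrics Collection and Monitoring, we covered what the Spring Boot Actuator module does, how to configure it, and its most important endpoints.

As I mentioned there, my main goal is to add monitoring and alerting to all the microservice applications in our project. Introducing Spring Boot Actuator was only the first step. In this article I will cover:

How to integrate the Prometheus monitoring and alerting system and the Grafana graphical front end
How to define custom metrics and instrument the application
How Prometheus integrates with Alertmanager to send alerts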

image.png

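Theory

Prometheus

Prometheus (普羅米修斯 in Chinese) was inspired by Google's Borgmon monitoring system. Development started in 2012 at SoundCloud, led by former Google engineers, as open-source software, and version 1.0 was released in July 2016. Prometheus can be seen as an implementation of Google's internal monitoring system Borgmon.

The figure below shows the architecture of Prometheus and some of its ecosystem components. Of these, Alertmanager handles alerting and Grafana handles visualization of the monitoring data; both come up again later in this article.

[Figure: Prometheus architecture and ecosystem components]

For our purposes it is enough to know these features of Prometheus:

A data collector that scrapes metrics over HTTP at a configured interval.
A time-series database that stores all metric data.
A simple user interface where you can visualize, query and monitor all metrics.

For details, see the official Prometheus documentation.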

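Grafana

Grafana is an open-source application written in Go. It pulls data from many kinds of data sources, such as Elasticsearch, Prometheus, Graphite and InfluxDB, and visualizes it with polished charts.

Besides Prometheus's Alertmanager, Grafana also supports alerting. Grafana lets you attach alert rules directly to the panels that show the data, define thresholds visually, and receive notifications through channels such as DingTalk and email. Above all, alert rules are defined visually, evaluated continuously, and trigger notifications.

Because Grafana's alerting is comparatively weak, most alerts are handled by Prometheus Alertmanager.

Note that the Prometheus dashboard also has simple graphs, but Grafana's visualization is far better.

Further reading:

Official documentation

Grafana全面瓦解

Alertmanager

The Prometheus platform not only scrapes and stores data but also lets you define alerting rules. To turn those rules into actual notifications, however, it needs the Alertmanager component.

Alertmanager supports grouping (merging multiple alerts into a single notification), inhibition, and silencing (muting notifications for matching alerts during a given time window).

Further reading: the official introduction to Alertmanager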

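Monitoring Java applications

Monitoring models

Monitoring systems collect metrics in one of two ways: push or pull.

Typical push-based systems include ElasticSearch, InfluxDB and OpenTSDB. Your program pushes metrics to the monitoring application over TCP, UDP or similar. With TCP, if the monitoring application goes down or becomes a bottleneck, it can easily affect the application itself. With UDP you don't have to worry about the monitoring application, but data is easily lost.

The main representative of the pull model is Prometheus. With pull, the application doesn't need to care about the state of the monitoring system, and service discovery such as DNS-SRV or Consul can add new targets automatically.

How monitoring works

Prometheus monitors an application in a very simple way: the process only has to expose an HTTP endpoint that returns the current monitoring samples. Such a program is called an Exporter, and an instance of an Exporter is called a Target. Prometheus polls these Targets periodically to fetch samples. The data returned must follow a specific format, the Metrics format:

<metric name>{<label name>=<label value>, ...}

It consists of three parts.
Metric names and label names must match the regular expressions given below.

metric name: the name of the metric, reflecting what the sample measures; must match [a-zA-Z_:][a-zA-Z0-9_:]*
label name: a label captures one dimension of the sample; must match [a-zA-Z_][a-zA-Z0-9_]*
label value: the value of the label; no format restriction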
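As a small illustration of these naming rules, the sketch below checks candidate names against the classic patterns from the Prometheus data model, `[a-zA-Z_:][a-zA-Z0-9_:]*` for metric names and `[a-zA-Z_][a-zA-Z0-9_]*` for label names. The class and method names are made up for this example, and it only reflects the classic rules, not any relaxed naming that newer Prometheus versions may accept.

```java
import java.util.regex.Pattern;

public class MetricNameCheck {
    // Classic naming patterns from the Prometheus data model.
    private static final Pattern METRIC_NAME = Pattern.compile("[a-zA-Z_:][a-zA-Z0-9_:]*");
    private static final Pattern LABEL_NAME = Pattern.compile("[a-zA-Z_][a-zA-Z0-9_]*");

    /** Returns true if the whole string is a valid classic metric name. */
    public static boolean isValidMetricName(String name) {
        return METRIC_NAME.matcher(name).matches();
    }

    /** Returns true if the whole string is a valid, non-reserved label name. */
    public static boolean isValidLabelName(String name) {
        // Label names starting with "__" are reserved for internal use.
        return LABEL_NAME.matcher(name).matches() && !name.startsWith("__");
    }

    public static void main(String[] args) {
        System.out.println(isValidMetricName("jvm_memory_used_bytes")); // true
        System.out.println(isValidMetricName("order-request-count"));   // false: '-' is not allowed
        System.out.println(isValidLabelName("status"));                 // true
        System.out.println(isValidLabelName("0status"));                // false: starts with a digit
    }
}
```

Libraries such as Micrometer normalize names for you (for example, dots become underscores), so a check like this is mostly useful when reasoning about what will end up in Prometheus.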

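Note that label values should come from a bounded set (enumerated values) rather than unbounded ones such as user IDs or email addresses. Unbounded values consume a lot of memory and defeat the purpose of collecting metrics.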
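To make the pull model described in this section concrete, here is a minimal sketch of an application exposing one counter over HTTP in the text format, using only the JDK's built-in HTTP server. The metric name `demo_orders_total`, the label `service="demo"`, the path `/metrics` and port 8081 are all assumptions chosen for illustration. A real Spring Boot application would rely on Micrometer instead, as described later.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;

public class MiniExporter {
    // Hypothetical counter; business code would increment it.
    static final AtomicLong ORDERS = new AtomicLong();

    /** Renders the counter as HELP/TYPE comments followed by one sample line. */
    static String render() {
        return "# HELP demo_orders_total Number of orders received\n"
             + "# TYPE demo_orders_total counter\n"
             + "demo_orders_total{service=\"demo\"} " + ORDERS.get() + "\n";
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8081), 0);
        // Prometheus would be configured to scrape this path periodically.
        server.createContext("/metrics", exchange -> {
            byte[] body = render().getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "text/plain; version=0.0.4; charset=utf-8");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
    }
}
```

The application never contacts the monitoring system. It only answers when scraped, which is why a Prometheus outage does not affect it.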

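Micrometer

So far we have outlined how Prometheus monitoring works. How, then, does our Spring Boot application provide such an HTTP endpoint, with data in the Metrics format described above?

Recall that in Spring Boot Actuator Module Explained: Health Checks, Measurements, Metrics Collection and Monitoring I mentioned that the Actuator module can integrate with external monitoring systems, Prometheus among them. So how does Spring Boot Actuator connect a Spring Boot application to a monitoring system like Prometheus?

The bridge is Micrometer. Micrometer provides a common API for collecting performance data on the Java platform. The application only uses Micrometer's API to record metrics, and Micrometer takes care of adapting to the different monitoring systems.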

image.png

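Hands-on, part 1

Next we walk through a working demo, explaining as we go.

For creating the initial demo project, see Spring Boot Actuator Module Explained: Health Checks, Measurements, Metrics Collection and Monitoring.

The hands-on material is split into two parts. This part shows how to integrate the application with Prometheus and Grafana for metrics collection and visualization.

I. Add the dependency

To integrate a Spring Boot application with Prometheus, add the micrometer-registry-prometheus dependency.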

<!-- Micrometer Prometheus registry  -->
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

Once this dependency is added, Spring Boot auto-configures a PrometheusMeterRegistry and a CollectorRegistry, which collect and export metrics in a format Prometheus can scrape (the Metrics format described above).

All of this data is exposed at the Actuator /prometheus endpoint, which Prometheus scrapes periodically to fetch the metrics.
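One detail worth checking: in Spring Boot 2.x and later, most Actuator endpoints are not exposed over HTTP by default. The previous article's demo project may already handle this. If it does not, a setting along the following lines should make the endpoint reachable. The file name and the exact list of endpoints are assumptions for illustration.

```properties
# application.properties (assumed location)
# Expose the prometheus endpoint over HTTP alongside health and info.
management.endpoints.web.exposure.include=health,info,prometheus
```

If Spring Security protects the Actuator endpoints, Prometheus also needs credentials. The scrape configuration later in this article uses basic_auth for that.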

The Actuator `/prometheus` endpoint

Let's look at this endpoint more closely using our demo project. After adding the micrometer-registry-prometheus dependency, open http://localhost:8080/actuator/prometheus and you will see output like this:

# HELP jvm_buffer_total_capacity_bytes An estimate of the total capacity of the buffers in this pool
# TYPE jvm_buffer_total_capacity_bytes gauge
jvm_buffer_total_capacity_bytes{id="direct",} 90112.0
jvm_buffer_total_capacity_bytes{id="mapped",} 0.0
# HELP tomcat_sessions_expired_sessions_total  
# TYPE tomcat_sessions_expired_sessions_total counter
tomcat_sessions_expired_sessions_total 0.0
# HELP jvm_classes_unloaded_classes_total The total number of classes unloaded since the Java virtual machine has started execution
# TYPE jvm_classes_unloaded_classes_total counter
jvm_classes_unloaded_classes_total 1.0
# HELP jvm_buffer_count_buffers An estimate of the number of buffers in the pool
# TYPE jvm_buffer_count_buffers gauge
jvm_buffer_count_buffers{id="direct",} 11.0
jvm_buffer_count_buffers{id="mapped",} 0.0
# HELP system_cpu_usage The "recent cpu usage" for the whole system
# TYPE system_cpu_usage gauge
system_cpu_usage 0.0939447637893599
# HELP jvm_gc_max_data_size_bytes Max size of old generation memory pool
# TYPE jvm_gc_max_data_size_bytes gauge
jvm_gc_max_data_size_bytes 2.841116672E9

# ... many more lines omitted ...

As you can see, these are the application's metrics, organized in the Metrics format described above.

<metric name>{<label name>=<label value>, ...}

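II. Installing and configuring Prometheus

For installation, see the official documentation. It is short but thorough. You can install from the binary or use Docker, so I won't repeat the steps here.

Prometheus official website

Configuring Prometheus

Next, configure Prometheus to scrape the metrics from /actuator/prometheus of our demo project.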

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']
  # demo job
  -  job_name: 'springboot-actuator-prometheus-test' # job name
     metrics_path: '/actuator/prometheus' # path to scrape metrics from
     scrape_interval: 5s # scrape interval for this job
     basic_auth: # Spring Security basic auth 
       username: 'actuator'
       password: 'actuator'
     static_configs:
     - targets: ['10.60.45.113:8080'] # address of the instance; the default scheme is http

The part to focus on is this job:

  # demo job
  -  job_name: 'springboot-actuator-prometheus-test' # job name
     metrics_path: '/actuator/prometheus' # path to scrape metrics from
     scrape_interval: 5s # scrape interval for this job
     basic_auth: # Spring Security basic auth 
       username: 'actuator'
       password: 'actuator'
     static_configs:
     - targets: ['10.60.45.113:8080'] # address of the instance; the default scheme is http

Test

With the configuration in place, start Prometheus to try it out. If you use Docker, run the following command in the directory containing prometheus.yml:

docker run -d -p 9090:9090 \
    -v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml \
    prom/prometheus --config.file=/etc/prometheus/prometheus.yml

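Open http://ip:9090 and you will see the following screen:

image.png

Click Insert metric at cursor to choose a metric; click Graph to display it as a chart; click Execute to see a result like the one below:

image.png

You can also type PromQL into the input box for more advanced queries.

PromQL is Prometheus's own query language. It makes statistical analysis of the monitoring samples very convenient.

Hot-reloading the configuration

Note that the reload endpoint only works when Prometheus is started with the --web.enable-lifecycle flag, which the docker command above does not include.

curl -X POST http://ip:9090/-/reload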

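III. Installing and configuring Grafana

As you can see, Prometheus's built-in dashboard is rather "rudimentary", so we bring in Grafana for friendlier, more production-like visualization.

1. Start

$ docker run -d --name=grafana -p 3000:3000 grafana/grafana

2. Log in

Open http://ip:3000/login. The initial username/password is admin/admin, and you will be asked to change the password at first login.

3. Configure the data source

Under Configuration, click Add Data Source and you will see the following screen:

image.png

Choose Prometheus as the data source, fill in the address of the Prometheus server, and click Save & Test:

image.png

4. Create a monitoring dashboard

Click the + button in the navigation bar and choose Dashboard. You will see a screen like this:

image.png

Click Add Query to see a screen like this:

image.png

In the Metrics field, enter the metric to query. The available metrics are those exposed by the Spring Boot application's /actuator/prometheus endpoint, for example jvm_memory_used_bytes, jvm_threads_states_threads and jvm_threads_live_threads. Grafana offers helpful auto-completion, and PromQL lets you do more complex calculations such as aggregation, sums and averages. To draw multiple lines, click Add Query again.

Then click Visualization below to choose the visualization type and related options. I won't go into detail here; this is left for the reader to explore.

image.png

Then move on to General for the basic settings, which I also won't cover in detail:

image.png

5. The dashboard marketplace

By now you should know how to visualize a single metric. With many metrics, though, configuring each panel by hand gets tedious.

Are there ready-made, general-purpose dashboard templates?

Go to Grafana Labs - Dashboards and search by keyword to find the dashboard you want.

These existing dashboards are also a quick way to learn how panels are configured and how dashboards are used.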

image.png

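6. Importing a dashboard

Here are two dashboards I find useful:

JVM (Micrometer)
Spring Boot Statistics

One caveat: when I first imported it, it showed no data. If you hit the same problem, go to the dashboard settings and edit the Definition of the two variables $application and $instance under Variables.

I also recommend customizing these two dashboards, or combining panels from both.

Importing is simple. First pick a dashboard on Grafana Labs - Dashboards and note its ID:

image.png

Then, in Grafana, click the Import button:

image.png

Enter the ID, complete the configuration, and click Import: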

image.png

The result:

image.png

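Hands-on, part 2

This part covers how to define custom metrics (for example business data, also known as instrumentation) and how to use Alertmanager for alerting.

I. Custom (business) metrics

Sample requirement: an order service for which we monitor [real-time order amount] and [order failure rate over the last 10 minutes].

1. Create the Prometheus monitoring class `PrometheusCustomMonitor`

It defines three custom metrics:

requests_error_total: number of failed orders
order_request_count: total number of orders
order_amount_sum: order amount statistics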

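import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.DistributionSummary;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

import javax.annotation.PostConstruct; // jakarta.annotation.PostConstruct on Spring Boot 3+

@Component
public class PrometheusCustomMonitor {

    /**
     * Number of failed requests
     */
    private Counter requestErrorCount;

    /**
     * Number of orders placed
     */
    private Counter orderCount;

    /**
     * Order amount statistics
     */
    private DistributionSummary amountSum;

    private final MeterRegistry registry;

    @Autowired
    public PrometheusCustomMonitor(MeterRegistry registry) {
        this.registry = registry;
    }

    // Register the meters once the bean has been constructed.
    @PostConstruct
    private void init() {
        requestErrorCount = registry.counter("requests_error_total", "status", "error");
        orderCount = registry.counter("order_request_count", "order", "test-svc");
        amountSum = registry.summary("order_amount_sum", "orderAmount", "test-svc");
    }

    public Counter getRequestErrorCount() {
        return requestErrorCount;
    }

    public Counter getOrderCount() {
        return orderCount;
    }

    public DistributionSummary getAmountSum() {
        return amountSum;
    }
}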

2. Add the `/order` endpoint

When flag="1" an exception is thrown to simulate a failed order. The endpoint records order_request_count and order_amount_sum.

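import java.util.Random;

import javax.annotation.Resource; // jakarta.annotation.Resource on Spring Boot 3+

import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class TestController {

    @Resource
    private PrometheusCustomMonitor monitor;
    
    //....

    @RequestMapping("/order")
    public String order(@RequestParam(defaultValue = "0") String flag) throws Exception {
        // Count every order attempt
        monitor.getOrderCount().increment();
        if ("1".equals(flag)) {
            throw new Exception("Something went wrong");
        }
        Random random = new Random();
        int amount = random.nextInt(100);
        // Record the order amount
        monitor.getAmountSum().record(amount);
        return "Order placed, amount: " + amount;
    }
}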

PS: In a real project, prefer recording business metrics with AOP rather than putting metric calls into business code as I do in this demo.
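The idea behind the AOP advice above can be sketched without Spring by using a JDK dynamic proxy, which intercepts calls in a similar spirit. Everything here (the `OrderService` interface, the `metered` helper, the two `LongAdder` counters) is hypothetical and only illustrates keeping metric code out of the business method. In a Spring project you would more likely write an aspect and increment Micrometer counters inside it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.LongAdder;

public class MeteredProxyDemo {
    /** Hypothetical business interface; not part of the article's demo. */
    public interface OrderService {
        String placeOrder(boolean fail) throws Exception;
    }

    // Stand-ins for the order and error counters.
    static final LongAdder CALLS = new LongAdder();
    static final LongAdder ERRORS = new LongAdder();

    /** Wraps target so every call is counted without touching its code. */
    @SuppressWarnings("unchecked")
    static <T> T metered(T target, Class<T> iface) {
        InvocationHandler handler = (proxy, method, args) -> {
            CALLS.increment();
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                ERRORS.increment();
                throw e.getCause(); // rethrow the original business exception
            }
        };
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        // Business code contains no metric calls at all.
        OrderService plain = fail -> {
            if (fail) {
                throw new Exception("order failed");
            }
            return "ok";
        };
        OrderService service = metered(plain, OrderService.class);
        try {
            service.placeOrder(false);
            service.placeOrder(true);
        } catch (Exception ignored) {
            // expected for the failing call
        }
        System.out.println(CALLS.sum() + " calls, " + ERRORS.sum() + " errors");
    }
}
```

The business implementation stays free of monitoring code, and the counting logic lives in one place where it can be changed or removed.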

3. Add a global exception handler `GlobalExceptionHandler`

It counts failed orders in requests_error_total:

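import javax.annotation.Resource; // jakarta.annotation.Resource on Spring Boot 3+

import org.springframework.web.bind.annotation.ControllerAdvice;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.ResponseBody;

@ControllerAdvice
public class GlobalExceptionHandler {

    @Resource
    private PrometheusCustomMonitor monitor;

    // Any exception that escapes a controller is counted as a failed request.
    @ResponseBody
    @ExceptionHandler(value = Exception.class)
    public String handle(Exception e) {
        monitor.getRequestErrorCount().increment();
        return "error, message: " + e.getMessage();
    }
}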

Test:

Start the project and visit http://localhost:8080/order and http://localhost:8080/order?flag=1 to simulate successful and failed orders. Then open http://localhost:8080/actuator/prometheus. The custom metrics are now exposed by the /prometheus endpoint:

# HELP requests_error_total  
# TYPE requests_error_total counter
requests_error_total{application="springboot-actuator-prometheus-test",status="error",} 41.0
# HELP order_request_count_total  
# TYPE order_request_count_total counter
order_request_count_total{application="springboot-actuator-prometheus-test",order="test-svc",} 94.0
# HELP order_amount_sum  
# TYPE order_amount_sum summary
order_amount_sum_count{application="springboot-actuator-prometheus-test",orderAmount="test-svc",} 53.0
order_amount_sum_sum{application="springboot-actuator-prometheus-test",orderAmount="test-svc",} 2701.0
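The `application` label in this output is not set by any code shown in this article, so the demo project presumably configures a common tag somewhere. One way to get such a label, assuming Spring Boot 2.1 or later and an application name matching the output above, is the following properties fragment. The file name and the use of `spring.application.name` are assumptions.

```properties
# application.properties (assumed location)
spring.application.name=springboot-actuator-prometheus-test
# Adds application="<name>" as a common tag to every metric.
management.metrics.tags.application=${spring.application.name}
```

Dashboards that filter on an `application` variable, and the PromQL expressions later in this article, depend on this label being present.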

4. Add the corresponding panels in Grafana

I create a new dashboard for this demo; steps already covered above are skipped.

First, the order failure rate over the last 10 minutes:

sum(rate(requests_error_total{application="springboot-actuator-prometheus-test"}[10m])) / sum(rate(order_request_count_total{application="springboot-actuator-prometheus-test"}[10m])) * 100
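To see what this expression computes, here is a simplified worked example in Java. It treats rate() as the plain per-second increase between two counter readings, which ignores the counter-reset handling and window extrapolation that Prometheus actually performs. The counter readings are made-up numbers.

```java
public class FailureRate {
    /** Per-second increase of a counter between two samples taken `seconds` apart. */
    static double rate(double earlier, double later, double seconds) {
        return (later - earlier) / seconds;
    }

    /** Failure percentage: error rate divided by total request rate, times 100. */
    static double failurePercent(double errorRate, double totalRate) {
        return errorRate / totalRate * 100;
    }

    public static void main(String[] args) {
        double window = 600; // 10 minutes, in seconds
        // Hypothetical counter readings at the start and end of the window.
        double errorRate = rate(40, 55, window);   // 15 failed orders in 10 minutes
        double totalRate = rate(100, 250, window); // 150 orders in 10 minutes
        // 15 failures out of 150 orders is roughly 10 percent.
        System.out.println(failurePercent(errorRate, totalRate) + "%");
    }
}
```

Because both rates cover the same window, the window length cancels out and the ratio is just failed orders divided by total orders in that period.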

image.png

Next, the total order amount:

image.png

The final result:

image.png

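II. Adding alerts

Sample alert rules:

Whether the service is down

Whether the order failure rate over the last 10 minutes exceeds 10%

1. Deploy Alertmanager

We deploy from the binary package here.

The latest Alertmanager release can be downloaded from the Prometheus website at https://prometheus.io/download/.
After downloading and unpacking it, you will find a default alertmanager.yml configuration file. We add email settings to it: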

# Global settings
global:
  resolve_timeout: 5m
  smtp_smarthost: 'xxxxxx'
  smtp_from: 'xxxx@xx.com'
  smtp_auth_username: 'xxxx@xx.com'
  smtp_auth_password: 'XXXXXX'
# Routing
route:
  receiver: 'default-receiver' # root route: default receiver
  group_by: ['alertname'] # group alerts by these labels
  group_wait: 10s # wait before the first notification of a new group, so more related alerts can be batched into it
  group_interval: 1m  # interval between notifications for the same group
  repeat_interval: 1m
  routes: # child routes, selected by label matching
  - receiver: 'rhf-mail-receiver'
    group_wait: 10s
    match: # match on the custom label
      team: rhf
# Receivers
receivers:
- name: 'default-receiver'
  email_configs:
  - to: 'xxxx@xx.com'
- name: 'rhf-mail-receiver'
  email_configs:
  - to: 'xxxx@xx.com'

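Built-in notification integrations currently include email, instant messaging (e.g. Slack, HipChat), mobile push (e.g. Pushover) and incident-management tools (e.g. PagerDuty, OpsGenie, VictorOps). Alertmanager also supports webhooks, through which developers can add custom integrations such as DingTalk or WeCom.

Further reading on configuration:

Further reading 1

Further reading 2

Start-up

Alertmanager stores its data on local disk, under data/ by default, so make sure that directory exists and is writable before starting Alertmanager:

./alertmanager

You can also override settings with command-line flags at start-up: --config.file sets the path of the Alertmanager configuration file and --storage.path sets the data directory.

To check that it is running, open port 9093 after start-up: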

image.png

The Alerts menu shows the alerts Alertmanager has received. The Silences menu lets you create silences through the UI. The Status menu shows Alertmanager's configuration.

Hot-reloading the configuration

curl -X POST http://ip:9093/-/reload

2. Define alert rules

Create test-svc-alert-rule.yaml in the Prometheus directory with the following content:

groups:
- name: svc-alert-rule
  rules:
  - alert: svc-down # is the service down?
    expr: sum(up{job="springboot-actuator-prometheus-test"}) == 0
    for: 1m
    labels: # custom labels
      severity: critical
      team: rhf # our team name; matched by the match block in alertmanager.yml
    annotations:
      summary: "Order service is down, please check!"
  - alert: order-error-rate-high # is the order failure rate over the last 10 minutes above 10%?
    expr: sum(rate(requests_error_total{application="springboot-actuator-prometheus-test"}[10m])) / sum(rate(order_request_count_total{application="springboot-actuator-prometheus-test"}[10m])) > 0.1
    for: 1m
    labels:
      severity: major
      team: rhf
    annotations:
      summary: "Order service is responding abnormally!"
      description: "Order error rate over the last 10 minutes has exceeded 10% (current value: {{ $value }})"

In a real project, you can keep all alert rules in a rule directory and reference them as rule/*.yaml.

3. Configure Prometheus

In prometheus.yml, reference the test-svc-alert-rule.yaml rules and enable Alertmanager.

alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # Alertmanager listens on port 9093 by default
      - localhost:9093
rule_files:
  # the glob must match the extension of the rule file (test-svc-alert-rule.yaml)
  - /data/prometheus-stack/prometheus/rule/*.yaml

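4. Test

With the configuration done, hot-reload the Prometheus configuration and then try to trigger the alert conditions.

Service down: stop the test service manually

image.png

Order failures

image.png

Firing alerts are visible at http://ip:9093: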

image.png

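Summary

That concludes our look at monitoring and alerting for Spring Boot microservices. I hope you found it useful.

The source code is available on GitHub.

If this article helped you, please give it a like; that is my biggest motivation.

References

Grafana全面瓦解
Grafana official documentation
Prometheus official documentation
Prometheus Book
Prometheus unofficial Chinese handbook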
