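Beginner's Guide 03: Configuring Alarms

Introduction

After adopting SkyWalking, the UI gives a clear view of each trace. For developers, though, we mostly go looking at trace data only once a problem has already appeared. SkyWalking's alarm feature brings problems to our attention promptly instead.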

Alarms consist of two main parts: rules and webhooks.

Using Alarms

Rules

# [Optional] Default, match all services in this metrics
    include-names:
      - dubbox-provider
      - dubbox-consumer
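  • Threshold: the target value, for example 1000 ms for a response time or 90 for a success rate.
  • OP: the comparison operator: > (greater than), < (less than), = (equal to).
  • Period: the alarm check period, that is, the time window over which the rule is evaluated.
  • Count: the number of times the condition must match within the period before an alarm is raised.
  • Silence period: if an alarm fires at time A, it fires only once between A and A + sp. Most of us have been bombarded by an alarm we already knew about, so this setting is well worth having.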
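To make the interplay of these fields concrete, here is a minimal sketch of how a single rule could be evaluated. It is a simplified model written for this article, not SkyWalking's implementation: the class name AlarmRuleSketch, the method check, and the choice to count the period and the silence period in "checks" rather than minutes are all assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified, hypothetical model of one alarm rule; not SkyWalking's actual classes.
class AlarmRuleSketch {
    private final double threshold;   // target value, e.g. 1000 (ms)
    private final String op;          // ">", "<" or "="
    private final int period;         // how many recent checks the window keeps
    private final int count;          // matches required inside the window
    private final int silencePeriod;  // checks to stay quiet after firing
    private final Deque<Double> window = new ArrayDeque<>();
    private int silenceCountdown = 0;

    AlarmRuleSketch(double threshold, String op, int period, int count, int silencePeriod) {
        this.threshold = threshold;
        this.op = op;
        this.period = period;
        this.count = count;
        this.silencePeriod = silencePeriod;
    }

    // Applies OP to one sampled value.
    private boolean matches(double value) {
        switch (op) {
            case ">": return value > threshold;
            case "<": return value < threshold;
            case "=": return value == threshold;
            default: throw new IllegalArgumentException("Unknown op: " + op);
        }
    }

    /** Feeds one sampled value (one check) and reports whether an alarm fires. */
    boolean check(double value) {
        window.addLast(value);
        if (window.size() > period) {
            window.removeFirst(); // keep only the last `period` samples
        }
        boolean inSilence = silenceCountdown > 0;
        if (inSilence) {
            silenceCountdown--; // time passes whether or not the rule matches
        }
        long matched = window.stream().filter(this::matches).count();
        if (matched >= count && !inSilence) {
            silenceCountdown = silencePeriod; // suppress repeats for a while
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Response time > 1000 ms, window of 3 checks, 2 matches needed, silent for 2 checks.
        AlarmRuleSketch rule = new AlarmRuleSketch(1000, ">", 3, 2, 2);
        double[] samples = {1200, 800, 1500, 1600, 1700, 1800};
        for (double s : samples) {
            System.out.println(s + " ms -> fired=" + rule.check(s));
        }
    }
}
```

In this run the third sample fires the alarm, the next two are swallowed by the silence period even though the condition still holds, and the sixth fires again.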

The official distribution also ships a set of default alarm rules, so I won't go into them in detail here.

We provided a default alarm-setting.yml in our distribution only for convenience, which including following rules

  1. Service average response time over 1s in last 3 minutes.
  2. Service success rate lower than 80% in last 2 minutes.
  3. Service 90% response time is over 1s in last 3 minutes
  4. Service Instance average response time over 1s in last 2 minutes.
  5. Endpoint average response time over 1s in last 2 minutes.

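Webhook

The official documentation links to an article that introduces webhooks. In essence, a webhook is the callback mechanism we rely on in day-to-day alerting.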

Webhook requires the peer is a web container. The alarm message will send through HTTP post by application/json content type. The JSON format is based on List<org.apache.skywalking.oap.server.core.alarm.AlarmMessage> with following key information.

@Setter(AccessLevel.PUBLIC)
@Getter(AccessLevel.PUBLIC)
public class AlarmMessage {

    public static AlarmMessage NONE = new NoAlarm();

    private int scopeId;
    private String name;
    private int id0;
    private int id1;
    private String alarmMessage;
    private long startTime;

    private static class NoAlarm extends AlarmMessage {

    }
}

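AlarmMessage uses Lombok. Personally, I think open-source components should avoid Lombok: writing getters and setters by hand only costs a few extra lines, and what-you-see-is-what-you-get code is closer to how people actually read. Lombok is sugar that belongs in business application development.

Back to the topic. Below is the code that sends the alarm: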

public class WebhookCallback implements AlarmCallback {
  
    @Override
    public void doAlarm(List<AlarmMessage> alarmMessage) {
        if (remoteEndpoints.size() == 0) {
            return;
        }

        CloseableHttpClient httpClient = HttpClients.custom().build();
        try {
            remoteEndpoints.forEach(url -> {
                HttpPost post = new HttpPost(url);
                post.setConfig(requestConfig);
                post.setHeader("Accept", "application/json");
                post.setHeader("Content-type", "application/json");

                StringEntity entity = null;
                try {
                    entity = new StringEntity(gson.toJson(alarmMessage));
                    post.setEntity(entity);
                    CloseableHttpResponse httpResponse = httpClient.execute(post);
                    StatusLine statusLine = httpResponse.getStatusLine();
                    if (statusLine != null && statusLine.getStatusCode() != 200) {
                        logger.error("send alarm to " + url + " failure. Response code: " + statusLine.getStatusCode());
                    }
                } catch (UnsupportedEncodingException e) {
                    logger.error("Alarm to JSON error, " + e.getMessage(), e);
                } catch (ClientProtocolException e) {
                    logger.error("send alarm to " + url + " failure.", e);
                } catch (IOException e) {
                    logger.error("send alarm to " + url + " failure.", e);
                }
            });
        } finally {
            try {
                httpClient.close();
            } catch (IOException e) {
                logger.error(e.getMessage(), e);
            }
        }
    }

}
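For the receiving side, the following is a minimal sketch of an endpoint that could accept this POST, using only the HTTP server bundled with the JDK. The class name AlarmWebhookReceiver, the /alarm path, the port, and the lastBody field are all assumptions made for illustration. The handler answers with status 200 because the sender above logs any other status code as a failure.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical webhook receiver; the path, port and names are assumptions.
class AlarmWebhookReceiver {

    // Last JSON body received, kept only so the sketch is easy to inspect.
    static volatile String lastBody;

    /** Starts a server that accepts POSTed alarm JSON on /alarm. */
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/alarm", exchange -> {
            try {
                if (!"POST".equals(exchange.getRequestMethod())) {
                    exchange.sendResponseHeaders(405, -1); // method not allowed, no body
                    return;
                }
                try (InputStream in = exchange.getRequestBody()) {
                    // The body is a JSON array of alarm messages.
                    lastBody = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                }
                System.out.println("Received alarms: " + lastBody);
                exchange.sendResponseHeaders(200, -1); // 200 with an empty body
            } finally {
                exchange.close();
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        start(8080);
        System.out.println("Listening on http://localhost:8080/alarm");
    }
}
```

A real receiver would parse the JSON array and forward each entry to mail, chat, or an on-call system; this sketch only records and prints the raw body.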

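WebhookCallback#doAlarm is in turn triggered from org.apache.skywalking.oap.server.core.alarm.provider.AlarmCore#start, which uses a single-threaded scheduled executor that fires every 10 seconds after an initial 10-second delay: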

Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(() -> {}, 10, 10, TimeUnit.SECONDS);
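To see how scheduleAtFixedRate behaves, here is a small self-contained sketch with the delays shrunk to milliseconds. The class FixedRateSketch and the method runTicks are hypothetical names, and the latch stands in for the real alarm check, which is elided in the line above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical demo of a fixed-rate task; not SkyWalking code.
class FixedRateSketch {

    /**
     * Schedules a task with an initial delay equal to its period and
     * returns true if it ran `times` times before the timeout expired.
     */
    static boolean runTicks(int times, long periodMillis, long timeoutMillis)
            throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch latch = new CountDownLatch(times);
        try {
            // Same call shape as above: task, initial delay, period, time unit.
            scheduler.scheduleAtFixedRate(latch::countDown,
                    periodMillis, periodMillis, TimeUnit.MILLISECONDS);
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } finally {
            scheduler.shutdownNow(); // stop the background thread
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("ran 3 ticks in time: " + runTicks(3, 100, 2000));
    }
}
```

Because the executor has a single thread, a slow run delays the next one rather than overlapping with it, which is a reasonable property for a periodic alarm check.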

Result in the UI

I added a random sleep to the Dubbo provider, after which alarm messages showed up:


[Screenshot: alarm messages in the SkyWalking UI]

Official 6.x alarm documentation

https://github.com/apache/skywalking/blob/master/docs/en/setup/backend/backend-alarm.md
