A First Look at Jaeger, a Distributed Tracing Tool

Official documentation

Jaegertracing

Introduction to Jaeger

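Jaeger: open-source, end-to-end distributed tracing, used to monitor and troubleshoot transactions in complex distributed systems.
The figure below compares the commonly used open-source full-link tracing solutions. SkyWalking and Pinpoint are currently the more widely used, but Jaeger's client libraries cover more languages, C++ in particular, so I chose to test it this time.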


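Problems Jaeger solves

  • Distributed transaction monitoring
  • Performance and latency optimization
  • Root cause analysis
  • Service dependency analysis
  • Distributed context propagation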

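Jaeger architecture diagram

Jaeger components

  • Jaeger Agent: communicates with the client and reports the collected trace data to the Jaeger Collector
  • Jaeger Collector: writes the collected data to a database or another storage backend
  • Jaeger Query: serves queries against the trace data
  • Jaeger Ingester: a service that reads from a Kafka topic and writes to another storage backend (Cassandra, Elasticsearch)
  • Jaeger UI: handles user interaction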

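Jaeger ports

Agent
5775 UDP, accepts zipkin.thrift over the compact Thrift protocol (legacy clients only)
6831 UDP, accepts jaeger.thrift over the compact Thrift protocol
6832 UDP, accepts jaeger.thrift over the binary Thrift protocol
5778 HTTP, serves configuration to clients (for example sampling strategies)

Collector
14267 TCP, used by the agent to send spans in jaeger.thrift format (TChannel)
14250 TCP, used by the agent to send spans in proto format (gRPC)
14268 HTTP, accepts spans directly from clients
14269 HTTP, health check

Query
16686 HTTP, the Jaeger front end, the interface exposed to users
16687 HTTP, health check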
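
The port list above can be kept as a small lookup table when writing firewall rules or probes. The following is only a sketch: the names `JAEGER_PORTS`, `ports_for`, and `tcp_port_open` are hypothetical helpers introduced for illustration, and only the port numbers and protocols come from the list above.

```python
import socket

# Port -> (component, protocol), transcribed from the port list above.
JAEGER_PORTS = {
    5775: ("agent", "udp"),
    6831: ("agent", "udp"),
    6832: ("agent", "udp"),
    5778: ("agent", "http"),
    14267: ("collector", "tcp"),
    14250: ("collector", "tcp"),
    14268: ("collector", "http"),
    14269: ("collector", "http"),
    16686: ("query", "http"),
    16687: ("query", "http"),
}


def ports_for(component, protocol=None):
    """Return the sorted ports of one component, optionally filtered by protocol."""
    return sorted(
        port
        for port, (comp, proto) in JAEGER_PORTS.items()
        if comp == component and (protocol is None or proto == protocol)
    )


def tcp_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection succeeds. Not meaningful for the UDP ports."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(ports_for("agent", "udp"))   # [5775, 6831, 6832]
print(ports_for("collector"))      # [14250, 14267, 14268, 14269]
```

`tcp_port_open` only confirms that something is listening on the port. It does not verify that the listener is actually a Jaeger component.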

Deploying Jaeger

1. Create the namespace

[root@VM-0-123-centos jaeger]# kubectl create namespace jaeger 

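2. Deploy the Jaeger Operator
Jaeger Operator: the Jaeger Operator for Kubernetes simplifies deploying and running Jaeger on Kubernetes.
The Jaeger Operator is an implementation of a Kubernetes operator. An operator is a piece of software that reduces the operational complexity of running another piece of software. Technically, an operator is a method of packaging, deploying, and managing a Kubernetes application.
A Jaeger Operator version tracks one version of the Jaeger components (Query, Collector, Agent). When a new version of the Jaeger components is released, a new version of the operator is released that knows how to upgrade running instances of the previous version to the new one.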

[root@VM-0-123-centos jaeger]# kubectl create -n jaeger -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/crds/jaegertracing.io_jaegers_crd.yaml 
[root@VM-0-123-centos jaeger]# kubectl create -n jaeger -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/service_account.yaml
[root@VM-0-123-centos jaeger]# kubectl create -n jaeger -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/role.yaml
[root@VM-0-123-centos jaeger]# kubectl create -n jaeger -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/role_binding.yaml
[root@VM-0-123-centos jaeger]# kubectl create -n jaeger -f https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy/operator.yaml
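
If the five `kubectl create` commands above need to be repeated across clusters, they can be generated from a list. This is a minimal sketch, assuming `kubectl` is on the PATH and the manifest URLs shown above are still reachable; `build_commands` and `install` are hypothetical helper names, and `dry_run=True` only prints the commands.

```python
import subprocess

BASE = "https://raw.githubusercontent.com/jaegertracing/jaeger-operator/master/deploy"

# Same manifests, in the same order, as the commands above.
MANIFESTS = [
    "crds/jaegertracing.io_jaegers_crd.yaml",
    "service_account.yaml",
    "role.yaml",
    "role_binding.yaml",
    "operator.yaml",
]


def build_commands(namespace="jaeger", base=BASE):
    """Return one kubectl argv list per manifest."""
    return [
        ["kubectl", "create", "-n", namespace, "-f", f"{base}/{path}"]
        for path in MANIFESTS
    ]


def install(namespace="jaeger", dry_run=True):
    """Print the commands, or run them and stop at the first failure."""
    for argv in build_commands(namespace):
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


install(dry_run=True)
```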

Check the status

[root@VM-0-123-centos jaeger]# kubectl get all -n jaeger
NAME                                         READY   STATUS        RESTARTS   AGE
pod/jaeger-operator-6ff67bdd4b-4nffk         1/1     Running       0          14d
pod/simple-prod-collector-59fc47bf5c-h26mq   0/1     Terminating   0          9d

NAME                              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/jaeger-operator-metrics   ClusterIP   172.20.253.138   <none>        8383/TCP,8686/TCP   14d

NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/jaeger-operator   1/1     1            1           14d

NAME                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/jaeger-operator-6ff67bdd4b   1         1         1       14d

3. Create the Jaeger instance
Create a jaeger.yaml file that configures the ES cluster and limits the CPU and memory of the containers in Deployment/simple-prod-collector. The collector can scale up to a maximum of 10 pods.

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-prod
spec:
  strategy: production
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: http://10.0.16.3:9200
        index-prefix: zhjt
  collector:
    maxReplicas: 10
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
[root@VM-0-123-centos jaeger]# kubectl apply -f  jaeger.yaml  -n jaeger
jaeger.jaegertracing.io/simple-prod created
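
When the same instance definition is needed for several environments, the manifest can be generated rather than edited by hand. The sketch below builds the same fields as the YAML above and prints them as JSON, which `kubectl apply -f` also accepts. The function name `jaeger_cr` and its parameters are hypothetical; the field names and values are copied from the manifest above.

```python
import json


def jaeger_cr(name, es_url, index_prefix, max_replicas=10, cpu="500m", memory="512Mi"):
    """Build a production-strategy Jaeger custom resource as a plain dict."""
    return {
        "apiVersion": "jaegertracing.io/v1",
        "kind": "Jaeger",
        "metadata": {"name": name},
        "spec": {
            "strategy": "production",
            "storage": {
                "type": "elasticsearch",
                "options": {
                    "es": {"server-urls": es_url, "index-prefix": index_prefix}
                },
            },
            "collector": {
                "maxReplicas": max_replicas,
                "resources": {"limits": {"cpu": cpu, "memory": memory}},
            },
        },
    }


print(json.dumps(jaeger_cr("simple-prod", "http://10.0.16.3:9200", "zhjt"), indent=2))
```

Redirecting the output to a file such as jaeger.json and applying it with `kubectl apply -f jaeger.json -n jaeger` should be equivalent to applying the YAML version.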

List the Jaeger objects
Note: with the all-in-one example from the official site the status appears to be a normal Running. Here the status is Failed, but this does not affect usage.

[root@VM-0-123-centos jaeger]# kubectl get jaegers -n jaeger
NAME          STATUS   VERSION   STRATEGY     STORAGE         AGE
simple-prod   Failed   1.22.0    production   elasticsearch   9d
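
Because the STATUS column can read Failed even when the instance works, a script that watches it should treat the value as a hint to go and inspect the pods, not as proof of an outage. The sketch below parses the tabular output shown above. `parse_kubectl_table` and `needs_attention` are hypothetical helpers, and the parser assumes no cell contains spaces, which holds for this output.

```python
def parse_kubectl_table(text):
    """Parse whitespace-separated `kubectl get` output into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def needs_attention(rows, ok_status="Running"):
    """Return the names of rows whose STATUS differs from ok_status."""
    return [row["NAME"] for row in rows if row["STATUS"] != ok_status]


SAMPLE = """\
NAME          STATUS   VERSION   STRATEGY     STORAGE         AGE
simple-prod   Failed   1.22.0    production   elasticsearch   9d
"""

rows = parse_kubectl_table(SAMPLE)
print(rows[0]["STATUS"], rows[0]["STORAGE"])  # Failed elasticsearch
print(needs_attention(rows))                  # ['simple-prod']
```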

Get the pod names

[root@VM-0-123-centos jaeger]# kubectl get pods -l app.kubernetes.io/instance=simple-prod -n jaeger
NAME                                              READY   STATUS      RESTARTS   AGE
simple-prod-collector-59fc47bf5c-h26mq            1/1     Running     0          9d
simple-prod-query-85689b7bbd-g5jw9                2/2     Running     0          9d

Get the pod logs

[root@VM-0-123-centos jaeger]# kubectl  logs simple-prod-query-85689b7bbd-g5jw9 jaeger-agent  -n jaeger
2021/04/28 04:55:34 maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined
{"level":"info","ts":1619585734.2081811,"caller":"flags/service.go:117","msg":"Mounting metrics handler on admin server","route":"/metrics"}
{"level":"info","ts":1619585734.2082183,"caller":"flags/service.go:123","msg":"Mounting expvar handler on admin server","route":"/debug/vars"}
{"level":"info","ts":1619585734.2083232,"caller":"flags/admin.go:105","msg":"Mounting health check on admin server","route":"/"}
{"level":"info","ts":1619585734.2083883,"caller":"flags/admin.go:111","msg":"Starting admin HTTP server","http-addr":":14271"}
{"level":"info","ts":1619585734.2084124,"caller":"flags/admin.go:97","msg":"Admin server started","http.host-port":"[::]:14271","health-status":"unavailable"}
{"level":"info","ts":1619585734.2089527,"caller":"grpc/builder.go:70","msg":"Agent requested insecure grpc connection to collector(s)"}
{"level":"info","ts":1619585734.2089992,"caller":"grpc@v1.29.1/clientconn.go:243","msg":"parsed scheme: \"dns\"","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.21038,"caller":"command-line-arguments/main.go:84","msg":"Starting agent"}
{"level":"info","ts":1619585734.2104166,"caller":"healthcheck/handler.go:128","msg":"Health Check state change","status":"ready"}
{"level":"info","ts":1619585734.2108943,"caller":"grpc/builder.go:108","msg":"Checking connection to collector"}
{"level":"info","ts":1619585734.210908,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"IDLE"}
{"level":"info","ts":1619585734.211061,"caller":"app/agent.go:69","msg":"Starting jaeger-agent HTTP server","http-port":5778}
{"level":"info","ts":1619585734.3344934,"caller":"grpc@v1.29.1/resolver_conn_wrapper.go:143","msg":"ccResolverWrapper: sending update to cc: {[{172.20.0.88:14250  <nil> 0 <nil>}] <nil> <nil>}","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3345578,"caller":"grpc@v1.29.1/clientconn.go:667","msg":"ClientConn switching balancer to \"round_robin\"","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3345697,"caller":"grpc@v1.29.1/clientconn.go:682","msg":"Channel switches to new LB policy \"round_robin\"","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3346283,"caller":"grpc@v1.29.1/clientconn.go:1056","msg":"Subchannel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.33467,"caller":"grpc@v1.29.1/clientconn.go:1193","msg":"Subchannel picks a new address \"172.20.0.88:14250\" to connect","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.334736,"caller":"grpc@v1.29.1/clientconn.go:417","msg":"Channel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3347983,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"CONNECTING"}
{"level":"info","ts":1619585734.335669,"caller":"grpc@v1.29.1/clientconn.go:1056","msg":"Subchannel Connectivity change to READY","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3357751,"caller":"base/balancer.go:200","msg":"roundrobinPicker: newPicker called with info: {map[0xc0002f5ea0:{{172.20.0.88:14250  <nil> 0 <nil>}}]}","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.3357947,"caller":"grpc@v1.29.1/clientconn.go:417","msg":"Channel Connectivity change to READY","system":"grpc","grpc_log":true}
{"level":"info","ts":1619585734.335807,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"READY"}
{"level":"info","ts":1619592172.4516647,"caller":"grpc@v1.29.1/clientconn.go:1056","msg":"Subchannel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4517512,"caller":"grpc@v1.29.1/clientconn.go:1193","msg":"Subchannel picks a new address \"172.20.0.88:14250\" to connect","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4517596,"caller":"base/balancer.go:200","msg":"roundrobinPicker: newPicker called with info: {map[]}","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4517772,"caller":"grpc@v1.29.1/clientconn.go:417","msg":"Channel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4517884,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"CONNECTING"}
{"level":"warn","ts":1619592172.4523218,"caller":"grpc@v1.29.1/clientconn.go:1275","msg":"grpc: addrConn.createTransport failed to connect to {172.20.0.88:14250  <nil> 0 <nil>}. Err: connection error: desc = \"transport: Error while dialing dial tcp 172.20.0.88:14250: connect: connection refused\". Reconnecting...","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4523551,"caller":"grpc@v1.29.1/clientconn.go:1056","msg":"Subchannel Connectivity change to TRANSIENT_FAILURE","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.452386,"caller":"grpc@v1.29.1/clientconn.go:417","msg":"Channel Connectivity change to TRANSIENT_FAILURE","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.4523947,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"TRANSIENT_FAILURE"}
{"level":"info","ts":1619592172.6118224,"caller":"grpc@v1.29.1/resolver_conn_wrapper.go:143","msg":"ccResolverWrapper: sending update to cc: {[{172.20.0.178:14250  <nil> 0 <nil>}] <nil> <nil>}","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6118581,"caller":"grpc@v1.29.1/clientconn.go:1056","msg":"Subchannel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6118758,"caller":"grpc@v1.29.1/clientconn.go:1056","msg":"Subchannel Connectivity change to SHUTDOWN","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.611892,"caller":"grpc@v1.29.1/clientconn.go:417","msg":"Channel Connectivity change to CONNECTING","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6119003,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"CONNECTING"}
{"level":"info","ts":1619592172.6119049,"caller":"grpc@v1.29.1/clientconn.go:1193","msg":"Subchannel picks a new address \"172.20.0.178:14250\" to connect","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.612726,"caller":"grpc@v1.29.1/clientconn.go:1056","msg":"Subchannel Connectivity change to READY","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6127572,"caller":"base/balancer.go:200","msg":"roundrobinPicker: newPicker called with info: {map[0xc0003df970:{{172.20.0.178:14250  <nil> 0 <nil>}}]}","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6127682,"caller":"grpc@v1.29.1/clientconn.go:417","msg":"Channel Connectivity change to READY","system":"grpc","grpc_log":true}
{"level":"info","ts":1619592172.6127849,"caller":"grpc/builder.go:119","msg":"Agent collector connection state change","dialTarget":"dns:///simple-prod-collector-headless.jaeger.svc:14250","status":"READY"}
[root@VM-0-123-centos jaeger]# kubectl  logs simple-prod-query-85689b7bbd-g5jw9 jaeger-query   -n jaeger
2021/04/28 04:55:29 maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined
{"level":"info","ts":1619585729.8951077,"caller":"flags/service.go:117","msg":"Mounting metrics handler on admin server","route":"/metrics"}
{"level":"info","ts":1619585729.8951416,"caller":"flags/service.go:123","msg":"Mounting expvar handler on admin server","route":"/debug/vars"}
{"level":"info","ts":1619585729.8952546,"caller":"flags/admin.go:105","msg":"Mounting health check on admin server","route":"/"}
{"level":"info","ts":1619585729.8953054,"caller":"flags/admin.go:111","msg":"Starting admin HTTP server","http-addr":":16687"}
{"level":"info","ts":1619585729.8953238,"caller":"flags/admin.go:97","msg":"Admin server started","http.host-port":"[::]:16687","health-status":"unavailable"}
{"level":"info","ts":1619585729.9169888,"caller":"config/config.go:183","msg":"Elasticsearch detected","version":7}
{"level":"info","ts":1619585729.9174955,"caller":"app/static_handler.go:181","msg":"UI config path not provided, config file will not be watched"}
{"level":"info","ts":1619585729.9175768,"caller":"app/server.go:170","msg":"Query server started"}
{"level":"info","ts":1619585729.9175944,"caller":"healthcheck/handler.go:128","msg":"Health Check state change","status":"ready"}
{"level":"info","ts":1619585729.9176183,"caller":"app/server.go:249","msg":"Starting GRPC server","port":16685,"addr":":16685"}
{"level":"info","ts":1619585729.9176335,"caller":"app/server.go:230","msg":"Starting HTTP server","port":16686,"addr":":16686"}
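
Most of the log output above is informational. The one line that matters is the `warn` entry, where the agent briefly lost its connection to the collector. Since each record is a JSON object, a small filter can surface such entries. This is a sketch only: `parse_log_line` and `filter_logs` are hypothetical names, and the sample `warn` message is abbreviated from the log above.

```python
import json
from datetime import datetime, timezone


def parse_log_line(line):
    """Return the decoded JSON record, or None for non-JSON lines such as the maxprocs banner."""
    line = line.strip()
    if not line.startswith("{"):
        return None
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        return None


def filter_logs(lines, levels=("warn", "error")):
    """Yield one formatted string per record whose level is in `levels`."""
    for line in lines:
        record = parse_log_line(line)
        if record and record.get("level") in levels:
            ts = datetime.fromtimestamp(record["ts"], tz=timezone.utc)
            yield f'{ts.isoformat()} {record["level"]} {record.get("caller", "")} {record["msg"]}'


SAMPLE = [
    "2021/04/28 04:55:34 maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined",
    '{"level":"info","ts":1619585734.21038,"caller":"command-line-arguments/main.go:84","msg":"Starting agent"}',
    # Message abbreviated from the original warn line.
    '{"level":"warn","ts":1619592172.4523218,"caller":"grpc@v1.29.1/clientconn.go:1275","msg":"connection refused (abbreviated)"}',
]

for entry in filter_logs(SAMPLE):
    print(entry)
```

To use it on live output, pass `sys.stdin` to `filter_logs` and pipe `kubectl logs` into the script.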

4. View the Jaeger resources

[root@VM-0-123-centos jaeger]# kubectl get all -n jaeger
NAME                                                  READY   STATUS      RESTARTS   AGE
pod/jaeger-operator-6ff67bdd4b-4nffk                  1/1     Running     0          14d
pod/simple-prod-collector-59fc47bf5c-h26mq            1/1     Running     0          8d
pod/simple-prod-query-85689b7bbd-g5jw9                2/2     Running     0          8d
 
NAME                                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                  AGE
service/jaeger-operator-metrics          ClusterIP   172.20.253.138   <none>        8383/TCP,8686/TCP                        14d
service/simple-prod-collector            ClusterIP   172.20.255.184   <none>        9411/TCP,14250/TCP,14267/TCP,14268/TCP   8d
service/simple-prod-collector-headless   ClusterIP   None             <none>        9411/TCP,14250/TCP,14267/TCP,14268/TCP   8d
service/simple-prod-query                ClusterIP   172.20.254.102   <none>        16686/TCP                                8d
 
NAME                                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/jaeger-operator         1/1     1            1           14d
deployment.apps/simple-prod-collector   1/1     1            1           8d
deployment.apps/simple-prod-query       1/1     1            1           8d
 
NAME                                               DESIRED   CURRENT   READY   AGE
replicaset.apps/jaeger-operator-6ff67bdd4b         1         1         1       14d
replicaset.apps/simple-prod-collector-59fc47bf5c   1         1         1       8d
replicaset.apps/simple-prod-query-85689b7bbd       1         1         1       8d
 
NAME                                                        REFERENCE                          TARGETS             MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/simple-prod-collector   Deployment/simple-prod-collector   1457m/90, 137m/90   1         10        1          8d

If traffic is heavy and the load on ES needs to be reduced, a Kafka cluster can be placed in front of it. Modify jaeger.yaml as follows:

apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: simple-streaming
spec:
  strategy: streaming
  collector:
    options:
      kafka:
        producer:
          topic: jaeger-spans
          brokers: my-cluster-kafka-brokers.kafka:9092   # change to your Kafka address
  ingester:
    options:
      kafka:
        consumer:
          topic: jaeger-spans
          brokers: my-cluster-kafka-brokers.kafka:9092  # change to your Kafka address
      ingester:
        deadlockInterval: 5s
  storage:
    type: elasticsearch
    options:
      es:
        server-urls: http://elasticsearch:9200   # change to your ES address
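
A toy model may help show why a queue between the collector and the storage backend reduces pressure on ES. It is not Kafka code and uses no Kafka client. The function `simulate` and its parameters are hypothetical, and the numbers are made up for illustration. Collectors append spans to a buffer in bursts, while the ingester drains it at a bounded rate.

```python
from collections import deque


def simulate(arrivals, drain_rate):
    """arrivals[i] is the number of spans the collectors produce at tick i.

    The ingester writes at most drain_rate spans to storage per tick.
    Returns (writes per tick, backlog left in the buffer after each tick).
    """
    if drain_rate <= 0:
        raise ValueError("drain_rate must be positive")
    buffer = deque()
    writes, backlog = [], []
    tick = 0
    while tick < len(arrivals) or buffer:
        if tick < len(arrivals):
            buffer.extend(range(arrivals[tick]))  # collector -> topic
        written = 0
        while buffer and written < drain_rate:
            buffer.popleft()                      # ingester -> storage
            written += 1
        writes.append(written)
        backlog.append(len(buffer))
        tick += 1
    return writes, backlog


writes, backlog = simulate([100, 0, 0, 5], drain_rate=30)
print(writes)   # [30, 30, 30, 15]
print(backlog)  # [70, 40, 10, 0]
```

The burst of 100 spans reaches storage as at most 30 writes per tick. The cost is that spans become searchable later, and the buffer must be large enough to hold the backlog.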

5. Deploying the agent

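The agent is a proxy for the Jaeger client: the client sends the trace data it collects to the agent, and the agent forwards it to the collector. Because the client talks to the agent over UDP, the agent is usually deployed close to the client.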
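
UDP gives no acknowledgement and no retransmission, which is the reason for keeping the agent on the same host or in the same pod as the client. The sketch below demonstrates the fire-and-forget behavior on the loopback interface. It assumes nothing about Jaeger's wire format: the payload is a placeholder, not a Thrift-encoded span, and `send_udp` and `demo` are hypothetical names.

```python
import socket


def send_udp(payload, host, port):
    """Fire-and-forget: sendto returns once the OS accepts the datagram; nothing confirms delivery."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


def demo():
    # Stand-in for an agent: a UDP socket bound to an ephemeral loopback port.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)
        host, port = receiver.getsockname()
        send_udp(b"not-a-real-span", host, port)
        data, _ = receiver.recvfrom(65535)
        return data


print(demo())
```

If the receiver were not listening, `send_udp` would still return without error. Across a network, that would mean silently lost spans.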

The agent can be installed in several ways.

1). Docker installation

Download: jaegertracing/jaeger-agent Tags (docker.com)

docker run -d -p 6831:6831/udp -p 6832:6832/udp -p 5778:5778/tcp jaegertracing/jaeger-agent:1.12 --reporter.grpc.host-port=xx.xx.xx.xx:14250

2). Kubernetes installation, which has two variants

sidecar

daemonset

Reference: Operator for Kubernetes (Jaeger documentation, jaegertracing.io)

3). Binary installation

Download: Jaeger – Download Jaeger (jaegertracing.io)

nohup ./jaeger-agent --collector.host-port=xxxx:14267 1>1.log 2>2.log &
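
Whichever installation method is used, one way to confirm the agent is up is to query its HTTP port 5778. The sketch below assumes the agent exposes a `/sampling?service=<name>` endpoint on that port and that it returns JSON; verify this against the documentation for your agent version. `sampling_url` and `fetch_sampling_strategy` are hypothetical helper names, and the service name is a placeholder.

```python
import json
import urllib.parse
import urllib.request


def sampling_url(service, host="localhost", port=5778):
    """Build the URL for the agent's sampling endpoint (assumed path: /sampling)."""
    query = urllib.parse.urlencode({"service": service})
    return f"http://{host}:{port}/sampling?{query}"


def fetch_sampling_strategy(service, host="localhost", port=5778, timeout=2.0):
    """Fetch and decode the sampling strategy; raises URLError if the agent is unreachable."""
    with urllib.request.urlopen(sampling_url(service, host, port), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


print(sampling_url("my-service"))  # http://localhost:5778/sampling?service=my-service
```

A successful response also indicates that the agent can reach the collector, assuming the agent obtains its sampling strategies from the collector.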
