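A Progressive Deployment: Argo Rollouts

Introduction: Anyone familiar with k8s knows that Deployment currently supports only two strategies, RollingUpdate and Recreate. In real production environments, however, operations teams more often need canary releases and blue-green deployments. I had planned to build an enhanced version myself, but a search turned up Argo Rollouts, which matches what I had in mind, so there is no need to reinvent the wheel. This post is a hands-on look at Argo Rollouts.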

Overview

Argo Rollouts is a Kubernetes controller plus a set of CRDs that provides more powerful deployment capabilities, including canary releases, blue-green deployments, experimentation, and progressive delivery.

Supported features:

  • Blue-green deployments
  • Canary releases
  • Fine-grained, weighted traffic shifting
  • Automated rollback and promotion
  • Manual control
  • Customizable metric queries and KPI analysis
  • Ingress controller integration: NGINX, ALB
  • Service mesh integration: Istio, Linkerd, SMI
  • Metric provider integration: Prometheus, Wavefront, Kayenta, Web, Kubernetes Jobs

How it works:

Argo Rollouts works much like Deployment, but with stronger rollout strategies and traffic control. When spec.template changes, Argo Rollouts performs a rollout according to spec.strategy: it typically creates a new ReplicaSet and gradually scales down the pods of the previous ReplicaSet.

Installation

See the official installation documentation.

1. Install the argo-rollouts controller and CRDs

kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f https://raw.githubusercontent.com/argoproj/argo-rollouts/stable/manifests/install.yaml

2. Install the argo-rollouts kubectl plugin

curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64
chmod +x ./kubectl-argo-rollouts-linux-amd64
mv ./kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts

Usage

A canary release consists of two processes: Replica Shifting and Traffic Shifting.

Replica Shifting

Here we take the example from the official site to try out Replica Shifting.

1. Deploy a demo application

First create a Rollout CR and a Service that exposes it:

kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/basic/rollout.yaml
kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/basic/service.yaml

The Rollout CR: apart from apiVersion, kind, and strategy, it is the same as a Deployment. In fact, its source code largely reuses the Deployment data structures:

#cat rollout.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  replicas: 5
  strategy:
    canary:
      steps:
      - setWeight: 20
      - pause: {}
      - setWeight: 40
      - pause: {duration: 10}
      - setWeight: 60
      - pause: {duration: 10}
      - setWeight: 80
      - pause: {duration: 10}
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app: rollouts-demo
  template:
    metadata:
      labels:
        app: rollouts-demo
    spec:
      containers:
      - name: rollouts-demo
        image: argoproj/rollouts-demo:blue
        ports:
        - name: http
          containerPort: 8080
          protocol: TCP
        resources:
          requests:
            memory: 32Mi
            cpu: 5m

The Service that exposes it:

apiVersion: v1
kind: Service
metadata:
  name: rollouts-demo
spec:
  ports:
  - port: 80
    targetPort: http
    protocol: TCP
    name: http
  selector:
    app: rollouts-demo

You can check its status with the plugin that Argo Rollouts provides, which is quite pleasant to use:

kubectl argo rollouts get rollout rollouts-demo

[Figure: deploying the rollout]

2. Update the spec to trigger a rollout

Next, change the image in the spec to trigger a rollout:

kubectl argo rollouts set image rollouts-demo rollouts-demo=argoproj/rollouts-demo:yellow

The Rollout is expected to create a new ReplicaSet, gradually scaling up the new one while scaling down the old one. Check with the plugin:

# kubectl argo rollouts get rollout rollouts-demo --watch

[Figure: canary]

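As you can see, the Rollout created a new ReplicaSet, rollouts-demo-789746c88d, and moved pods from the old ReplicaSet to the new one. The ratio of new to old pods is 1:4, and the status is Paused, so no further pods are being upgraded. Why?

The reason lies in spec.strategy. The strategy defines steps for the upgrade, and because steps is a list, they run in order. The first step, setWeight: 20, means 20% of the pods are updated to the new version. The second step, pause: {}, pauses indefinitely until someone resumes the rollout manually through the plugin: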

  strategy:
    canary:
      steps:
      - setWeight: 20
      - pause: {}
      - setWeight: 40
      - pause: {duration: 10} # pause for 10s
      - setWeight: 60
      - pause: {duration: 10}
      - setWeight: 80
      - pause: {duration: 10}
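
To make the step list concrete, here is a minimal Python sketch that walks the steps above and reports how many of the 5 replicas would belong to the new ReplicaSet after each setWeight. The rounding rule (round up) and the helper names are assumptions for illustration only; the real controller also takes settings such as maxSurge and maxUnavailable into account.

```python
REPLICAS = 5

# The steps from the Rollout above, written as plain Python data.
STEPS = [
    {"setWeight": 20},
    {"pause": {}},                 # no duration: wait for a manual promote
    {"setWeight": 40},
    {"pause": {"duration": 10}},
    {"setWeight": 60},
    {"pause": {"duration": 10}},
    {"setWeight": 80},
    {"pause": {"duration": 10}},
]


def canary_replicas(replicas: int, weight: int) -> int:
    """Pods in the new ReplicaSet for a weight in percent (assumption: round up)."""
    return (replicas * weight + 99) // 100


def describe(step: dict, replicas: int = REPLICAS) -> str:
    """Summarize one step of the canary strategy in a single line."""
    if "setWeight" in step:
        new = canary_replicas(replicas, step["setWeight"])
        return f"setWeight {step['setWeight']}: {new} new / {replicas - new} old"
    duration = step["pause"].get("duration")
    if duration is None:
        return "pause: until promoted manually"
    return f"pause: {duration}s"


if __name__ == "__main__":
    for step in STEPS:
        print(describe(step))
```

Under these assumptions the first step yields 1 new and 4 old pods, which matches the 1:4 ratio seen in the plugin output.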

We use the promote command to move it to the next step:

# kubectl argo rollouts promote rollouts-demo

Check the result again: all pods now belong to the new ReplicaSet:

[Figure: finished]

Traffic Shifting

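The example above showed how Argo Rollouts controls Replica Shifting, but a proper canary process should include both Replica Shifting and Traffic Shifting.

Argo Rollouts currently integrates two kinds of traffic control, Ingress and Service Mesh. My test environment only has the NGINX ingress controller deployed, so I will use Ingress for the demo.

1. Deploy the resources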

First delete the previous example:

kubectl delete -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/basic/rollout.yaml
kubectl delete -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/basic/service.yaml

Then deploy the example from the official site:

kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/nginx/rollout.yaml
kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/nginx/services.yaml
kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/nginx/ingress.yaml

The files above deploy one Rollout, two Services, and one Ingress:

The Rollout uses canaryService and stableService to name the application's canary Service (rollouts-demo-canary) and its current-version Service (rollouts-demo-stable), respectively:

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: rollouts-demo
spec:
  replicas: 1
  strategy:
    canary:
      canaryService: rollouts-demo-canary
      stableService: rollouts-demo-stable
      trafficRouting:
        nginx:
          stableIngress: rollouts-demo-stable
      steps:
      - setWeight: 5
      - pause: {}
...

The Services rollouts-demo-canary and rollouts-demo-stable have identical content. The selector does not yet include a pod-template-hash; the Argo Rollouts controller updates it according to the actual ReplicaSet hash:

apiVersion: v1
kind: Service
metadata:
  name: rollouts-demo-canary
spec:
  ports:
  - port: 80
    targetPort: http
    protocol: TCP
    name: http
  selector:
    app: rollouts-demo
    # This selector will be updated with the pod-template-hash of the canary ReplicaSet. e.g.:
    # rollouts-pod-template-hash: 7bf84f9696

---
apiVersion: v1
kind: Service
metadata:
  name: rollouts-demo-stable
spec:
  ports:
  - port: 80
    targetPort: http
    protocol: TCP
    name: http
  selector:
    app: rollouts-demo
    # This selector will be updated with the pod-template-hash of the stable ReplicaSet. e.g.:
    # rollouts-pod-template-hash: 789746c88d
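
The selector update described above can be pictured with a small Python sketch. It only models the idea: given a Service selector and a ReplicaSet hash, return the selector with the rollouts-pod-template-hash label added. The function name is hypothetical, and the real controller patches the Service through the Kubernetes API rather than editing a dict.

```python
# Label key taken from the comments in the Service manifests above.
HASH_LABEL = "rollouts-pod-template-hash"


def pin_selector(selector: dict, pod_template_hash: str) -> dict:
    """Return a copy of the selector that matches only one ReplicaSet's pods."""
    pinned = dict(selector)          # leave the caller's selector untouched
    pinned[HASH_LABEL] = pod_template_hash
    return pinned


if __name__ == "__main__":
    base = {"app": "rollouts-demo"}
    # Hash values are the examples from the manifest comments.
    print("canary:", pin_selector(base, "7bf84f9696"))
    print("stable:", pin_selector(base, "789746c88d"))
```

Because both Services start from the same app label, the added hash is the only thing that separates canary pods from stable pods.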

The Ingress defines the rule: nginx forwards requests for the rollouts-demo.local host to the current-version Service (rollouts-demo-stable):

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: rollouts-demo-stable
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
  - host: rollouts-demo.local
    http:
      paths:
      - path: /
        backend:
          # Reference to a Service name, also specified in the Rollout spec.strategy.canary.stableService field
          serviceName: rollouts-demo-stable
          servicePort: 80

Based on the Ingress rollouts-demo-stable, the Rollout controller automatically creates an Ingress for canary traffic, named <ROLLOUT-NAME>-<INGRESS-NAME>-canary. So there is now an additional Ingress, rollouts-demo-rollouts-demo-stable-canary, which directs traffic to the canary Service (rollouts-demo-canary):

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  generation: 1
  name: rollouts-demo-rollouts-demo-stable-canary
  namespace: default
  ownerReferences:
  - apiVersion: argoproj.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Rollout
    name: rollouts-demo
    uid: 2d5b728b-2f71-4bf2-8283-323acf8ef573
spec:
  rules:
  - host: rollouts-demo.local
    http:
      paths:
      - backend:
          serviceName: rollouts-demo-canary
          servicePort: 80
        path: / 
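
The naming rule for the generated Ingress is easy to check by hand. A minimal Python sketch, assuming the name is a plain concatenation exactly as the <ROLLOUT-NAME>-<INGRESS-NAME>-canary pattern suggests (the function name is hypothetical):

```python
def canary_ingress_name(rollout_name: str, stable_ingress_name: str) -> str:
    """Build the canary Ingress name from the Rollout and stable Ingress names."""
    return f"{rollout_name}-{stable_ingress_name}-canary"


if __name__ == "__main__":
    print(canary_ingress_name("rollouts-demo", "rollouts-demo-stable"))
```

With the names used in this example, the result is the Ingress name shown in the manifest above.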

2. Trigger an update

kubectl argo rollouts set image rollouts-demo rollouts-demo=argoproj/rollouts-demo:yellow
kubectl argo rollouts get rollout rollouts-demo

You can see that SetWeight in the Rollout status is now 5:

[Figure: traffic]

Looking at the Ingress at the same time, two annotations have been added, nginx.ingress.kubernetes.io/canary and nginx.ingress.kubernetes.io/canary-weight:

# Current-version Ingress

# Canary Ingress
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "5"
  creationTimestamp: "2020-07-05T06:31:54Z"
  generation: 1
  name: rollouts-demo-rollouts-demo-stable-canary
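
What a canary-weight of 5 means in practice can be sketched by simulating the split. This is an illustrative Python model, not the ingress-nginx implementation: each request is sent to the canary with probability weight/100, and the function names are made up for this example.

```python
import random


def canary_annotations(weight: int) -> dict:
    """The two annotations shown above, for a given weight in percent."""
    return {
        "nginx.ingress.kubernetes.io/canary": "true",
        "nginx.ingress.kubernetes.io/canary-weight": str(weight),
    }


def simulate_split(weight: int, requests: int, seed: int = 0) -> float:
    """Fraction of simulated requests that land on the canary Service."""
    rng = random.Random(seed)        # seeded so repeated runs give the same result
    hits = sum(1 for _ in range(requests) if rng.random() < weight / 100)
    return hits / requests


if __name__ == "__main__":
    print(canary_annotations(5))
    print(f"canary share: {simulate_split(5, 100_000):.3f}")
```

Over many requests the canary share settles close to 0.05, while any single request goes either entirely to the canary or entirely to the stable Service.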

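A careful reader may also notice a small problem in the output above: it shows ActualWeight: 50, where the value should be 5 or 95, so I filed an issue with the community.

Summary

Argo Rollouts provides a more powerful Deployment, with canary and blue-green release features that suit operations work well. This post only briefly tried out its canary release feature.

Features not covered in this post include:

  1. Experiments, which can be added to the steps to check whether each step meets the user's expectations;
  2. Analysis, which collects metrics during a Rollout, including the time each step takes.

I also thought of one requirement that Argo Rollouts does not yet support:

For traffic shifting, a canary should be able to send a fixed set of users or URLs to the new version, which Argo Rollouts does not currently support.

Of course, this could be worked around by adding an Experiment that modifies the Ingress or SMI content.

Features aside, Argo Rollouts is also a good project to learn from at the source level: its structure is clear, which makes it a good way to learn how to write a controller and a plugin.
If the code is not very complicated, why doesn't k8s implement this itself? Perhaps k8s wants to encourage everyone to write more CRDs, haha!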
