Contents
Introduction to Calico
Communication modes between Calico nodes
Main components of the Calico network model
Example 1: Installing Calico
Installing and using the calicoctl command
Example 2: Switching to a BGP network
Introduction to Calico
Calico is a networking solution for connecting containers. Virtualization platforms such as OpenStack and Docker need to interconnect workloads while also isolating them, much as an Internet service exposes only port 80 or a public cloud separates its tenants. Most virtualization platforms build container networking on layer-2 isolation techniques, which have drawbacks: they depend on VLANs, bridges, and tunnels. Bridges add complexity, while VLAN isolation and tunnels consume more resources and place requirements on the physical environment, so the whole setup grows more complex as the network scales. The idea behind Calico is to treat each host as a router on the Internet: hosts synchronize routes with BGP and enforce access policy with iptables.
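Design idea: Calico does not use tunnels or NAT for forwarding. Instead, it turns all layer-2 and layer-3 traffic into layer-3 traffic and forwards it across hosts using the routes configured on each host.
Design advantages:
1. Better resource utilization
Layer-2 networking relies on broadcast messages, and broadcast overhead grows rapidly as hosts are added. Calico's layer-3 routing approach suppresses layer-2 broadcast entirely, which reduces resource overhead.
2. Scalability
Calico uses the same approach as the Internet. The Internet is larger than any data center, so Calico is inherently scalable as well.
3. Simpler and easier to debug
With no tunnels, the path between workloads is shorter and simpler and needs less configuration, which makes debugging on a host easier.
4. Fewer dependencies
Calico depends only on layer-3 reachability.
5. Adaptability
Because it has so few dependencies, Calico fits VM, container, white-box, and mixed environments.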
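Communication modes between Calico nodes
IPIP (works across subnets)
As the name suggests, IPIP wraps one IP packet inside another IP packet, that is, a tunnel that encapsulates the IP layer in the IP layer. In effect it behaves like a bridge built at the IP layer. An ordinary bridge works at the MAC layer and needs no IP at all, whereas IPIP uses the routing at both ends to build a point-to-point tunnel that connects two networks that otherwise cannot reach each other.
It is similar to VXLAN, but its encapsulation overhead is smaller, so it is somewhat more efficient, at the cost of weaker security.
VXLAN (works across subnets)
Works on the same principle as Flannel's VXLAN backend.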
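To make the overhead comparison concrete, here is a minimal sketch that computes the bytes each encapsulation adds and the MTU left for Pods. The header sizes are the standard ones for an IPv4 underlay without options and are an assumption of this sketch, not figures from the article; the function and constant names are made up for illustration.

```python
# Standard header sizes in bytes for an IPv4 underlay (assumed, not from the article).
OUTER_IPV4 = 20       # outer IPv4 header without options
UDP = 8               # VXLAN travels inside UDP
VXLAN = 8             # VXLAN header
INNER_ETHERNET = 14   # VXLAN carries the whole inner Ethernet frame

OVERHEAD = {
    "none": 0,
    "ipip": OUTER_IPV4,
    "vxlan": OUTER_IPV4 + UDP + VXLAN + INNER_ETHERNET,
}


def pod_mtu(underlay_mtu: int, mode: str) -> int:
    """MTU left for Pod traffic once the encapsulation headers are subtracted."""
    return underlay_mtu - OVERHEAD[mode]


if __name__ == "__main__":
    for mode in ("none", "ipip", "vxlan"):
        print(mode, OVERHEAD[mode], pod_mtu(1500, mode))
```

With a 1500-byte underlay, IPIP leaves 1480 bytes, which is the MTU that appears on the Pod interface in the capture later in this article.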
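BGP (nodes on the same layer-2 network)
The Border Gateway Protocol (BGP) is a core decentralized autonomous routing protocol of the Internet. It provides reachability between autonomous systems (AS) by maintaining a table of IP routes, or "prefixes", and is a path-vector routing protocol. BGP does not use the metrics of a traditional interior gateway protocol (IGP); it chooses routes based on paths, network policies, or rule sets, which is why it is better described as a path-vector protocol than as a conventional routing protocol. In hosting terms, BGP merges the multiple carrier lines that enter a data center (China Telecom, China Unicom, China Mobile, and so on) into one, giving multi-line access through a single IP. The advantage of a BGP data center is that a server needs only one IP address: the best route is chosen by the backbone routers from hop count and other metrics, without consuming any of the server's system resources.
In practice, the BGP networking that Calico provides is almost identical to Flannel's host-gw mode. Calico also forwards container packets using the routing table, but where Flannel relies on the flanneld process to maintain routes, Calico uses BGP to maintain the routing information for the whole cluster automatically.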
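The "path-based" decision described above can be sketched in a few lines. This is a deliberately reduced illustration: real BGP compares several attributes (local preference, origin, MED, and more) before and after AS-path length. The AS numbers, next hops, and the `best_path` helper are all hypothetical.

```python
from typing import List, Optional, Tuple

Route = Tuple[str, List[int]]  # (next hop, AS path for one prefix)


def best_path(local_as: int, candidates: List[Route]) -> Optional[Route]:
    """Pick a route using only loop prevention and shortest AS path."""
    # Loop prevention: a path that already contains our own AS is rejected.
    loop_free = [route for route in candidates if local_as not in route[1]]
    if not loop_free:
        return None
    # Simplification: the shortest AS path wins; min() keeps the first on ties.
    return min(loop_free, key=lambda route: len(route[1]))


if __name__ == "__main__":
    candidates = [
        ("10.0.0.1", [65001, 65002, 65003]),
        ("10.0.0.2", [65010, 65003]),
        ("10.0.0.3", [64512, 65003]),  # contains the local AS, so it is a loop
    ]
    print(best_path(64512, candidates))
```

Note that no link metric appears anywhere: the choice is driven by the path itself, which is the point the paragraph makes about BGP versus an IGP.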
Recommended deployment:
BGP + VXLAN
For BGP, the official recommendation uses 50 nodes as the boundary between deployment schemes for different cluster sizes.
Small networks: BGP peers in a one-to-one full mesh. Every node holds N-1 routes, which suits small networks; as the node count N grows, route table updates and the API server both come under heavy pressure. This resembles a mesh network topology.
Large networks: BGP route reflectors. One or more nodes are chosen as reflectors; every node sends its routes to the reflectors and learns all other routes from them. This suits large networks and resembles a star network topology.
Main components of the Calico network model:
- Felix: an agent process that runs on every host. It handles network interface management and monitoring, routing, ARP management, ACL management and synchronization, status reporting, and so on.
- etcd: a distributed key-value store that keeps the network metadata consistent and the Calico network state accurate. It can be shared with Kubernetes.
- BGP Client (BIRD): Calico deploys one BGP client on every host, implemented with BIRD. BIRD is a separate, actively developed project that implements many dynamic routing protocols such as BGP, OSPF, and RIP. Its role in Calico is to watch the routes that Felix injects on the host and advertise them over BGP to the other hosts, which gives the network its connectivity.
- BGP Route Reflector: in a large network, a full mesh of BGP clients alone limits scale, because connecting every pair of nodes takes on the order of N^2 connections. To solve this, BGP route reflectors can be used: every BGP client peers only with designated RR nodes and synchronizes routes through them, which greatly reduces the number of connections.
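The scaling argument for route reflectors can be checked with simple arithmetic. The sketch below counts BGP sessions for a full mesh and for a reflector layout; it assumes that reflectors also peer with each other, which is a common arrangement but not something the article specifies. The function names are made up for illustration.

```python
def full_mesh_sessions(nodes: int) -> int:
    """Every pair of nodes holds one session: N * (N - 1) / 2."""
    return nodes * (nodes - 1) // 2


def reflector_sessions(nodes: int, reflectors: int) -> int:
    """Each client peers with every reflector; reflectors peer among themselves."""
    if not 1 <= reflectors <= nodes:
        raise ValueError("reflectors must be between 1 and the node count")
    clients = nodes - reflectors
    return clients * reflectors + reflectors * (reflectors - 1) // 2


if __name__ == "__main__":
    for n, rr in ((4, 1), (50, 2), (100, 3)):
        print(n, full_mesh_sessions(n), reflector_sessions(n, rr))
```

For a four-node cluster the mesh needs 6 sessions and a single reflector needs 3; at 100 nodes the gap is 4950 against 294 with three reflectors.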
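Calico can run in two ways:
- The first runs calico/node standalone, outside the Kubernetes cluster, while calico/kube-controllers still has to run as a Pod on the cluster;
- The second configures Calico as a CNI plugin hosted entirely on the Kubernetes cluster, much like the hosted Flannel plugin deployed earlier.
For the second way, Calico provides online deployment manifests: separate ones for clusters of up to 50 nodes and for clusters of more than 50 nodes that use the Kubernetes API as the datastore, plus a dedicated one for use with a standalone etcd cluster. In all three manifests Calico enables an IPIP-tunnel overlay by default, so all traffic goes through IPIP tunnels instead of plain BGP routing. The options below are defined on the calico-node container in the Pod template of the DaemonSet/calico-node resource in the manifest.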
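Configuration options
Enables IPIP on IPv4 address pools and selects its type. Three values are supported:
Always (all traffic), CrossSubnet (cross-subnet traffic only), and Never
name: CALICO_IPV4POOL_IPIP
value: "Always"
Whether to enable VXLAN tunneling on IPv4 address pools. The values and their meaning are the same as for Flannel's VXLAN backend. When VXLAN is enabled for all traffic, the BGP network is no longer needed at all, and the related components are best disabled.
name: CALICO_IPV4POOL_VXLAN
value: "Never"
Note that the address pool Calico allocates from must agree with the Pod network defined for the Kubernetes cluster. The Pod network is normally the one passed to kubeadm init with the --pod-network-cidr option when the cluster is initialized, whereas Calico's default manifest uses 192.168.0.0/16 as the Pod network. Plan the address range when deploying the cluster and make the two match. In an environment that previously used Flannel's default 10.244.0.0/16 network, the definition in the manifest can be changed to a different network address; it is set on the calico-node container in the Pod template of the DaemonSet/calico-node resource.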
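A quick way to catch a mismatch between the two settings is to compare the CIDRs before applying the manifest. The sketch below uses Python's standard ipaddress module and treats "agree" as "the Calico pool falls within the cluster Pod network", following the comment in the manifest that the pool should fall within --cluster-cidr. The function name is made up for illustration.

```python
import ipaddress


def pool_matches_cluster(pool_cidr: str, pod_network_cidr: str) -> bool:
    """True when the Calico pool lies inside the cluster's Pod network."""
    pool = ipaddress.ip_network(pool_cidr)
    cluster = ipaddress.ip_network(pod_network_cidr)
    # subnet_of() raises TypeError across IP versions, so compare versions first.
    return pool.version == cluster.version and pool.subnet_of(cluster)


if __name__ == "__main__":
    # Calico's default pool against a cluster initialized for Flannel's default.
    print(pool_matches_cluster("192.168.0.0/16", "10.244.0.0/16"))
    # After editing CALICO_IPV4POOL_CIDR to the cluster's Pod network.
    print(pool_matches_cluster("10.244.0.0/16", "10.244.0.0/16"))
```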
Official documentation:
https://docs.projectcalico.org/getting-started/kubernetes/self-managed-onprem/onpremises
Example 1: Installing Calico
wget https://docs.projectcalico.org/manifests/calico.yaml
[root@k8s-master ~]# cd /etc/kubernetes/manifests/
[root@k8s-master manifests]# cat kube-controller-manager.yaml
...
System Info:
Machine ID: 32599e2a74704b2e95443e24ea15d4f6
System UUID: 34979a62-16de-4287-b149-2d4c2d8a70fb
Boot ID: f31de60e-4f89-4553-ba7a-99a46d049936
Kernel Version: 5.4.109-1.el7.elrepo.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://20.10.7
Kubelet Version: v1.19.9
Kube-Proxy Version: v1.19.9
PodCIDR: 10.244.1.0/24 #each node's address block is allocated by Kubernetes
PodCIDRs: 10.244.1.0/24
Non-terminated Pods: (17 in total)
[root@k8s-master ~]# kubectl describe node k8s-node1
System Info:
Machine ID: 32599e2a74704b2e95443e24ea15d4f6
System UUID: 34979a62-16de-4287-b149-2d4c2d8a70fb
Boot ID: f31de60e-4f89-4553-ba7a-99a46d049936
Kernel Version: 5.4.109-1.el7.elrepo.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://20.10.7
Kubelet Version: v1.19.9
Kube-Proxy Version: v1.19.9
PodCIDR: 10.244.1.0/24 #Pod IPs on every node are allocated by Kubernetes
PodCIDRs: 10.244.1.0/24
[root@k8s-master Network]# vim calico.yaml
...
"ipam": {
"type": "host-local",
"subnet": "usePodCidr" #allocate addresses from the Pod CIDR that Kubernetes assigns to the node
},
"policy": {
"type": "k8s"
},
...
- name: FELIX_WIREGUARDMTU
valueFrom:
configMapKeyRef:
name: calico-config
key: veth_mtu
# The default IPv4 pool to create on startup if none exists. Pod IPs will be
# chosen from this range. Changing this value after installation will have
# no effect. This should fall within `--cluster-cidr`.
- name: CALICO_IPV4POOL_CIDR
value: "10.244.0.0/16" #matches the 10.244.0.0/16 network used by Flannel earlier
- name: CALICO_IPV4POOL_BLOCK_SIZE #added line: changes the default block size
value: "24"
- name: USE_POD_CIDR #use the addresses Kubernetes allocates; otherwise Calico and Kubernetes hand out different ranges
value: "true"
- Install Calico
[root@k8s-master plugin]# kubectl delete -f kube-flannel.yml
podsecuritypolicy.policy "psp.flannel.unprivileged" deleted
clusterrole.rbac.authorization.k8s.io "flannel" deleted
clusterrolebinding.rbac.authorization.k8s.io "flannel" deleted
serviceaccount "flannel" deleted
configmap "kube-flannel-cfg" deleted
daemonset.apps "kube-flannel-ds" deleted
[root@k8s-master plugin]# kubectl apply -f calico.yaml
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/kubecontrollersconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
serviceaccount/calico-node created
deployment.apps/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
poddisruptionbudget.policy/calico-kube-controllers created
- Calico's component processes
[root@k8s-master ~]# ps aux|grep calico
root 10867 0.0 0.1 112816 2156 pts/1 S+ 13:51 0:00 grep --color=auto calico
root 20680 0.0 2.3 1215184 35216 ? Sl 10:27 0:06 calico-node -allocate-tunnel-addrs
root 20681 0.0 2.1 1215184 32672 ? Sl 10:27 0:06 calico-node -monitor-addresses
root 20682 2.4 3.3 1510624 51636 ? Sl 10:27 4:54 calico-node -felix
root 20683 0.0 2.3 1657832 35496 ? Sl 10:27 0:09 calico-node -confd
root 20686 0.0 2.0 1214928 31628 ? Sl 10:27 0:05 calico-node -monitor-token
- Because Calico is not using the Kubernetes IPAM to allocate IPs here, each node ends up with two address ranges: one allocated by Kubernetes and one allocated by Calico
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.54.2 0.0.0.0 UG 101 0 0 eth4
172.17.0.0 0.0.0.0 255.255.0.0 U 0 0 0 docker0
192.168.4.0 0.0.0.0 255.255.255.0 U 100 0 0 eth0
192.168.12.0 192.168.4.172 255.255.255.0 UG 0 0 0 tunl0 #routes through tunl0 are visible
192.168.51.0 192.168.4.173 255.255.255.0 UG 0 0 0 tunl0 #node blocks are no longer necessarily contiguous as before
192.168.54.0 0.0.0.0 255.255.255.0 U 101 0 0 eth4
192.168.113.0 192.168.4.171 255.255.255.0 UG 0 0 0 tunl0 #tunnel interface
192.168.237.0 0.0.0.0 255.255.255.0 U 0 0 0 *
192.168.237.1 0.0.0.0 255.255.255.255 UH 0 0 0 cali7c0fb624285
192.168.237.2 0.0.0.0 255.255.255.255 UH 0 0 0 caliedaf285d4ef
192.168.237.3 0.0.0.0 255.255.255.255 UH 0 0 0 cali854da94d42a
[root@k8s-master calico]# ip route list
default via 192.168.54.2 dev eth4 proto static metric 101
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.4.0/24 dev eth0 proto kernel scope link src 192.168.4.170 metric 100
192.168.12.0/24 via 192.168.4.172 dev tunl0 proto bird onlink #routes through tunl0 are visible
192.168.51.0/24 via 192.168.4.173 dev tunl0 proto bird onlink
192.168.54.0/24 dev eth4 proto kernel scope link src 192.168.54.170 metric 101
192.168.113.0/24 via 192.168.4.171 dev tunl0 proto bird onlink
blackhole 192.168.237.0/24 proto bird
192.168.237.1 dev cali7c0fb624285 scope link
192.168.237.2 dev caliedaf285d4ef scope link
192.168.237.3 dev cali854da94d42a scope link
- The entry 192.168.51.0/24 via 192.168.4.173 dev tunl0 proto bird onlink, together with the routing table below, shows that Calico allocates an address block to every node instead of using the node's own network address
[root@k8s-node1 ~]# ip route list
default via 192.168.54.2 dev eth4 proto static metric 101
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.4.0/24 dev eth0 proto kernel scope link src 192.168.4.171 metric 100
192.168.12.0/24 via 192.168.4.172 dev tunl0 proto bird onlink
192.168.51.0/24 via 192.168.4.173 dev tunl0 proto bird onlink
192.168.54.0/24 dev eth4 proto kernel scope link src 192.168.54.171 metric 101
blackhole 192.168.113.0/24 proto bird #blackhole route: this node's own block
192.168.237.0/24 via 192.168.4.170 dev tunl0 proto bird onlink
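How the kernel picks among these entries can be illustrated with a longest-prefix-match lookup. The sketch below loads a subset of the master's routing table shown earlier and resolves a few destinations with Python's ipaddress module; it is a simplified model that ignores metrics and scopes, and the `Route` type and `lookup` helper are made up for illustration.

```python
import ipaddress
from typing import List, NamedTuple


class Route(NamedTuple):
    prefix: str
    action: str  # human-readable next hop or device


# Subset of the routes listed on k8s-master above.
ROUTES: List[Route] = [
    Route("0.0.0.0/0", "via 192.168.54.2 dev eth4"),
    Route("192.168.4.0/24", "dev eth0"),
    Route("192.168.12.0/24", "via 192.168.4.172 dev tunl0"),
    Route("192.168.51.0/24", "via 192.168.4.173 dev tunl0"),
    Route("192.168.113.0/24", "via 192.168.4.171 dev tunl0"),
    Route("192.168.237.0/24", "blackhole"),
    Route("192.168.237.1/32", "dev cali7c0fb624285"),
]


def lookup(dest: str, routes: List[Route] = ROUTES) -> Route:
    """Return the matching route with the longest prefix."""
    addr = ipaddress.ip_address(dest)
    matches = [r for r in routes if addr in ipaddress.ip_network(r.prefix)]
    return max(matches, key=lambda r: ipaddress.ip_network(r.prefix).prefixlen)


if __name__ == "__main__":
    for dest in ("192.168.51.1", "192.168.237.1", "192.168.237.200", "8.8.8.8"):
        print(dest, "->", lookup(dest).action)
```

The /32 entry for a local Pod wins over the /24 blackhole, so only addresses in the node's own block that have no Pod behind them are dropped.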
- Check the current working mode
[root@k8s-master calico]# kubectl api-resources
NAME SHORTNAMES APIGROUP NAMESPACED KIND
bgpconfigurations crd.projectcalico.org false BGPConfiguration
bgppeers crd.projectcalico.org false BGPPeer
blockaffinities crd.projectcalico.org false BlockAffinity
clusterinformations crd.projectcalico.org false ClusterInformation
felixconfigurations crd.projectcalico.org false FelixConfiguration
globalnetworkpolicies crd.projectcalico.org false GlobalNetworkPolicy
globalnetworksets crd.projectcalico.org false GlobalNetworkSet
hostendpoints crd.projectcalico.org false HostEndpoint
ipamblocks crd.projectcalico.org false IPAMBlock
ipamconfigs crd.projectcalico.org false IPAMConfig
ipamhandles crd.projectcalico.org false IPAMHandle
ippools crd.projectcalico.org false IPPool #Calico address pools
kubecontrollersconfigurations crd.projectcalico.org false KubeControllersConfiguration
networkpolicies crd.projectcalico.org true NetworkPolicy
networksets crd.projectcalico.org true NetworkSet
[root@k8s-master calico]# kubectl get ippools -o yaml
....
spec:
blockSize: 24 #prefix length of each block
cidr: 192.168.0.0/16 #address pool
ipipMode: Always #the pool is currently in IPIP mode
natOutgoing: true
nodeSelector: all()
- Access test and packet capture
[root@k8s-master PodControl]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
deployment-demo-fb544c5d8-r7pc8 1/1 Running 0 8m3s 192.168.51.1 k8s-node3 <none> <none>
deployment-demo-fb544c5d8-splfr 1/1 Running 0 8m3s 192.168.12.1 k8s-node2 <none> <none>
[root@k8s-master PodControl]# kubectl exec deployment-demo-fb544c5d8-r7pc8 -it -- /bin/sh
[root@deployment-demo-fb544c5d8-r7pc8 /]# ifconfig
eth0 Link encap:Ethernet HWaddr 16:96:97:3F:F3:C5
inet addr:192.168.51.1 Bcast:192.168.51.1 Mask:255.255.255.255
UP BROADCAST RUNNING MULTICAST MTU:1480 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
[root@deployment-demo-fb544c5d8-r7pc8 /]# curl 192.168.12.1
iKubernetes demoapp v1.0 !! ClientIP: 192.168.51.1, ServerName: deployment-demo-fb544c5d8-splfr, ServerIP: 192.168.12.1!
[root@deployment-demo-fb544c5d8-r7pc8 /]# curl 192.168.12.1
iKubernetes demoapp v1.0 !! ClientIP: 192.168.51.1, ServerName: deployment-demo-fb544c5d8-splfr, ServerIP: 192.168.12.1!
[root@deployment-demo-fb544c5d8-r7pc8 /]# curl 192.168.12.1
[root@k8s-node2 ~]# tcpdump -i eth0 -nn ip host 192.168.4.172 and host 192.168.4.173
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:48:24.421003 IP 192.168.4.173 > 192.168.4.172: IP 192.168.51.1.33436 > 192.168.12.1.80: Flags [S], seq 3259804851, win 64800, options [mss 1440,sackOK,TS val 2008488248 ecr 0,nop,wscale 7], length 0 (ipip-proto-4)
11:48:24.421093 IP 192.168.4.172 > 192.168.4.173: IP 192.168.12.1.80 > 192.168.51.1.33436: Flags [S.], seq 3234480084, ack 3259804852, win 64260, options [mss 1440,sackOK,TS val 1053230437 ecr 2008488248,nop,wscale 7], length 0 (ipip-proto-4) #(ipip-proto-4) marks IPIP encapsulation
11:48:24.422305 IP 192.168.4.173 > 192.168.4.172: IP 192.168.51.1.33436 > 192.168.12.1.80: Flags [.], ack 1, win 507, options [nop,nop,TS val 2008488250 ecr 1053230437], length 0 (ipip-proto-4)
11:48:24.422308 IP 192.168.4.173 > 192.168.4.172: IP 192.168.51.1.33436 > 192.168.12.1.80: Flags [P.], seq 1:77, ack 1, win 507, options [nop,nop,TS val 2008488250 ecr 1053230437], length 76: HTTP: GET / HTTP/1.1 (ipip-proto-4)
11:48:24.422554 IP 192.168.4.172 > 192.168.4.173: IP 192.168.12.1.80 > 192.168.51.1.33436: Flags [.], ack 77, win 502, options [nop,nop,TS val 1053230439 ecr 2008488250], length 0 (ipip-proto-4)
11:48:24.431688 IP 192.168.4.172 > 192.168.4.173: IP 192.168.12.1.80 > 192.168.51.1.33436: Flags [P.], seq 1:18, ack 77, win 502, options [nop,nop,TS val 1053230447 ecr 2008488250], length 17: HTTP: HTTP/1.0 200 OK (ipip-proto-4)
11:48:24.432638 IP 192.168.4.172 > 192.168.4.173: IP 192.168.12.1.80 > 192.168.51.1.33436: Flags [FP.], seq 18:276, ack 77, win 502, options [nop,nop,TS val 1053230449 ecr 2008488250], length 258: HTTP (ipip-proto-4)
11:48:24.433660 IP 192.168.4.173 > 192.168.4.172: IP 192.168.51.1.33436 > 192.168.12.1.80: Flags [.], ack 18, win 507, options [nop,nop,TS val 2008488261 ecr 1053230447], length 0 (ipip-proto-4)
11:48:24.437531 IP 192.168.4.173 > 192.168.4.172: IP 192.168.51.1.33436 > 192.168.12.1.80: Flags [F.], seq 77, ack 277, win 505, options [nop,nop,TS val 2008488261 ecr 1053230449], length 0 (ipip-proto-4)
11:48:24.437775 IP 192.168.4.172 > 192.168.4.173: IP 192.168.12.1.80 > 192.168.51.1.33436: Flags [.], ack 78, win 502, options [nop,nop,TS val 1053230454 ecr 2008488261], length 0 (ipip-proto-4)
IP 192.168.4.172 > 192.168.4.173: IP 192.168.12.1.80 > 192.168.51.1.33436
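- The default is IPIP mode, so packets are encapsulated before being forwarded, much as with Flannel. The difference is that Flannel traffic first crosses the cni virtual bridge, whereas Calico routes it in the kernel (the routes are installed by BIRD, as the proto bird entries above show) and sends it straight out through tunl0. That removes one layer of switching compared with Flannel, so performance is somewhat better, but this is still not Calico's best mode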
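The structure seen in the capture, a node-to-node IP packet whose payload is the Pod-to-Pod IP packet, can be reproduced with a few lines of struct packing. This is an illustrative sketch only: the checksum is left at zero, no TCP segment is attached, and the helper names are made up. Protocol number 4 is the standard value for IPv4-in-IPv4, which tcpdump prints as ipip-proto-4.

```python
import socket
import struct

IPPROTO_IPIP = 4  # IPv4-in-IPv4
IPPROTO_TCP = 6
_HEADER = "!BBHHHBBH4s4s"  # fixed 20-byte IPv4 header, no options


def ipv4_header(src: str, dst: str, proto: int, payload_len: int) -> bytes:
    """Build a minimal IPv4 header (checksum left at 0 to keep the sketch short)."""
    return struct.pack(
        _HEADER,
        (4 << 4) | 5,       # version 4, header length 5 * 4 = 20 bytes
        0,                  # type of service
        20 + payload_len,   # total length
        0, 0,               # identification, flags and fragment offset
        64,                 # TTL
        proto,
        0,                  # checksum (not computed here)
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )


def parse_ipv4(packet: bytes):
    """Return (src, dst, protocol, payload) for an IPv4 packet."""
    fields = struct.unpack(_HEADER, packet[:20])
    header_len = (fields[0] & 0x0F) * 4
    return (socket.inet_ntoa(fields[8]), socket.inet_ntoa(fields[9]),
            fields[6], packet[header_len:])


def encapsulate(inner: bytes, node_src: str, node_dst: str) -> bytes:
    """Wrap an inner IP packet in an outer node-to-node IPIP header."""
    return ipv4_header(node_src, node_dst, IPPROTO_IPIP, len(inner)) + inner


if __name__ == "__main__":
    inner = ipv4_header("192.168.51.1", "192.168.12.1", IPPROTO_TCP, 0)
    outer = encapsulate(inner, "192.168.4.173", "192.168.4.172")
    src, dst, proto, payload = parse_ipv4(outer)
    print("outer:", src, dst, proto)
    print("inner:", parse_ipv4(payload)[:3])
```

The outer header carries the node addresses and the inner header carries the Pod addresses, matching the two address pairs on each tcpdump line above.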
Installing and using the calicoctl command
Two ways to install calicoctl
Method 1: calicoctl
https://docs.projectcalico.org/getting-started/clis/calicoctl/install
- calicoctl can be run in several ways. Common way 1: download the calicoctl binary and run it directly
[root@k8s-master ~]# curl -o calicoctl -O -L "https://github.com/projectcalico/calicoctl/releases/download/v3.20.0/calicoctl"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 615 100 615 0 0 498 0 0:00:01 0:00:01 --:--:-- 504
100 43.2M 100 43.2M 0 0 518k 0 0:01:25 0:01:25 --:--:-- 920k
[root@k8s-master ~]# mv calicoctl /usr/bin/
[root@k8s-master ~]# chmod +x /usr/bin/calicoctl
[root@k8s-master ~]# calicoctl --help
Usage:
calicoctl [options] <command> [<args>...]
create Create a resource by file, directory or stdin.
replace Replace a resource by file, directory or stdin.
apply Apply a resource by file, directory or stdin. This creates a resource
if it does not exist, and replaces a resource if it does exists.
patch Patch a pre-exisiting resource in place.
delete Delete a resource identified by file, directory, stdin or resource type and
name.
get Get a resource identified by file, directory, stdin or resource type and
name.
label Add or update labels of resources.
convert Convert config files between different API versions.
ipam IP address management.
node Calico node management.
version Display the version of this binary.
export Export the Calico datastore objects for migration
import Import the Calico datastore objects for migration
datastore Calico datastore management.
- Using the calicoctl command
- By default calicoctl loads credentials from the files under ~/.kube/. The location can also be set in a configuration file
[root@k8s-master calico]# mkdir -p /etc/calico/
[root@k8s-master calico]# cd /etc/calico/
[root@k8s-master calico]# cat calicoctl.cfg
apiVersion: projectcalico.org/v3
kind: CalicoAPIConfig
metadata:
spec:
datastoreType: "kubernetes"
kubeconfig: "/etc/kubernetes/admin.conf" #path to the kubeconfig file
[root@k8s-master calico]#
[root@k8s-master calico]# kubectl get ippools
NAME AGE
default-ipv4-ippool 23h
[root@k8s-master calico]# calicoctl get ippool #calicoctl can access Calico resources directly
NAME CIDR SELECTOR
default-ipv4-ippool 192.168.0.0/16 all()
[root@k8s-master calico]# calicoctl get ippool default-ipv4-ippool -o yaml
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
creationTimestamp: "2021-08-29T14:33:53Z"
name: default-ipv4-ippool
resourceVersion: "1305"
uid: c01d73f3-c0c9-4674-b27e-725a1eaa5717
spec:
blockSize: 24
cidr: 192.168.0.0/16
ipipMode: Always
natOutgoing: true
nodeSelector: all()
vxlanMode: Never
[root@k8s-master calico]# calicoctl ipam --help
Usage:
calicoctl [options] [<args>...]
Options:
-h --help Show this screen.
-c --config=<config> Path to the file containing connection
configuration in YAML or JSON format.
[default: /etc/calico/calicoctl.cfg]
--context=<context> The name of the kubeconfig context to use.
-a
-A --all-namespaces
--as=<AS_NUM>
--backend=(bird|gobgp|none)
--dryrun
--export
--felix-config=<CONFIG>
-f --filename=<FILENAME>
--force
--from-report=<REPORT>
--ignore-validation
--init-system
--ip6-autodetection-method=<IP6_AUTODETECTION_METHOD>
--ip6=<IP6>
--ip-autodetection-method=<IP_AUTODETECTION_METHOD>
...
[root@k8s-master calico]# calicoctl ipam show
+----------+----------------+-----------+------------+--------------+
| GROUPING | CIDR | IPS TOTAL | IPS IN USE | IPS FREE |
+----------+----------------+-----------+------------+--------------+
| IP Pool | 192.168.0.0/16 | 65536 | 9 (0%) | 65527 (100%) |
+----------+----------------+-----------+------------+--------------+
[root@k8s-master calico]# calicoctl ipam show --show-blocks #how many IPs are in use in each block
+----------+------------------+-----------+------------+--------------+
| GROUPING | CIDR | IPS TOTAL | IPS IN USE | IPS FREE |
+----------+------------------+-----------+------------+--------------+
| IP Pool | 192.168.0.0/16 | 65536 | 9 (0%) | 65527 (100%) |
| Block | 192.168.113.0/24 | 256 | 1 (0%) | 255 (100%) |
| Block | 192.168.12.0/24 | 256 | 2 (1%) | 254 (99%) |
| Block | 192.168.237.0/24 | 256 | 4 (2%) | 252 (98%) |
| Block | 192.168.51.0/24 | 256 | 2 (1%) | 254 (99%) |
+----------+------------------+-----------+------------+--------------+
[root@k8s-master calico]# calicoctl ipam show --show-config #show the IPAM configuration
+--------------------+-------+
| PROPERTY | VALUE |
+--------------------+-------+
| StrictAffinity | false |
| AutoAllocateBlocks | true |
| MaxBlocksPerHost | 0 |
+--------------------+-------+
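The totals in the ipam show tables follow directly from the pool CIDR and the block size. The sketch below reproduces that arithmetic with Python's ipaddress module; the helper names are made up, and it models only the address math, not how Calico decides which block goes to which node.

```python
import ipaddress


def pool_layout(pool_cidr: str, block_size: int):
    """Return (total IPs, number of blocks, IPs per block) for a pool."""
    pool = ipaddress.ip_network(pool_cidr)
    if not pool.prefixlen <= block_size <= pool.max_prefixlen:
        raise ValueError("block size must lie between the pool prefix and the address width")
    blocks = 2 ** (block_size - pool.prefixlen)
    ips_per_block = 2 ** (pool.max_prefixlen - block_size)
    return pool.num_addresses, blocks, ips_per_block


def block_of(ip: str, block_size: int) -> str:
    """Return the block (as CIDR text) that contains the given address."""
    return str(ipaddress.ip_network(f"{ip}/{block_size}", strict=False))


if __name__ == "__main__":
    print(pool_layout("192.168.0.0/16", 24))
    print(block_of("192.168.51.1", 24))
```

A /16 pool cut into /24 blocks gives 256 blocks of 256 addresses, 65536 in total, which matches the IPS TOTAL column for the pool and for each block.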
Method 2: run as a kubectl plugin
[root@k8s-master calico]# cp -p /usr/bin/calicoctl /usr/bin/kubectl-calico #a copy of the earlier binary under the new name is all it takes
[root@k8s-master calico]# kubectl calico
Usage:
kubectl-calico [options] <command> [<args>...]
Invalid option: ''. Use flag '--help' to read about a specific subcommand
[root@k8s-master calico]# kubectl calico get nodes #same as method 1, prefixed with kubectl
NAME
k8s-master
k8s-node1
k8s-node2
k8s-node3
[root@k8s-master calico]# kubectl calico ipam show
+----------+----------------+-----------+------------+--------------+
| GROUPING | CIDR | IPS TOTAL | IPS IN USE | IPS FREE |
+----------+----------------+-----------+------------+--------------+
| IP Pool | 192.168.0.0/16 | 65536 | 9 (0%) | 65527 (100%) |
+----------+----------------+-----------+------------+--------------+
Example 2: Switching to a BGP network
#Export the existing configuration and modify it
[root@k8s-master calico]# kubectl calico get ippool -o yaml > default-ipv4-ippool.yaml
[root@k8s-master calico]# cat default-ipv4-ippool.yaml
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
name: default-ipv4-ippool
spec:
blockSize: 24
cidr: 192.168.0.0/16
ipipMode: CrossSubnet #use IPIP only between nodes in different subnets; plain BGP routing within a subnet
natOutgoing: true
nodeSelector: all()
vxlanMode: Never #vxlanMode and ipipMode cannot both be enabled; at least one must be Never
#Different combinations of ipipMode and vxlanMode run Calico in pure BGP, IPIP, VXLAN, or mixed mode
#For example, ipipMode: Never with vxlanMode: Never is pure BGP; ipipMode: Never with vxlanMode: CrossSubnet is BGP + VXLAN
[root@k8s-master calico]# calicoctl apply -f default-ipv4-ippool.yaml
Successfully applied 1 'IPPool' resource(s)
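The mode combinations described in the comments can be written down as a small decision function, together with the CrossSubnet test of whether two nodes share a subnet. This is a sketch of the rules as stated in this article, not Calico's own code; the function names and the mode labels it returns are made up for illustration.

```python
import ipaddress

VALID_MODES = {"Always", "CrossSubnet", "Never"}


def describe_pool(ipip_mode: str, vxlan_mode: str) -> str:
    """Name the data path that a pair of IPPool settings selects."""
    if ipip_mode not in VALID_MODES or vxlan_mode not in VALID_MODES:
        raise ValueError("mode must be Always, CrossSubnet, or Never")
    if ipip_mode != "Never" and vxlan_mode != "Never":
        raise ValueError("ipipMode and vxlanMode cannot both be enabled")
    if ipip_mode == "Never" and vxlan_mode == "Never":
        return "pure BGP"
    if ipip_mode == "Always":
        return "IPIP"
    if vxlan_mode == "Always":
        return "VXLAN"
    return "BGP + IPIP" if ipip_mode == "CrossSubnet" else "BGP + VXLAN"


def needs_encapsulation(mode: str, local_node: str, remote_ip: str) -> bool:
    """Decide per destination node; local_node includes its prefix length."""
    if mode == "Never":
        return False
    if mode == "Always":
        return True
    local_net = ipaddress.ip_interface(local_node).network
    return ipaddress.ip_address(remote_ip) not in local_net


if __name__ == "__main__":
    print(describe_pool("CrossSubnet", "Never"))
    print(needs_encapsulation("CrossSubnet", "192.168.4.170/24", "192.168.4.172"))
```

All four nodes in this cluster sit in 192.168.4.0/24, so with CrossSubnet no traffic between them is encapsulated, which is why the tunl0 routes disappear in the next listing.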
#Look at the routes again: the tunl0 entries are gone and traffic leaves directly through the node network
[root@k8s-master calico]# ip route list
default via 192.168.54.2 dev eth4 proto static metric 101
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.4.0/24 dev eth0 proto kernel scope link src 192.168.4.170 metric 100
192.168.12.0/24 via 192.168.4.172 dev eth0 proto bird
192.168.51.0/24 via 192.168.4.173 dev eth0 proto bird #the tunl0 tunnel is no longer used
192.168.54.0/24 dev eth4 proto kernel scope link src 192.168.54.170 metric 101
192.168.113.0/24 via 192.168.4.171 dev eth0 proto bird
blackhole 192.168.237.0/24 proto bird
192.168.237.1 dev cali7c0fb624285 scope link
192.168.237.2 dev caliedaf285d4ef scope link
192.168.237.3 dev cali854da94d42a scope link
[root@k8s-master calico]#
Packet capture test
[root@k8s-master ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
deployment-demo-fb544c5d8-r7pc8 1/1 Running 0 10h 192.168.51.1 k8s-node3 <none> <none>
deployment-demo-fb544c5d8-splfr 1/1 Running 0 10h 192.168.12.1 k8s-node2 <none> <none>
#Access the Pod on node 2 from the Pod on node 3
[root@k8s-master calico]# kubectl exec deployment-demo-fb544c5d8-r7pc8 -it -- /bin/sh
[root@deployment-demo-fb544c5d8-r7pc8 /]# curl 192.168.12.1
iKubernetes demoapp v1.0 !! ClientIP: 192.168.51.1, ServerName: deployment-demo-fb544c5d8-splfr, ServerIP: 192.168.12.1!
[root@deployment-demo-fb544c5d8-r7pc8 /]# curl 192.168.12.1
iKubernetes demoapp v1.0 !! ClientIP: 192.168.51.1, ServerName: deployment-demo-fb544c5d8-splfr, ServerIP: 192.168.12.1!
#Capture on the Pod IPs directly: with no encapsulation the Pods talk IP to IP and there is no outer IP header
[root@k8s-node2 ~]# tcpdump -i eth0 -nn ip host 192.168.51.1 and host 192.168.12.1
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
22:11:54.704770 IP 192.168.51.1.33464 > 192.168.12.1.80: Flags [S], seq 4075444778, win 64800, options [mss 1440,sackOK,TS val 2045898534 ecr 0,nop,wscale 7], length 0
22:11:54.705866 IP 192.168.12.1.80 > 192.168.51.1.33464: Flags [S.], seq 402120893, ack 4075444779, win 64260, options [mss 1440,sackOK,TS val 1090640722 ecr 2045898534,nop,wscale 7], length 0
22:11:54.706670 IP 192.168.51.1.33464 > 192.168.12.1.80: Flags [.], ack 1, win 507, options [nop,nop,TS val 2045898537 ecr 1090640722], length 0
22:11:54.707077 IP 192.168.51.1.33464 > 192.168.12.1.80: Flags [P.], seq 1:77, ack 1, win 507, options [nop,nop,TS val 2045898537 ecr 1090640722], length 76: HTTP: GET / HTTP/1.1
22:11:54.707132 IP 192.168.12.1.80 > 192.168.51.1.33464: Flags [.], ack 77, win 502, options [nop,nop,TS val 1090640723 ecr 2045898537], length 0
22:11:54.737231 IP 192.168.12.1.80 > 192.168.51.1.33464: Flags [P.], seq 1:18, ack 77, win 502, options [nop,nop,TS val 1090640754 ecr 2045898537], length 17: HTTP: HTTP/1.0 200 OK
22:11:54.738439 IP 192.168.51.1.33464 > 192.168.12.1.80: Flags [.], ack 18, win 507, options [nop,nop,TS val 2045898568 ecr 1090640754], length 0
22:11:54.739117 IP 192.168.12.1.80 > 192.168.51.1.33464: Flags [P.], seq 18:155, ack 77, win 502, options [nop,nop,TS val 1090640755 ecr 2045898568], length 137: HTTP
22:11:54.739630 IP 192.168.12.1.80 > 192.168.51.1.33464: Flags [FP.], seq 155:276, ack 77, win 502, options [nop,nop,TS val 1090640756 ecr 2045898568], length 121: HTTP
22:11:54.739810 IP 192.168.51.1.33464 > 192.168.12.1.80: Flags [.], ack 155, win 506, options [nop,nop,TS val 2045898570 ecr 1090640755], length 0
[root@k8s-master calico]# calicoctl node status
Calico process is running.
IPv4 BGP status #BGP mode is now in use; the peers listed are the 3 nodes other than this one
+---------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+---------------+-------------------+-------+----------+-------------+
| 192.168.4.171 | node-to-node mesh | up | 02:27:59 | Established |
| 192.168.4.172 | node-to-node mesh | up | 02:27:58 | Established |
| 192.168.4.173 | node-to-node mesh | up | 02:27:58 | Established |
+---------------+-------------------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.
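- At this point a small cluster, for example fewer than 50 nodes, can be used as is
- For a large cluster, deploy route reflectors to avoid excessive route table updates and relieve pressure on the API server
#Configure the master as the reflector node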
[root@k8s-master calico]# cat reflector-node.yaml
apiVersion: projectcalico.org/v3
kind: Node
metadata:
labels:
route-reflector: "true" #label values must be strings, hence the quotes
name: k8s-master #node name
spec:
bgp:
ipv4Address: 192.168.4.170/24 #master IP
ipv4IPIPTunnelAddr: 192.168.237.0 #tunl0 address
routeReflectorClusterID: 1.1.1.1 #cluster ID; separate reflector clusters must not share an ID
[root@k8s-master calico]# calicoctl apply -f reflector-node.yaml
Successfully applied 1 'Node' resource(s)
- Configure all nodes to peer with the reflector node
[root@k8s-master calico]# cat bgppeer-demo.yaml
kind: BGPPeer
apiVersion: projectcalico.org/v3
metadata:
name: bgppeer-demo
spec:
nodeSelector: all() #all nodes
peerSelector: route-reflector=="true" #peer with the nodes that carry this label
[root@k8s-master calico]# calicoctl apply -f bgppeer-demo.yaml
Successfully applied 1 'BGPPeer' resource(s)
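Which sessions this BGPPeer produces can be worked out by evaluating the two selectors against the node labels. The sketch below handles only the two selector forms used in this manifest, all() and key=="value"; Calico's real selector language is richer. The node names come from this cluster, while the `matches` and `peerings` helpers are made up for illustration.

```python
from typing import Dict, List, Tuple

# Labels as set in this article: only the master carries the reflector label.
NODES: Dict[str, Dict[str, str]] = {
    "k8s-master": {"route-reflector": "true"},
    "k8s-node1": {},
    "k8s-node2": {},
    "k8s-node3": {},
}


def matches(selector: str, labels: Dict[str, str]) -> bool:
    """Evaluate all() or a single key=="value" comparison."""
    selector = selector.strip()
    if selector == "all()":
        return True
    key, sep, value = selector.partition("==")
    if not sep:
        raise ValueError("unsupported selector: " + selector)
    return labels.get(key.strip()) == value.strip().strip('"')


def peerings(node_selector: str, peer_selector: str,
             nodes: Dict[str, Dict[str, str]]) -> List[Tuple[str, str]]:
    """List (node, peer) pairs; a node never peers with itself."""
    pairs = []
    for node, node_labels in sorted(nodes.items()):
        if not matches(node_selector, node_labels):
            continue
        for peer, peer_labels in sorted(nodes.items()):
            if peer != node and matches(peer_selector, peer_labels):
                pairs.append((node, peer))
    return pairs


if __name__ == "__main__":
    for pair in peerings("all()", 'route-reflector=="true"', NODES):
        print(pair)
```

The result is three sessions, one from each worker node to k8s-master, which corresponds to the three node specific entries in the status output that follows.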
[root@k8s-master calico]# calicoctl node status
Calico process is running.
IPv4 BGP status
+---------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+---------------+-------------------+-------+----------+-------------+
| 192.168.4.171 | node-to-node mesh | up | 02:27:59 | Established |#the earlier mesh sessions are still there
| 192.168.4.172 | node-to-node mesh | up | 02:27:58 | Established |
| 192.168.4.173 | node-to-node mesh | up | 02:27:58 | Established |
| 192.168.4.171 | node specific | start | 14:36:40 | Idle |#sessions set up through the reflector
| 192.168.4.172 | node specific | start | 14:36:40 | Idle |
| 192.168.4.173 | node specific | start | 14:36:40 | Idle |
+---------------+-------------------+-------+----------+-------------+
IPv6 BGP status
#Turn off the node-to-node mesh
[root@k8s-master calico]# cat default-bgpconfiguration.yaml
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
name: default
spec:
logSeverityScreen: Info
nodeToNodeMeshEnabled: false #whether the node-to-node mesh is enabled
asNumber: 63400
[root@k8s-master calico]# calicoctl apply -f default-bgpconfiguration.yaml
Successfully applied 1 'BGPConfiguration' resource(s)
[root@k8s-master calico]# calicoctl node status
Calico process is running.
IPv4 BGP status
+---------------+---------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+---------------+---------------+-------+----------+-------------+
| 192.168.4.171 | node specific | up | 14:45:26 | Established |
| 192.168.4.172 | node specific | up | 14:45:26 | Established |
| 192.168.4.173 | node specific | up | 14:45:26 | Established |
+---------------+---------------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.
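When the check has to be repeated after each change, the status table can be parsed rather than read by eye. The sketch below is a plain text parser for the table layout shown above; it assumes that layout stays as printed here, and the helper names are made up. It does not call calicoctl itself.

```python
from typing import Dict, List

COLUMNS = ("address", "type", "state", "since", "info")


def parse_peers(table: str) -> List[Dict[str, str]]:
    """Turn the rows of a 'calicoctl node status' peer table into dicts."""
    peers = []
    for line in table.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders and surrounding text
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if cells[0] == "PEER ADDRESS":
            continue  # skip the header row
        peers.append(dict(zip(COLUMNS, cells)))
    return peers


def all_established(table: str) -> bool:
    """True when at least one peer exists and every peer is Established."""
    peers = parse_peers(table)
    return bool(peers) and all(p["info"] == "Established" for p in peers)


if __name__ == "__main__":
    sample = """
    +---------------+---------------+-------+----------+-------------+
    | PEER ADDRESS  |   PEER TYPE   | STATE |  SINCE   |    INFO     |
    +---------------+---------------+-------+----------+-------------+
    | 192.168.4.171 | node specific | up    | 14:45:26 | Established |
    | 192.168.4.172 | node specific | up    | 14:45:26 | Established |
    | 192.168.4.173 | node specific | up    | 14:45:26 | Established |
    +---------------+---------------+-------+----------+-------------+
    """
    print(len(parse_peers(sample)), all_established(sample))
```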