1. Overview
The overall design is based mainly on the Ant Financial article 深度 | 螞蟻金服自動化運維大規(guī)模 Kubernetes 集群的實踐之路 ("In Depth: Ant Financial's Road to Automated Operation of Large-Scale Kubernetes Clusters").
The operators developed here are built on operator-sdk.
Goal: use a meta cluster to bring business clusters online and offline quickly and to manage their lifecycle, for example rapidly creating and reclaiming clusters for tenants on public cloud.
2. Overall Architecture

3. Main Workflow
- Meta-cluster setup (platform deployment operation)
As the supporting system for the whole container cloud, the meta cluster sits in the support zone. It is set up either with kubeadm or by installing Kubernetes manually, and Master-Operator and Node-Operator are then registered into it. Master covers etcd, apiserver, scheduler, and controller-manager; Node covers kubelet and docker (or another container runtime).
The node certificates are also written into ConfigMaps, so that when joining a node, Node-Operator can fetch them from the cluster and distribute them to the node being added; a config center would work as well. In fact the CA certificate is already in the cluster by default, for example in the configmap extension-apiserver-authentication.
kubectl -n kube-system create configmap ca --from-file=/etc/kubernetes/ca.pem
kubectl -n kube-system create configmap bootstrap --from-file=/etc/kubernetes/bootstrap.conf
- Host provisioning or configuration (tenant operation)
If there is an underlying IaaS cloud management platform, hosts can be created through its API according to the configuration; without one, the hosts are configured directly.
Certificate generation: based on the provided cluster information (hosts, POD_CIDR, SERVICE_CIDR, etc.), generate the cluster certificates and write them into the config center (Nacos) or into ConfigMaps in the meta cluster. Note that a certificate bundle can exceed 1MB, and values stored in etcd are recommended to stay under 1MB, but certificates are usually small enough that this is not a problem. The certificate generation tool here is registered into the meta cluster as a serverless function built on kubeless; use whatever fits your needs (see the serverless write-up for details).
- Use the meta cluster's Node-Operator to join the control-plane nodes to the meta cluster as its nodes
When joining nodes to the meta cluster, kubelet needs default labels so that deployments can later select nodes by label, e.g. --node-labels=kubelet.kubernetes.io/tenant=<tenant-id>,kubelet.kubernetes.io/cluster=<cluster-name>. Also, the certificates kubelet needs were stored in ConfigMaps during meta-cluster setup; Node-Operator (which internally relies on Ansible) now distributes them to kubelet:
bootstrap := &corev1.ConfigMap{}
err = r.client.Get(context.TODO(), types.NamespacedName{Name: "bootstrap", Namespace: "kube-system"}, bootstrap)
if err == nil {
	ioutil.WriteFile("/etc/kubernetes/bootstrap.conf", []byte(bootstrap.Data["bootstrap.conf"]), os.ModePerm)
}
ca := &corev1.ConfigMap{}
err = r.client.Get(context.TODO(), types.NamespacedName{Name: "ca", Namespace: "kube-system"}, ca)
if err == nil {
	ioutil.WriteFile("/etc/kubernetes/ca.pem", []byte(ca.Data["ca.pem"]), os.ModePerm)
}
Deploy the remaining containerized components, such as kube-proxy, monitoring, and metrics/log collection.
- Use Master-Operator to deploy the Master components onto nodes with the matching labels (tenant and cluster name)
Etcd: implemented as a Deployment with a node selector, running with HostNetwork. An init container is added to InitContainers that fetches the generated cluster certificates from the meta cluster (or the config center) based on tenant and cluster name, including the business-cluster node certificates bootstrap.conf and ca.pem, and mounts them under /etc/kubernetes/pki on the host.
func initContainers() []corev1.Container {
	return []corev1.Container{
		{
			Name:            "certs",
			Image:           "cloud.org/cert:latest",
			ImagePullPolicy: corev1.PullIfNotPresent,
			VolumeMounts: []corev1.VolumeMount{
				{
					Name:      "cert",
					MountPath: "/etc/kubernetes/pki",
				},
			},
		},
	}
}
For disaster recovery, etcd can later be run through an operator; CoreOS already provides one.
Apiserver: implemented as a Deployment with a node selector, running with HostNetwork
scheduler: implemented as a Deployment with a node selector, running with HostNetwork
controller-manager: implemented as a Deployment with a node selector, running with HostNetwork
Once these components are up, two more actions are required:
① write the certificates that business-cluster nodes will need into a configmap in the business cluster
② register Node-Operator into the business cluster
- Use the business cluster's Node-Operator to join worker nodes to the business cluster
4. Implementation
4.1. Development Environment
- Environment setup (go, git)
export GOROOT=/data/go
export GOPATH=/data/work/go
export PATH=$PATH:$GOROOT/bin:$GOPATH/bin
#enable go modules
export GO111MODULE=on
#use a proxy for faster package downloads
export GOPROXY=https://goproxy.cn
- Install operator-sdk
#/data/work/go/src
[root@node1 src]# wget https://github.com/operator-framework/operator-sdk/releases/download/v0.11.0/operator-sdk-v0.11.0-x86_64-linux-gnu
[root@node1 src]# chmod +x operator-sdk-v0.11.0-x86_64-linux-gnu
[root@node1 src]# mv operator-sdk-v0.11.0-x86_64-linux-gnu operator-sdk
[root@node1 src]# mv operator-sdk /usr/local/bin
4.2. Master-Operator
[root@node operator]# operator-sdk new master --repo cloud.org/operator/master
INFO[0000] Creating new Go operator 'master'.
INFO[0000] Created go.mod
# ... output omitted ...
INFO[0014] Project validation successful.
INFO[0014] Project creation complete.
[root@node master]# operator-sdk add api --api-version=crd.cloud.org/v1alpha1 --kind=KubeMaster
INFO[0000] Generating api version crd.cloud.org/v1alpha1 for kind KubeMaster.
INFO[0000] Created pkg/apis/crd/group.go
INFO[0012] Created pkg/apis/crd/v1alpha1/kubemaster_types.go
INFO[0012] Created pkg/apis/addtoscheme_crd_v1alpha1.go
INFO[0012] Created pkg/apis/crd/v1alpha1/register.go
INFO[0012] Created pkg/apis/crd/v1alpha1/doc.go
# ... output omitted ...
deploy/crds/crd.cloud.org_kubemasters_crd.yaml
INFO[0026] Code-generation complete.
INFO[0026] API generation complete.
[root@node master]# operator-sdk add controller --api-version=crd.cloud.org/v1alpha1 --kind=KubeMaster
INFO[0000] Generating controller version crd.cloud.org/v1alpha1 for kind KubeMaster.
INFO[0000] Created pkg/controller/kubemaster/kubemaster_controller.go
INFO[0000] Created pkg/controller/add_kubemaster.go
INFO[0000] Controller generation complete.
Create the Deployment
// newDeploymentForCR returns a master Deployment with the same name/namespace as the cr
func newDeploymentForCR(cr *cloudv1alpha1.MasterOperator) *appv1.Deployment {
	labels := map[string]string{
		"app": cr.Name,
	}
	nodeSelector := map[string]string{
		"kubelet.kubernetes.io/tenant": cr.Name, // TODO: tenant + cluster name
	}
	buffer := bytes.NewBufferString("")
	etcdSize := len(cr.Spec.Etcds)
	for index, node := range cr.Spec.Etcds {
		buffer.WriteString("https://" + node.IP + ":2379")
		if etcdSize-1 != index {
			buffer.WriteString(",")
		}
	}
	matchLabels := map[string]string{
		"app": cr.Name + "-master",
	}
	if cr.Selector == nil {
		cr.Selector = &metav1.LabelSelector{
			MatchLabels: matchLabels,
		}
	} else if cr.Selector.MatchLabels == nil {
		cr.Selector.MatchLabels = matchLabels
	} else {
		cr.Selector.MatchLabels["app"] = cr.Name + "-master"
	}
	// TODO: etcd, apiserver, controller-manager and scheduler are deployed in a single Pod here;
	// in production, deploy them as separate Deployments
	size := int32(len(cr.Spec.Masters))
	return &appv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{
			Name:      cr.Name + "-master",
			Namespace: cr.Namespace,
			Labels:    labels,
		},
		Spec: appv1.DeploymentSpec{
			Selector: cr.Selector,
			Replicas: &size,
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{
					Labels: cr.Selector.MatchLabels,
					Name:   cr.Name + "-master",
				},
				Spec: corev1.PodSpec{
					NodeSelector: nodeSelector,
					HostNetwork:  true,
					DNSPolicy:    corev1.DNSClusterFirstWithHostNet,
					//Priority: 2000000000,
					PriorityClassName: "system-cluster-critical",
					// the init container downloads the certificates
					InitContainers: initContainers(),
					Containers:     assemblyContainers(buffer.String(), cr.Spec.ServiceCIDR, cr.Spec.PodCIDR),
					Volumes: []corev1.Volume{
						certVolume(),
						localtimeVolume("localtime", "/etc/localtime", "file"),
					},
				},
			},
		},
	}
}
4.3. Node-Operator
[root@node operator]# operator-sdk new node --repo cloud.org/operator/node
# ... output omitted ...
INFO[0006] Project validation successful.
INFO[0006] Project creation complete.
[root@node node]# operator-sdk add api --api-version=crd.cloud.org/v1alpha1 --kind=KubeNode
# ... output omitted ...
INFO[0011] Code-generation complete.
INFO[0011] API generation complete.
[root@node node]# operator-sdk add controller --api-version=crd.cloud.org/v1alpha1 --kind=KubeNode
INFO[0000] Generating controller version crd.cloud.org/v1alpha1 for kind KubeNode.
INFO[0000] Created pkg/controller/kubenode/kubenode_controller.go
INFO[0000] Created pkg/controller/add_kubenode.go
INFO[0000] Controller generation complete.
Initialize the Ansible configuration file
// 1. initialize the ansible hosts file
err = initAnsibleHosts(instance.Spec.Nodes)
if err != nil {
	fmt.Println("failed to initialize the Ansible hosts file")
	return reconcile.Result{}, err
}
Fetch the certificates the node needs
bootstrap := &corev1.ConfigMap{}
err = r.client.Get(context.TODO(), types.NamespacedName{Name: "bootstrap", Namespace: "kube-system"}, bootstrap)
if err == nil {
	ioutil.WriteFile("/etc/kubernetes/bootstrap.conf", []byte(bootstrap.Data["bootstrap.conf"]), os.ModePerm)
}
clientCA := &corev1.ConfigMap{}
err = r.client.Get(context.TODO(), types.NamespacedName{Name: "extension-apiserver-authentication", Namespace: "kube-system"}, clientCA)
if err == nil {
	ioutil.WriteFile("/etc/kubernetes/ca.pem", []byte(clientCA.Data["client-ca-file"]), os.ModePerm)
}
The Ansible playbook
- hosts: kubernetes
  remote_user: root
  tasks:
    - name: "1. Create working directories"
      file:
        path: "{{ item.path }}"
        state: "{{ item.state }}"
      with_items:
        - { path: "{{WORK_PATH}}/", state: "directory" }
        - { path: "/etc/kubernetes/pki", state: "directory" }
        - { path: "/etc/kubernetes/manifests", state: "directory" }
    - name: "2. Copy installation files (these could later come from a download center); TODO: certificates"
      copy:
        src: "{{ item.src }}"
        dest: "{{ item.dest }}"
      with_items:
        - { src: "/assets/systemd/docker.service", dest: "/usr/lib/systemd/system/" }
        - { src: "/assets/systemd/kubelet.service", dest: "/usr/lib/systemd/system/" }
        - { src: "/assets/pki/ca.pem", dest: "/etc/kubernetes/pki" }
        - { src: "/assets/pki/bootstrap.conf", dest: "/etc/kubernetes/" }
        - { src: "/assets/script/kubelet.sh", dest: "/data/" }
    - name: "3. Install the node"
      shell: sh /data/kubelet.sh {{WORK_PATH}} {{TENANT}}
    - name: "4. Clean up installation files"
      shell: rm -rf /data/kubelet.sh
The installation script
#!/bin/bash
# ----------------------------------------------------------------------
# name: kubelet.sh
# version: 1.0
# createTime: 2019-06-25
# description: node initialization
# author: doublegao
# email: doublegao@gmail.com
# params: work path, tenant, dockerVersion, kubeletVersion
# example: kubelet.sh path tenant 18.09.9 v1.15.3
# ----------------------------------------------------------------------
WORK_PATH=$1
#strip the trailing slash "/"
WORK_PATH=${WORK_PATH%*/}
DOCKER_PATH=$WORK_PATH/docker
KUBELET_PATH=$WORK_PATH/kubelet
mkdir -p $DOCKER_PATH
mkdir -p $KUBELET_PATH
DOCKER_VERSION=${3:-18.09.9}
KUBELET_VERSION=${4:-v1.15.3}
echo "############# 1. System initialization"
setenforce 0
sed -i "s/^SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
systemctl disable firewalld
systemctl stop firewalld
swapoff -a
sysctl -p
sed -i 's/.*swap.*/#&/' /etc/fstab
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
cat > /etc/sysctl.d/k8s.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system
echo "############# 2. Add the docker and kubernetes yum repos"
wget -P /etc/yum.repos.d https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
echo "############# 3. Check yum repo availability"
yum repolist
echo "############# 4. Install and enable kubelet and docker"
yum install -y docker-ce-$DOCKER_VERSION-3.el7.x86_64 kubelet-$KUBELET_VERSION-0.x86_64
systemctl enable docker.service && systemctl enable kubelet.service
echo "############# 5. Update configuration files"
sed -i "s#_DATA_ROOT_#$DOCKER_PATH#g" /usr/lib/systemd/system/docker.service
sed -i "s#_HOSTNAME_#$HOSTNAME#g" /usr/lib/systemd/system/kubelet.service
sed -i "s#_KUBELET_PATH_#$KUBELET_PATH#g" /usr/lib/systemd/system/kubelet.service
sed -i "s#_TENANT_#$2#g" /usr/lib/systemd/system/kubelet.service
echo "############# 6. Start docker and pull the pause image"
systemctl daemon-reload && systemctl start docker.service
docker pull k8s.gcr.io/pause-amd64:3.1
echo "############# 7. Start kubelet"
systemctl daemon-reload && systemctl start kubelet.service