Reposting is welcome; please cite the original article: http://www.itdecent.cn/p/3de558d8b57a
I. Environment
OS: CentOS 7.7
Docker: 18.09.9
Kubernetes: 1.17.0
kubeadm: 1.17.0
kubelet: 1.17.0
kubectl: 1.17.0
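Note the version compatibility between Kubernetes and Docker. Kubernetes 1.14 updated the list of supported Docker versions: "The list of validated docker versions has changed. 1.11.1 and 1.12.1 have been removed. The current list is 1.13.1, 17.03, 17.06, 17.09, 18.06, 18.09."
kubeadm provides a high-availability deployment scheme for the Masters. kubeadm reached GA (stable) in Kubernetes 1.13, which means it can not only quickly deploy a conformant Kubernetes cluster but is also flexible enough to support a variety of real production needs. Kubernetes 1.14 added the --experimental-upload-certs flag (later renamed --upload-certs, the form used in this guide), which simplifies certificate distribution and removes most of the manual certificate copying during installation.
kubeadm offers two high-availability topologies.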
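- Stacked etcd topology: the etcd service and the control plane run on the same nodes. This needs less infrastructure but is also less resilient to failures. At least three Masters (control-plane nodes) are needed: etcd elects its leader with the Raft algorithm, so the member count should be odd (2n+1).
- External etcd topology: etcd and the control plane run on separate machines. This needs more hardware but is more resilient.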
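The 2n+1 rule follows from majority voting: a Raft cluster stays available only while more than half of its members are reachable. The sketch below works through that arithmetic; the function names are made up for illustration and are not part of etcd or kubeadm.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names): majority arithmetic for a Raft cluster.

# quorum N: votes needed for a majority among N members.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# tolerated_failures N: members that can fail while a majority survives.
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

for n in 1 2 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Going from 3 to 4 members does not raise the number of tolerated failures, which is why odd member counts are preferred.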
II. Servers
| OS | IP | Role | CPU (cores) | Memory (GB) | Hostname |
|---|---|---|---|---|---|
| Centos7.7 | 192.168.1.201 | Master | 2 | 4 | k8s-01 |
| Centos7.7 | 192.168.1.202 | Master | 2 | 4 | k8s-02 |
| Centos7.7 | 192.168.1.203 | Master | 2 | 4 | k8s-03 |
| Centos7.7 | 192.168.1.204 | Node | 4 | 6 | k8s-04 |
| Centos7.7 | 192.168.1.205 | Node | 4 | 6 | k8s-05 |
| Centos7.7 | 192.168.1.251 | LVS | 1 | 1 | lvs-master |
| Centos7.7 | 192.168.1.252 | LVS | 1 | 1 | lvs-backup |
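Additional VIP: 192.168.1.200
By default a Kubernetes cluster does not schedule Pod replicas onto Master nodes, so the Masters are given less capacity than the Nodes. This guide treats 2 cores and 4 GB of memory as the minimum per node (kubeadm's own documented minimum is 2 CPUs and 2 GB of RAM); below that, some cluster components may fail to run.
LVS could also run on the Kubernetes node machines, but for high availability it is better deployed on separate machines.
The cluster below uses kubeadm's stacked topology, so if 2 of the 3 Masters go down, etcd loses its majority and the cluster becomes unavailable. You may then see an error such as "Error from server: etcdserver: request timed out".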
III. System settings (all hosts)
1. Static IP
Configure a static IP on every node.
# Find the network interface name
$ ip a
# Go to the network scripts directory and edit the interface configuration
$ cd /etc/sysconfig/network-scripts/ && vi ifcfg-<interface_name>
BOOTPROTO="static"
NM_CONTROLLED="no"
IPADDR=192.168.1.201
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
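# Save, exit, and restart networking
$ systemctl restart network
# Configure NetworkManager (set dns=none in the [main] section so that it does not overwrite /etc/resolv.conf)
$ vi /etc/NetworkManager/NetworkManager.conf
dns=none
# Configure the DNS servers
$ vi /etc/resolv.conf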
nameserver 114.114.114.114
nameserver 8.8.8.8
2. Hostname
Each node must have a unique hostname, and every node must be reachable from the others by hostname.
# Show the hostname
$ hostname
# Change the hostname
$ hostnamectl set-hostname <your_hostname>
# Configure /etc/hosts so that all nodes can reach each other by hostname
$ vi /etc/hosts
192.168.1.201 k8s-01
192.168.1.202 k8s-02
192.168.1.203 k8s-03
192.168.1.204 k8s-04
192.168.1.205 k8s-05
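Editing /etc/hosts by hand on seven machines is easy to get wrong, so the entries can also be appended by a small script. This is a minimal sketch, assuming the host names and addresses from the table above; add_host_entry is a hypothetical helper, and it skips a name that already appears in the file so that it can be run repeatedly.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME: append "IP NAME" to FILE unless NAME is already listed.
# (Hypothetical helper for illustration; the match is a simple whole-word check.)
add_host_entry() {
  file=$1 ip=$2 name=$3
  if grep -qE "[[:space:]]${name}([[:space:]]|\$)" "$file" 2>/dev/null; then
    return 0
  fi
  printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Example, run as root on each node:
# for i in 1 2 3 4 5; do add_host_entry /etc/hosts "192.168.1.20$i" "k8s-0$i"; done
```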
3. Install dependencies
# Update packages
$ yum update -y
# Install the dependency packages
$ yum install -y conntrack ipvsadm ipset jq sysstat curl iptables libseccomp bind-utils
4. Disable the firewall, swap, SELinux, and dnsmasq; reset iptables
# Stop and disable the firewall
$ systemctl stop firewalld && systemctl disable firewalld
# Flush the firewall rules and set the default FORWARD policy to ACCEPT
$ iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && iptables -P FORWARD ACCEPT
# Turn off swap
$ swapoff -a
# Comment out the swap mount in /etc/fstab so that swap stays off after a reboot
$ sed -i '/swap/s/^\(.*\)$/#\1/g' /etc/fstab
# Put SELinux into permissive mode so that containers can read the host filesystem (lasts until reboot; edit /etc/selinux/config to make it permanent)
$ setenforce 0
# Stop dnsmasq (otherwise Docker containers may fail to resolve domain names)
$ systemctl stop dnsmasq && systemctl disable dnsmasq
If stopping dnsmasq fails with "Failed to stop dnsmasq.service: Unit dnsmasq.service not loaded.", dnsmasq is not installed and the error can be ignored.
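The sed command above prefixes every line containing "swap" with "#", including lines that are already commented, so running it twice stacks the markers. The sketch below is one possible idempotent variant that touches only active swap entries; both function names are hypothetical, and it assumes GNU sed (for -i) as shipped with CentOS.

```shell
#!/bin/sh
# disable_swap_entries FILE: comment out every active swap line in an fstab-style FILE.
# Lines that already start with "#" are left alone, so the function is safe to re-run.
disable_swap_entries() {
  sed -i -E '/^[^#].*[[:space:]]swap[[:space:]]/s/^/#/' "$1"
}

# active_swap_entries FILE: print the number of swap lines that are still active.
active_swap_entries() {
  grep -cE '^[^#].*[[:space:]]swap[[:space:]]' "$1" || true
}

# Example, run as root:
# disable_swap_entries /etc/fstab && active_swap_entries /etc/fstab
```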
5. System parameters
# Load the IPVS kernel modules
$ cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
modprobe br_netfilter
EOF
# Make the script executable, run it, and confirm that the modules are loaded
$ chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
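The grep above shows what is loaded but makes it easy to overlook a module that is absent. As a rough sketch, the helper below (a hypothetical name) reads lsmod output on standard input and prints each module from the script above that is missing.

```shell
#!/bin/sh
# missing_modules: read `lsmod` output on stdin and print every required module
# that does not appear in the first column. The module list mirrors ipvs.modules above.
missing_modules() {
  loaded=$(awk '{ print $1 }')
  for m in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4 br_netfilter; do
    echo "$loaded" | grep -qx "$m" || echo "$m"
  done
}

# Example on a node (no output means every module is loaded):
# lsmod | missing_modules
```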
# Create the sysctl configuration file (comments sit on their own lines; sysctl does not accept a trailing comment after a value)
$ cat > /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
# Fast recycling of TIME-WAIT sockets is left off (0, the default): enabling it breaks connections from clients behind NAT
net.ipv4.tcp_tw_recycle=0
# Idle time in seconds before keepalive probes start
net.ipv4.tcp_keepalive_time=600
# Interval in seconds between keepalive probes
net.ipv4.tcp_keepalive_intvl=15
# Number of unanswered probes before the peer is considered dead
net.ipv4.tcp_keepalive_probes=3
# Avoid swapping; swap is used only as a last resort before OOM
vm.swappiness=0
# Do not check whether enough physical memory is available
vm.overcommit_memory=1
# Do not panic on OOM; let the OOM killer run
vm.panic_on_oom=0
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576
fs.file-max=52706963
fs.nr_open=52706963
net.ipv6.conf.all.disable_ipv6=1
net.netfilter.nf_conntrack_max=2310720
EOF
# Apply the file
$ sysctl -p /etc/sysctl.d/kubernetes.conf
# Set the system time zone
$ timedatectl set-timezone Asia/Shanghai
# Keep the hardware clock (RTC) in UTC
$ timedatectl set-local-rtc 0
# Restart the services that depend on the system time
$ systemctl restart rsyslog && systemctl restart crond
# Stop services that are not needed
$ systemctl stop postfix && systemctl disable postfix
# Configure rsyslogd and systemd journald
# Directory for persistent journal storage
$ mkdir -p /var/log/journal
$ mkdir -p /etc/systemd/journald.conf.d
$ cat > /etc/systemd/journald.conf.d/99-prophet.conf <<EOF
[Journal]
# Persist logs to disk
Storage=persistent
# Compress archived logs
Compress=yes
SyncIntervalSec=5m
RateLimitInterval=30s
RateLimitBurst=1000
# Maximum disk usage: 10G
SystemMaxUse=10G
# Maximum size of a single journal file: 200M
SystemMaxFileSize=200M
# Keep logs for 2 weeks
MaxRetentionSec=2week
# Do not forward logs to syslog
ForwardToSyslog=no
EOF
# Restart journald
$ systemctl restart systemd-journald
IV. Install Docker (all nodes)
Docker is installed here from RPM packages.
# Create a working directory for the Docker packages
$ mkdir -p /opt/kubernetes/docker && cd /opt/kubernetes/docker
$ wget https://download.docker.com/linux/centos/7/x86_64/stable/Packages/docker-ce-18.09.9-3.el7.x86_64.rpm
$ wget https://download.docker.com/linux/centos/7/x86_64/stable/Packages/docker-ce-cli-18.09.9-3.el7.x86_64.rpm
$ wget https://download.docker.com/linux/centos/7/x86_64/stable/Packages/containerd.io-1.2.6-3.3.el7.x86_64.rpm
# Remove old Docker versions
$ yum remove -y docker* container-selinux
# Install the RPM packages
$ yum localinstall -y *.rpm
# Start Docker
$ systemctl start docker
# Enable Docker at boot
$ systemctl enable docker
# Check free space per partition [optional]
$ df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 2.0G 0 2.0G 0% /dev
tmpfs 2.0G 0 2.0G 0% /dev/shm
tmpfs 2.0G 12M 2.0G 1% /run
tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
/dev/mapper/centos-root 71G 1.7G 66G 3% /
/dev/sda1 976M 136M 773M 15% /boot
/dev/mapper/centos-home 20G 45M 19G 1% /home
tmpfs 394M 0 394M 0% /run/user/0
# Set data-root to move Docker's data directory (default: /var/lib/docker) [optional]; data-root replaces the deprecated "graph" key
# Set the cgroup driver to systemd (Docker's default is cgroupfs) so that it matches kubelet; alternatively leave it unset and configure kubelet to use cgroupfs later
$ mkdir -p /docker-data
$ cat <<EOF > /etc/docker/daemon.json
{
  "data-root": "/docker-data",
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
# Restart Docker
$ systemctl restart docker
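V. Install the required tools (varies by node)
- kubeadm: the command used to deploy the cluster (all nodes)
- kubelet: the agent that runs on every machine in the cluster; it manages the lifecycle of Pods and containers and handles communication between the Masters and the worker nodes (all nodes)
- kubectl: the cluster management CLI (optional; only needed on the nodes you administer the cluster from)
# Install network tools; a minimal OS installation does not include them
$ yum -y install net-tools
# Configure the yum repository (readers with unrestricted access to Google services can replace "mirrors.aliyun.com" with "packages.cloud.google.com")
$ cat <<E0F > /etc/yum.repos.d/kubernetes.repo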
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
       http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
E0F
# Install the tools
# List the available versions
$ yum list kubeadm --showduplicates | sort -r
# Install a specific version (1.17.0-0 here)
$ yum install -y kubeadm-1.17.0-0 kubelet-1.17.0-0 kubectl-1.17.0-0 --disableexcludes=kubernetes
# Set kubelet's cgroup driver. kubelet and Docker must use the same driver; if Docker's exec-opts had not been set to systemd above, kubelet would have to be switched to cgroupfs here. [Skip this step: Docker was already configured with cgroupdriver=systemd above]
$ sed -i "s/cgroup-driver=systemd/cgroup-driver=cgroupfs/g" /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# Enable kubelet at boot and start it
$ systemctl enable kubelet && systemctl start kubelet
VI. Install the LVS load balancer and keepalived for high availability
As the architecture described earlier shows, every node talks to the API Server through the load balancer, which makes the load balancer critical. For performance and high availability, LVS + keepalived is used here.
lvs-master (192.168.1.251)
# Install dependencies
$ yum install -y ipvsadm wget curl gcc openssl-devel libnl3-devel net-snmp-devel libnfnetlink-devel
# Install keepalived from source; the version that yum installs on CentOS 7 has a problem and reports the error "TCP socket bind failed. Rescheduling"
$ wget http://www.keepalived.org/software/keepalived-1.4.5.tar.gz && tar -zxvf keepalived-1.4.5.tar.gz && cd keepalived-1.4.5 && ./configure && make && make install && cd .. && rm -f keepalived-1.4.5.tar.gz && rm -rf keepalived-1.4.5
################ keepalived load-balancing configuration ################
# Generate the keepalived configuration
$ cd /etc/keepalived && cat <<E0F > /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id keepalived-master
}
vrrp_instance vip_1 {
state MASTER
! Network interface name; run "ip a" to find the name of your LAN interface
interface ens33
! virtual_router_id must be identical on the keepalived master and backup
virtual_router_id 88
! Priority; the master must have a higher priority than the backup
priority 100
advert_int 3
! Virtual IP address
virtual_ipaddress {
192.168.1.200
}
}
virtual_server 192.168.1.200 6443 {
delay_loop 6
lb_algo rr
lb_kind DR
persistence_timeout 0
protocol TCP
real_server 192.168.1.201 6443 {
weight 1
TCP_CHECK {
connect_timeout 10
nb_get_retry 3
delay_before_retry 3
connect_port 6443
}
}
real_server 192.168.1.202 6443 {
weight 1
TCP_CHECK {
connect_timeout 10
nb_get_retry 3
delay_before_retry 3
connect_port 6443
}
}
real_server 192.168.1.203 6443 {
weight 1
TCP_CHECK {
connect_timeout 10
nb_get_retry 3
delay_before_retry 3
connect_port 6443
}
}
}
E0F
# Start keepalived
$ systemctl enable keepalived && service keepalived start
# Check keepalived status
$ service keepalived status
# Follow the logs
$ journalctl -f -u keepalived
# Confirm that the virtual IP is bound
$ ip a
################ real_server configuration, on every Master node ################
# Create the rs script
$ mkdir -p /opt/rs/ && cd /opt/rs && cat <<E0F > /opt/rs/rs.sh
#!/bin/bash
# Virtual IP
vip=192.168.1.200
# Take down any previous lo:0
ifconfig lo:0 down
echo "1" > /proc/sys/net/ipv4/ip_forward
echo "0" > /proc/sys/net/ipv4/conf/all/arp_announce
# Bring up a loopback alias and bind the VIP to it
ifconfig lo:0 \$vip broadcast \$vip netmask 255.255.255.255 up
route add -host \$vip dev lo:0
echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
# ens33 is the name of the primary network interface
echo "1" >/proc/sys/net/ipv4/conf/ens33/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/ens33/arp_announce
E0F
# Make it executable
$ chmod +x /opt/rs/rs.sh
# Run the rs script (if it reports an error, run it once more)
$ ./rs.sh
# Run it at boot
$ echo '/opt/rs/rs.sh' >> /etc/rc.d/rc.local
# On CentOS 7, /etc/rc.d/rc.local is not executable by default, so grant it execute permission
$ chmod +x /etc/rc.d/rc.local
lvs-backup (192.168.1.252)
# Install dependencies
$ yum install -y ipvsadm wget curl gcc openssl-devel libnl3-devel net-snmp-devel libnfnetlink-devel
# Install keepalived from source; the version that yum installs on CentOS 7 has a problem and reports the error "TCP socket bind failed. Rescheduling"
$ wget http://www.keepalived.org/software/keepalived-1.4.5.tar.gz && tar -zxvf keepalived-1.4.5.tar.gz && cd keepalived-1.4.5 && ./configure && make && make install && cd .. && rm -f keepalived-1.4.5.tar.gz && rm -rf keepalived-1.4.5
################ keepalived load-balancing configuration ################
# Generate the keepalived configuration
$ cd /etc/keepalived && cat <<E0F > /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id keepalived-backup
}
vrrp_instance vip_1 {
state BACKUP
! Network interface name; run "ip a" to find the name of your LAN interface
interface ens33
! virtual_router_id must be identical on the keepalived master and backup
virtual_router_id 88
! Priority; the master must have a higher priority than the backup
priority 99
advert_int 3
! Virtual IP address
virtual_ipaddress {
192.168.1.200
}
}
virtual_server 192.168.1.200 6443 {
delay_loop 6
lb_algo rr
lb_kind DR
persistence_timeout 0
protocol TCP
real_server 192.168.1.201 6443 {
weight 1
TCP_CHECK {
connect_timeout 10
nb_get_retry 3
delay_before_retry 3
connect_port 6443
}
}
real_server 192.168.1.202 6443 {
weight 1
TCP_CHECK {
connect_timeout 10
nb_get_retry 3
delay_before_retry 3
connect_port 6443
}
}
real_server 192.168.1.203 6443 {
weight 1
TCP_CHECK {
connect_timeout 10
nb_get_retry 3
delay_before_retry 3
connect_port 6443
}
}
}
E0F
# Start keepalived
$ systemctl enable keepalived && service keepalived start
# Check keepalived status
$ service keepalived status
# Follow the logs
$ journalctl -f -u keepalived
# Check the virtual IP (on the backup it appears only after the master fails)
$ ip a
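The three real_server blocks in each keepalived.conf differ only in the IP address, so they can be generated rather than typed. The following is a sketch only: render_real_servers is a hypothetical helper, and the health-check values are copied from the configuration above.

```shell
#!/bin/sh
# render_real_servers PORT IP...: print one keepalived real_server block per IP,
# using the same TCP_CHECK settings as the hand-written configuration.
render_real_servers() {
  port=$1
  shift
  for ip in "$@"; do
    cat <<EOF
    real_server $ip $port {
        weight 1
        TCP_CHECK {
            connect_timeout 10
            nb_get_retry 3
            delay_before_retry 3
            connect_port $port
        }
    }
EOF
  done
}

# Example: print the blocks for the three Masters
# render_real_servers 6443 192.168.1.201 192.168.1.202 192.168.1.203
```

Adding a fourth Master then means adding one argument instead of copying a block into both load balancer configurations.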
VII. Build the cluster with kubeadm (varies by node)
Master nodes:
k8s-01 (192.168.1.201)
The first Master takes more steps; the remaining nodes are simpler.
################## kubeadm configuration ##################
# Generate the kubeadm configuration file (kubeadm creates the cluster from it)
$ cd /opt/kubernetes/ && cat <<E0F > /opt/kubernetes/kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
# Kubernetes version; it must match the installed kubeadm version, otherwise initialization fails
kubernetesVersion: v1.17.0
# Image repository. k8s.gcr.io may not be reachable from some networks, so a mirror is used here: http://mirror.azure.cn/help/gcr-proxy-cache.html
imageRepository: gcr.azk8s.cn/google_containers
# Cluster name
clusterName: kubernetes
# Cluster-wide API Server endpoint; use the VIP
controlPlaneEndpoint: "192.168.1.200:6443"
networking:
  # Pod subnet
  podSubnet: 10.10.0.0/16
  serviceSubnet: 10.96.0.0/12
  dnsDomain: cluster.local
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# Run kube-proxy in ipvs mode; the ipvs dependencies and kernel modules must already be set up on the nodes
mode: ipvs
E0F
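The Pod subnet, the Service subnet, and the hosts' own network (192.168.1.0/24 in this guide) should not overlap, otherwise routing becomes ambiguous. The sketch below checks two IPv4 CIDR blocks for overlap with plain shell arithmetic; the function names are hypothetical and the check is an addition to the original steps, not something kubeadm requires you to run.

```shell
#!/bin/sh
# ip_to_int A.B.C.D: print the IPv4 address as an integer.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# cidr_bounds A.B.C.D/P: print the first and last address of the block as integers.
cidr_bounds() (
  base=$(ip_to_int "${1%/*}")
  prefix=${1#*/}
  size=$(( 1 << (32 - prefix) ))
  first=$(( base / size * size ))
  echo "$first $(( first + size - 1 ))"
)

# cidrs_overlap CIDR1 CIDR2: exit 0 when the two blocks share at least one address.
cidrs_overlap() (
  set -- $(cidr_bounds "$1") $(cidr_bounds "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
)

# Example:
# cidrs_overlap 10.10.0.0/16 10.96.0.0/12 && echo "overlap" || echo "ok"
```

Calico's default pool 192.168.0.0/16 would overlap the 192.168.1.0/24 host network used here, which is one more reason to choose a separate Pod subnet such as 10.10.0.0/16.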
################## Create the node with kubeadm ##################
# Initialize the node
# --upload-certs uploads the control-plane certificates to the cluster as an encrypted Secret so that other Masters can fetch them when joining, which avoids copying certificates by hand
$ kubeadm init --config=kubeadm-config.yaml --upload-certs
# When the command finishes it prints the output below. The important parts are the two kubeadm join commands: the first joins a Master to the cluster, the second joins a worker Node. The token is valid for 24 hours. If you did not save the commands, kubeadm token list shows the existing tokens; if the token has expired, kubeadm token create generates a new one:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join 192.168.1.200:6443 --token ibpbcn.lr1bfb6ctmq87nxd \
--discovery-token-ca-cert-hash sha256:b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b \
--control-plane --certificate-key 1531224cc400b3cbe29fdff9f411b45452f25f229d31bf5e9d15771a0feae0c6
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.1.200:6443 --token ibpbcn.lr1bfb6ctmq87nxd \
--discovery-token-ca-cert-hash sha256:b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b
################## kubectl ##################
# Set up the kubeconfig so that kubectl works
$ mkdir -p $HOME/.kube && sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config && sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check the status of the Pods
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-6955765f44-6wfmw 0/1 Pending 0 64m
kube-system coredns-6955765f44-wxdz5 0/1 Pending 0 64m
kube-system etcd-k8s-01 1/1 Running 0 64m
kube-system kube-apiserver-k8s-01 1/1 Running 0 64m
kube-system kube-controller-manager-k8s-01 1/1 Running 0 64m
kube-system kube-proxy-4t6qd 1/1 Running 0 64m
kube-system kube-scheduler-k8s-01 1/1 Running 0 64m
# The coredns Pods are still Pending because no network plugin has been installed yet
################## Network plugin ##################
# Available network plugins: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#pod-network
# Download the Calico network plugin manifest
$ wget -P /opt/kubernetes https://docs.projectcalico.org/v3.8/manifests/calico.yaml && cd /opt/kubernetes
# Edit calico.yaml and change the pool CIDR from Calico's default 192.168.0.0/16 to the cluster Pod subnet set earlier, 10.10.0.0/16. In vi, type /192.168.0.0 to jump to the line
$ vi calico.yaml
> - name: CALICO_IPV4POOL_CIDR
> value: "10.10.0.0/16"
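The same edit can be made without opening an editor, which helps when the installation is scripted. This is a minimal sketch, assuming the manifest contains the default value written exactly as "192.168.0.0/16" and that GNU sed is available; set_calico_pool_cidr is a hypothetical helper name.

```shell
#!/bin/sh
# set_calico_pool_cidr FILE CIDR: replace Calico's default pool "192.168.0.0/16"
# in FILE with CIDR (edits the file in place).
set_calico_pool_cidr() {
  sed -i "s#\"192\\.168\\.0\\.0/16\"#\"$2\"#" "$1"
}

# Example:
# set_calico_pool_cidr /opt/kubernetes/calico.yaml 10.10.0.0/16
# grep -A1 CALICO_IPV4POOL_CIDR /opt/kubernetes/calico.yaml
```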
# Apply the network plugin
$ kubectl apply -f calico.yaml
# Check the Pods again [coredns and calico take a while to initialize, so wait a moment for them to become Running]
$ kubectl get pods --all-namespaces
kube-system calico-kube-controllers-5c45f5bd9f-fx2xc 1/1 Running 0 110s
kube-system calico-node-k7t8m 1/1 Running 0 110s
kube-system coredns-6955765f44-6wfmw 1/1 Running 0 83m
kube-system coredns-6955765f44-wxdz5 1/1 Running 0 83m
kube-system etcd-k8s-01 1/1 Running 0 84m
kube-system kube-apiserver-k8s-01 1/1 Running 0 84m
kube-system kube-controller-manager-k8s-01 1/1 Running 0 84m
kube-system kube-proxy-4t6qd 1/1 Running 0 83m
kube-system kube-scheduler-k8s-01 1/1 Running 0 84m
k8s-02 (192.168.1.202)
################## Join the cluster with kubeadm ##################
# Initializing the first Master printed the command for joining further Masters; run it as-is to create this node and join it to the cluster
$ kubeadm join 192.168.1.200:6443 --token ibpbcn.lr1bfb6ctmq87nxd \
--discovery-token-ca-cert-hash sha256:b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b \
--control-plane --certificate-key 1531224cc400b3cbe29fdff9f411b45452f25f229d31bf5e9d15771a0feae0c6
# The command prints the following output
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
################## kubectl ##################
# Set up the kubeconfig so that kubectl works
$ mkdir -p $HOME/.kube && sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config && sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check the status of the Pods
$ kubectl get pods --all-namespaces
k8s-03 (192.168.1.203)
################## Join the cluster with kubeadm ##################
# Initializing the first Master printed the command for joining further Masters; run it as-is to create this node and join it to the cluster
$ kubeadm join 192.168.1.200:6443 --token ibpbcn.lr1bfb6ctmq87nxd \
--discovery-token-ca-cert-hash sha256:b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b \
--control-plane --certificate-key 1531224cc400b3cbe29fdff9f411b45452f25f229d31bf5e9d15771a0feae0c6
# The command prints the following output
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
################## kubectl ##################
# Set up the kubeconfig so that kubectl works
$ mkdir -p $HOME/.kube && sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config && sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check the status of the Pods
$ kubectl get pods --all-namespaces
Worker Nodes
k8s-04 (192.168.1.204)
################## Join the cluster with kubeadm ##################
# Initializing the first Master printed the command for joining worker Nodes; run it as-is to join this node to the cluster
$ kubeadm join 192.168.1.200:6443 --token ibpbcn.lr1bfb6ctmq87nxd \
--discovery-token-ca-cert-hash sha256:b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b
################## kubectl ##################
# Set up the kubeconfig so that kubectl works (a worker has no admin.conf, so the node's own kubelet.conf is used; it carries only the node's limited permissions)
$ mkdir -p $HOME/.kube && sudo cp -i /etc/kubernetes/kubelet.conf $HOME/.kube/config && sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check the status of the Pods
$ kubectl get pods --all-namespaces
k8s-05 (192.168.1.205)
################## Join the cluster with kubeadm ##################
# Initializing the first Master printed the command for joining worker Nodes; run it as-is to join this node to the cluster
$ kubeadm join 192.168.1.200:6443 --token ibpbcn.lr1bfb6ctmq87nxd \
--discovery-token-ca-cert-hash sha256:b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b
################## kubectl ##################
# Set up the kubeconfig so that kubectl works (again using the node's own kubelet.conf)
$ mkdir -p $HOME/.kube && sudo cp -i /etc/kubernetes/kubelet.conf $HOME/.kube/config && sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check the status of the Pods
$ kubectl get pods --all-namespaces
Verification
Run the commands below to verify that the cluster's Pods were created and are healthy:
# Check that a Service name resolves through CoreDNS (10.96.0.10 is the CoreDNS address in the default cluster IP range; kubectl get svc -n kube-system shows the actual DNS address)
$ dig kubernetes.default.svc.cluster.local @10.96.0.10
# Check the status of the installed components
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5c45f5bd9f-6szxb 1/1 Running 0 4m17s
kube-system calico-node-5c9hh 1/1 Running 0 3m15s
kube-system calico-node-8ffhc 1/1 Running 0 58s
kube-system calico-node-cxndf 1/1 Running 0 53s
kube-system calico-node-glqpl 1/1 Running 0 2m22s
kube-system calico-node-q5blx 1/1 Running 0 4m17s
kube-system coredns-6955765f44-8bcz5 1/1 Running 0 5m14s
kube-system coredns-6955765f44-w4p4l 1/1 Running 0 5m14s
kube-system etcd-k8s-01 1/1 Running 0 5m26s
kube-system etcd-k8s-02 1/1 Running 0 3m7s
kube-system etcd-k8s-03 1/1 Running 0 2m21s
kube-system kube-apiserver-k8s-01 1/1 Running 0 5m26s
kube-system kube-apiserver-k8s-02 1/1 Running 0 3m15s
kube-system kube-apiserver-k8s-03 1/1 Running 1 86s
kube-system kube-controller-manager-k8s-01 1/1 Running 1 5m26s
kube-system kube-controller-manager-k8s-02 1/1 Running 0 3m15s
kube-system kube-controller-manager-k8s-03 1/1 Running 0 97s
kube-system kube-proxy-7sh6f 1/1 Running 0 53s
kube-system kube-proxy-dvf9j 1/1 Running 0 58s
kube-system kube-proxy-hbc5v 1/1 Running 0 2m22s
kube-system kube-proxy-pqhn6 1/1 Running 0 5m14s
kube-system kube-proxy-rltqg 1/1 Running 0 3m15s
kube-system kube-scheduler-k8s-01 1/1 Running 1 5m26s
kube-system kube-scheduler-k8s-02 1/1 Running 0 3m14s
kube-system kube-scheduler-k8s-03 1/1 Running 0 97s
If any Pod is in a bad state, run kubectl -n=kube-system describe pod <pod_name> to find the cause. A common cause is that an image has not finished downloading.
This completes a quick Kubernetes cluster setup with kubeadm. If the installation fails, run kubeadm reset to restore the host to its original state, then run kubeadm init again.
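With 25 or more Pods in the list, a failing one is easy to miss. As a rough sketch, the filter below reads the kubectl get pods --all-namespaces output and prints only the Pods whose STATUS column is neither Running nor Completed; unhealthy_pods is a hypothetical name and the column positions are assumed to match the output shown above.

```shell
#!/bin/sh
# unhealthy_pods: read `kubectl get pods --all-namespaces` output on stdin and print
# "namespace/name status" for every Pod that is neither Running nor Completed.
unhealthy_pods() {
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# Example (no output means every Pod is healthy):
# kubectl get pods --all-namespaces | unhealthy_pods
```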
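VIII. Deploy the Dashboard to manage the cluster
Installation
# Download the manifest
$ wget -P /etc/kubernetes/addons https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-rc1/aio/deploy/recommended.yaml && cd /etc/kubernetes/addons
# [Optional] Remove the unused metrics service from the manifest by deleting the dashboard-metrics-scraper Service and Deployment (if you do, the two dashboard-metrics-scraper lines in the output below will not appear)
$ vi recommended.yaml
# Deploy; the following is printed
$ kubectl apply -f recommended.yaml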
namespace/kubernetes-dashboard created
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
secret/kubernetes-dashboard-csrf created
secret/kubernetes-dashboard-key-holder created
configmap/kubernetes-dashboard-settings created
role.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
deployment.apps/kubernetes-dashboard created
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created
# Check that the services are running
$ kubectl get namespaces
$ kubectl get deployments -n kubernetes-dashboard
$ kubectl get pods --namespace kubernetes-dashboard -o wide
$ kubectl get services -n kubernetes-dashboard
# Create a ServiceAccount with cluster-admin privileges
$ cat <<E0F > dashboard-adminuser.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: admin-user
namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: admin-user
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: admin-user
namespace: kubernetes-dashboard
E0F
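# Apply it
$ kubectl apply -f dashboard-adminuser.yaml
# Get the Bearer Token for authentication
$ kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}') | grep -E '^token' | awk '{print $2}'
Access
Install the cluster's root certificate in your browser. For a cluster installed with kubeadm as above, the root certificate is /etc/kubernetes/pki/ca.crt on the Master nodes.
Open the login page in the browser. The Dashboard is accessed through the API Server here; see the official documentation for other access methods.
Select Token authentication, paste the token obtained above, and log in.
IX. Common problems
1. kubeadm certificates are valid for only one year
kubeadm is the cluster bootstrapping tool that Kubernetes provides. It is very convenient, but the certificates it creates for apiserver, controller-manager, and so on are valid for only one year by default, and so is the kubelet certificate. Once they expire, the cluster stops serving requests.
The options are:
1) Officially recommended: run kubeadm upgrade to upgrade Kubernetes at least once a year.
2) Unofficial: build kubeadm from modified source so that it issues longer-lived certificates.
3) Renew the certificates manually (kubeadm alpha certs renew).
4) Enable automatic rotation of the kubelet certificate.
Automatic certificate renewal
Automatic renewal means that kubeadm renews all certificates whenever it upgrades the control plane.
If you have no special requirements for certificate renewal and upgrade Kubernetes regularly, with less than a year between upgrades, then upgrading the cluster often is the best practice for staying secure.
If you do not want certificates renewed during an upgrade, pass --certificate-renewal=false to kubeadm upgrade apply or kubeadm upgrade node.
Manual certificate renewal
Run this on every control-plane node. The Masters share the same CA root certificates, but every other certificate is issued from the CA per node (hostnames differ, so the certificates' SAN values differ too):
# Check certificate expiration. The command shows the expiry and remaining time of all certificates, including the client certificates under /etc/kubernetes/pki and the client certificates that kubeadm embeds in the KUBECONFIG files (admin.conf, controller-manager.conf, and scheduler.conf).
# Note: kubelet.conf is not in the list because kubeadm configures kubelet to renew its certificate automatically. kubeadm cannot manage certificates signed by an external CA.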
$ kubeadm alpha certs check-expiration
CERTIFICATE EXPIRES RESIDUAL TIME CERTIFICATE AUTHORITY EXTERNALLY MANAGED
admin.conf Jan 09, 2021 06:55 UTC 360d no
apiserver Jan 09, 2021 06:55 UTC 360d ca no
apiserver-etcd-client Jan 09, 2021 06:55 UTC 360d etcd-ca no
apiserver-kubelet-client Jan 09, 2021 06:55 UTC 360d ca no
controller-manager.conf Jan 09, 2021 06:55 UTC 360d no
etcd-healthcheck-client Jan 09, 2021 06:55 UTC 360d etcd-ca no
etcd-peer Jan 09, 2021 06:55 UTC 360d etcd-ca no
etcd-server Jan 09, 2021 06:55 UTC 360d etcd-ca no
front-proxy-client Jan 09, 2021 06:55 UTC 360d front-proxy-ca no
scheduler.conf Jan 09, 2021 06:55 UTC 360d no
CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED
ca Jan 07, 2030 06:55 UTC 9y no
etcd-ca Jan 07, 2030 06:55 UTC 9y no
front-proxy-ca Jan 07, 2030 06:55 UTC 9y no
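# Renew all certificates
$ kubeadm alpha certs renew all
# Restart the control plane so that it loads the renewed certificates (if a component still presents the old certificate after kubelet restarts, restart that component's container as well)
$ systemctl restart kubelet
# Check the certificates again
$ kubeadm alpha certs check-expiration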
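The RESIDUAL TIME column can also be computed for a single certificate, which is handy in a monitoring job. The following is a sketch under stated assumptions: days_until is a hypothetical helper, and it relies on GNU date (the -d option), which CentOS provides.

```shell
#!/bin/sh
# days_until DATE [NOW]: print the whole days from NOW (default: the current time)
# until DATE. Both arguments may be any date string that GNU `date -d` accepts.
days_until() {
  end=$(date -u -d "$1" +%s) || return 1
  now=$(date -u -d "${2:-now}" +%s) || return 1
  echo $(( (end - now) / 86400 ))
}

# Example: days left on the API server certificate
# days_until "$(openssl x509 -enddate -noout -in /etc/kubernetes/pki/apiserver.crt | cut -d= -f2)"
```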
2. Joining the cluster after the kubeadm join token has expired
# Create a token
$ kubeadm token create
ll3wpn.pct6tlq66lis3uhk
# List tokens
$ kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
ll3wpn.pct6tlq66lis3uhk 23h 2020-01-17T14:42:50+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
# Get the SHA-256 hash of the CA certificate's public key
$ openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b
# Join the new node to the cluster
$ kubeadm join 192.168.1.200:6443 --token ll3wpn.pct6tlq66lis3uhk \
--discovery-token-ca-cert-hash sha256:b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b
3. What to do if you have lost the kubeadm join command
# The following prints a ready-to-use join command
$ kubeadm token create --print-join-command
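The printed command joins a worker Node. Joining another Master additionally needs --control-plane and a certificate key, which the init output says can be re-uploaded with kubeadm init phase upload-certs --upload-certs. The sketch below only assembles the command line from its parts, using the flags shown earlier in this guide; join_command is a hypothetical helper and it runs nothing itself.

```shell
#!/bin/sh
# join_command ENDPOINT TOKEN CA_HASH [CERT_KEY]: print a kubeadm join command.
# With CERT_KEY the command joins a control-plane node, otherwise a worker node.
join_command() {
  cmd="kubeadm join $1 --token $2 --discovery-token-ca-cert-hash sha256:$3"
  if [ -n "${4:-}" ]; then
    cmd="$cmd --control-plane --certificate-key $4"
  fi
  echo "$cmd"
}

# Example with the token and hash obtained in the steps above:
# join_command 192.168.1.200:6443 ll3wpn.pct6tlq66lis3uhk b5e5c1a000284781677336b00e7345838195ca78af21bddd9defad799243752b
```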
4. Switching kube-proxy to ipvs mode when ipvs was not enabled at kubeadm init
# In the kube-system/kube-proxy ConfigMap, set mode: "ipvs" inside config.conf
$ kubectl edit cm kube-proxy -n kube-system
...
apiVersion: v1
data:
config.conf: |-
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
clientConnection:
acceptContentTypes: ""
burst: 0
contentType: ""
kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
qps: 0
clusterCIDR: 10.10.0.0/16
configSyncPeriod: 0s
conntrack:
maxPerCore: null
min: null
tcpCloseWaitTimeout: null
tcpEstablishedTimeout: null
enableProfiling: false
healthzBindAddress: ""
hostnameOverride: ""
iptables:
masqueradeAll: false
masqueradeBit: null
minSyncPeriod: 0s
syncPeriod: 0s
ipvs:
excludeCIDRs: null
minSyncPeriod: 0s
scheduler: ""
strictARP: false
syncPeriod: 0s
kind: KubeProxyConfiguration
metricsBindAddress: ""
# Change this line
mode: "ipvs"
nodePortAddresses: null
oomScoreAdj: null
portRange: ""
udpIdleTimeout: 0s
winkernel:
enableDSR: false
networkName: ""
sourceVip: ""
...
# List the running kube-proxy Pods
$ kubectl get pods -n kube-system | grep kube-proxy
# Delete the existing kube-proxy Pods; the DaemonSet controller recreates them
$ kubectl get pod -n kube-system | grep kube-proxy | awk '{system("kubectl delete pod "$1" -n kube-system")}'
# Because the change is made in the ConfigMap, Nodes added later use ipvs mode directly.
# List the running kube-proxy Pods
$ kubectl get pods -n kube-system | grep kube-proxy
# Check a kube-proxy Pod's log to confirm ipvs mode; the line "Using ipvs Proxier" means that ipvs mode is active.
$ kubectl logs kube-proxy-xxxxx -n kube-system
# Inspect with ipvsadm; the previously created Services now appear as LVS virtual servers.
$ ipvsadm -Ln
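The interactive kubectl edit step can be replaced by a filter when the change has to be scripted. This is a sketch only, assuming the ConfigMap contains a single mode: line as in the listing above; set_proxy_mode is a hypothetical helper that rewrites that line in whatever YAML it reads from standard input.

```shell
#!/bin/sh
# set_proxy_mode MODE: read a kube-proxy configuration on stdin and rewrite its
# `mode:` line to the given MODE, keeping the original indentation.
set_proxy_mode() {
  sed -E "s/^([[:space:]]*mode:[[:space:]]*).*/\\1\"$1\"/"
}

# Example (review the filtered output before applying it):
# kubectl get cm kube-proxy -n kube-system -o yaml | set_proxy_mode ipvs | kubectl apply -f -
```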

