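Course objectives
- Install a standalone etcd
- Install an etcd cluster
- Configure a secure etcd (with SSL certificates)
1. Environment
1.1. Software versions
| Component | Version |
|---|---|
| Operating system | Most Linux distributions work (Ubuntu/RHEL/CentOS) |
| Kernel version | 3.10 and 4.15 |
| etcd | v3.4.9 |
| golang | 1.14.3 |
1.2. Hardware planning
For choosing machines, the hardware recommendations in the official etcd documentation are a useful reference: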
Here are a few example hardware setups on AWS and GCE environments. As mentioned before, but must be stressed regardless, administrators should test an etcd deployment with a simulated workload before putting it into production.
Note that these configurations assume these machines are totally dedicated to etcd. Running other applications along with etcd on these machines may cause resource contentions and lead to cluster instability.
Small cluster
A small cluster serves fewer than 100 clients, fewer than 200 of requests per second, and stores no more than 100MB of data.
Example application workload: A 50-node Kubernetes cluster
| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|---|---|---|---|---|---|
| AWS | m4.large | 2 | 8 | 3600 | 56.25 |
| GCE | n1-standard-2 + 50GB PD SSD | 2 | 7.5 | 1500 | 25 |
Medium cluster
A medium cluster serves fewer than 500 clients, fewer than 1,000 of requests per second, and stores no more than 500MB of data.
Example application workload: A 250-node Kubernetes cluster
| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|---|---|---|---|---|---|
| AWS | m4.xlarge | 4 | 16 | 6000 | 93.75 |
| GCE | n1-standard-4 + 150GB PD SSD | 4 | 15 | 4500 | 75 |
Large cluster
A large cluster serves fewer than 1,500 clients, fewer than 10,000 of requests per second, and stores no more than 1GB of data.
Example application workload: A 1,000-node Kubernetes cluster
| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|---|---|---|---|---|---|
| AWS | m4.2xlarge | 8 | 32 | 8000 | 125 |
| GCE | n1-standard-8 + 250GB PD SSD | 8 | 30 | 7500 | 125 |
xLarge cluster
An xLarge cluster serves more than 1,500 clients, more than 10,000 of requests per second, and stores more than 1GB data.
Example application workload: A 3,000 node Kubernetes cluster
| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|---|---|---|---|---|---|
| AWS | m4.4xlarge | 16 | 64 | 16,000 | 250 |
| GCE | n1-standard-16 + 500GB PD SSD | 16 | 60 | 15,000 | 250 |
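The thresholds quoted above can be turned into a quick lookup. The following is a minimal sketch, not an official tool: the function name `etcd_cluster_tier` is hypothetical, and it assumes that 1GB is counted as 1024MB.

```shell
# Hypothetical helper: map a workload to the sizing tiers quoted above.
# Arguments: number of clients, requests per second, stored data in MB.
etcd_cluster_tier() {
  clients=$1
  rps=$2
  data_mb=$3
  if [ "$clients" -lt 100 ] && [ "$rps" -lt 200 ] && [ "$data_mb" -le 100 ]; then
    echo small
  elif [ "$clients" -lt 500 ] && [ "$rps" -lt 1000 ] && [ "$data_mb" -le 500 ]; then
    echo medium
  elif [ "$clients" -lt 1500 ] && [ "$rps" -lt 10000 ] && [ "$data_mb" -le 1024 ]; then
    echo large
  else
    echo xlarge
  fi
}

etcd_cluster_tier 300 800 200   # prints "medium"
```

The result is only a starting point; as the quoted text stresses, a simulated workload should still be run before going to production.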
2. Installing a standalone etcd
2.1. Installing etcd from the binary package
# Set the version to download
ETCD_VER=v3.4.9
INSTALL_DIR=/opt
# The package can also be downloaded from Google; that URL is commented out because it is not reachable from mainland China
# GOOGLE_URL=https://storage.googleapis.com/etcd
GITHUB_URL=https://github.com/etcd-io/etcd/releases/download
DOWNLOAD_URL=${GITHUB_URL}
# Clean up any previous download
rm -f ${INSTALL_DIR}/etcd-${ETCD_VER}-linux-amd64.tar.gz
rm -rf ${INSTALL_DIR}/etcd && mkdir -p ${INSTALL_DIR}/etcd
# Download
curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o ${INSTALL_DIR}/etcd-${ETCD_VER}-linux-amd64.tar.gz
# Extract
tar xzvf ${INSTALL_DIR}/etcd-${ETCD_VER}-linux-amd64.tar.gz -C ${INSTALL_DIR}/etcd --strip-components=1
# Remove the archive
rm -f ${INSTALL_DIR}/etcd-${ETCD_VER}-linux-amd64.tar.gz
# Test
${INSTALL_DIR}/etcd/etcd --version
${INSTALL_DIR}/etcd/etcdctl version
- Start etcd
${INSTALL_DIR}/etcd/etcd
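Once the standalone instance is running, a write followed by a read is a simple way to confirm it works. The sketch below is illustrative: the function name `etcd_smoke_test` and the key prefix are hypothetical, and it assumes `etcdctl` is on the PATH and that etcd listens on its default client address.

```shell
# Hypothetical helper: write a key, read it back, then delete it.
# Argument: client endpoint (defaults to the standalone default address).
etcd_smoke_test() {
  endpoint=${1:-http://127.0.0.1:2379}
  key="smoke-test-$$"
  etcdctl --endpoints="$endpoint" put "$key" ok >/dev/null || return 1
  value=$(etcdctl --endpoints="$endpoint" get "$key" --print-value-only) || return 1
  etcdctl --endpoints="$endpoint" del "$key" >/dev/null
  # Succeed only when the value read back matches what was written
  [ "$value" = "ok" ]
}
```

Running `etcd_smoke_test && echo "etcd is working"` in a second terminal should succeed while the server from the previous step is still in the foreground.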
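2.2. Building etcd from source
See the official etcd documentation for the build instructions.
Note: building from source requires Go 1.14 or later. Download the Go binary package and configure GOROOT and GOPATH.
Note: the official site requires a Go version newer than 1.13 (that is, 1.14). At present, even the newest packaged Go, found only in the yum repositories of the RHEL 8.2 and Amazon Linux 2 images, is 1.13. In other words, as of May 27, 2020, building etcd from source means setting up the Go environment by hand.
Note: with the current version, creating a cluster with a self-compiled etcd fails with the error below. So far this has been observed on Amazon Linux 2 on AWS, so installing from the binary package is the safest option.
panic: runtime error: invalid memory address or nil pointer dereference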
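- Download Go
# Google is not reachable from mainland China; use the link below to download the binary package
wget https://studygolang.com/dl/golang/go1.14.3.linux-amd64.tar.gz
# Extract into /opt so that it matches GOROOT below
tar xf go1.14.3.linux-amd64.tar.gz -C /opt
# Verify
/opt/go/bin/go version
go version go1.14.3 linux/amd64
- Configure the Go environment
# The quoted 'EOF' keeps $PATH literal so that it is expanded at login time, not when the file is written
cat << 'EOF' > /etc/profile.d/golang.sh
export GOROOT=/opt/go
export PATH=$PATH:/opt/go/bin
EOF
source /etc/profile
- Build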
$ git clone https://github.com/etcd-io/etcd.git
$ cd etcd
$ go env -w GOPROXY=https://goproxy.cn,direct
$ go mod vendor
$ ./build
- Verify
# Two executables appear in the bin directory
ls bin/
etcd etcdctl
# Check the version
$ ./bin/etcd --version
etcd Version: 3.5.0-pre
Git SHA: 9b6c3e337
Go Version: go1.14.3
Go OS/Arch: linux/amd64
$ ./bin/etcdctl version
etcdctl version: 3.5.0-pre
API version: 3.5
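Because the build depends on the Go version, it can help to check the toolchain before running `./build`. This is a minimal sketch under the assumption that the version string has the plain `goMAJOR.MINOR[.PATCH]` form shown above; the function name is hypothetical, and pre-release strings such as release candidates are not handled.

```shell
# Hypothetical helper: succeed when a Go version string is at least 1.14.
# Argument: a version such as "go1.14.3" (third field of `go version`).
go_version_at_least_1_14() {
  version=${1#go}        # strip the leading "go": go1.14.3 -> 1.14.3
  major=${version%%.*}   # 1
  rest=${version#*.}     # 14.3
  minor=${rest%%.*}      # 14
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 14 ]; }
}
```

A possible invocation is `go_version_at_least_1_14 "$(go version | awk '{print $3}')" || echo "Go is too old to build etcd"`.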
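3. Installing an etcd cluster
For a simple demo, the official documentation is sufficient. This section covers building an etcd cluster for a production environment.
3.1. Preparing the environment
- Configure a time service; either ntpd or chrony works.
- Create a dedicated file system for etcd. In public cloud environments, instances usually boot from an image, so data on the system disk is easily lost when the instance is terminated. The system disk is also slower and less stable than an attached storage volume.
Note: wherever the data lives, it can still be lost, so always keep backups! However lightweight etcd is, it is still a database, and once the data is gone, everything is gone.
$ lvcreate -n lv_etcd -L 10G vg_system
$ mkfs.xfs /dev/mapper/vg_system-lv_etcd
$ mkdir -p /data/etcd
# Mount the new file system on the data directory (add it to /etc/fstab to make it persistent)
$ mount /dev/mapper/vg_system-lv_etcd /data/etcd
- Access control
# Create the etcd user, used only to run the program
$ useradd etcd
# Make adm the primary group of etcd so that administrators in the adm group can inspect the data, and disable login
$ usermod -g adm -s /sbin/nologin etcd
# Set the permissions on the data directory:
# the etcd user has full access;
# the adm group gets read access so that administrators can inspect or back up the data (use 750 instead of 740 if they also need to enter the directory);
# other users have no access;
# root has full access
$ chmod 740 /data/etcd
$ chown etcd:adm /data/etcd
$ mkdir /etc/etcd
$ chmod 740 /etc/etcd
$ chown etcd:adm /etc/etcd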
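The backup advice above can be followed with `etcdctl snapshot save`. The sketch below is one possible shape for such a job, not a complete backup strategy: the function name `etcd_backup`, the file naming scheme, and the backup directory are assumptions for illustration.

```shell
# Hypothetical helper: save a timestamped snapshot and print its path.
# Arguments: client endpoint, backup directory.
etcd_backup() {
  endpoint=$1
  backup_dir=$2
  snapshot="$backup_dir/etcd-snapshot-$(date +%Y%m%d-%H%M%S).db"
  mkdir -p "$backup_dir" || return 1
  # Send etcdctl's progress output to stderr so stdout carries only the path
  etcdctl --endpoints="$endpoint" snapshot save "$snapshot" >&2 || return 1
  echo "$snapshot"
}
```

For example, `etcd_backup http://10.0.1.204:2379 /backup/etcd` could be run from cron once the cluster is up; the snapshot should then be copied off the machine, since a backup stored on the same volume is lost together with the data.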
3.2. Installing etcd from the binary package
- Download and extract
# Set the version to download
ETCD_VER=v3.4.9
INSTALL_DIR=/opt
# The package can also be downloaded from Google; that URL is commented out because it is not reachable from mainland China
# GOOGLE_URL=https://storage.googleapis.com/etcd
GITHUB_URL=https://github.com/etcd-io/etcd/releases/download
DOWNLOAD_URL=${GITHUB_URL}
# Clean up any previous download
rm -f ${INSTALL_DIR}/etcd-${ETCD_VER}-linux-amd64.tar.gz
rm -rf ${INSTALL_DIR}/etcd && mkdir -p ${INSTALL_DIR}/etcd
# Download
curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o ${INSTALL_DIR}/etcd-${ETCD_VER}-linux-amd64.tar.gz
# Extract
tar xzvf ${INSTALL_DIR}/etcd-${ETCD_VER}-linux-amd64.tar.gz -C ${INSTALL_DIR}/etcd --strip-components=1
# Remove the archive
rm -f ${INSTALL_DIR}/etcd-${ETCD_VER}-linux-amd64.tar.gz
# Test
${INSTALL_DIR}/etcd/etcd --version
${INSTALL_DIR}/etcd/etcdctl version
- Add etcd to the PATH
cat << 'EOF' > /etc/profile.d/etcd.sh
export PATH=$PATH:/opt/etcd
EOF
source /etc/profile
- Generate a longer token for better security
$ echo k8s-cluster|md5sum
ea8cfe2bfe85b7e6c66fe190f9225838  -
- Configuration file /etc/etcd/etcd.conf
master1
DATA_DIR=/data/etcd
HOST_NAME=master1
HOST_IP=10.0.1.204
CLUSTER=master1=http://10.0.1.204:2380,master2=http://10.0.1.67:2380,master3=http://10.0.1.236:2380
CLUSTER_STATE=new
TOKEN=ea8cfe2bfe85b7e6c66fe190f9225838
master2
DATA_DIR=/data/etcd
HOST_NAME=master2
HOST_IP=10.0.1.67
CLUSTER=master1=http://10.0.1.204:2380,master2=http://10.0.1.67:2380,master3=http://10.0.1.236:2380
CLUSTER_STATE=new
TOKEN=ea8cfe2bfe85b7e6c66fe190f9225838
master3
DATA_DIR=/data/etcd
HOST_NAME=master3
HOST_IP=10.0.1.236
CLUSTER=master1=http://10.0.1.204:2380,master2=http://10.0.1.67:2380,master3=http://10.0.1.236:2380
CLUSTER_STATE=new
TOKEN=ea8cfe2bfe85b7e6c66fe190f9225838
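The three configuration files differ only in HOST_NAME and HOST_IP, so they can be generated rather than typed by hand. The following is a sketch of that idea; the function name `render_etcd_conf` and its argument order are hypothetical, while the variable names and the peer port 2380 are taken from the files above.

```shell
# Hypothetical helper: print the etcd.conf content for one member.
# Usage: render_etcd_conf NAME IP TOKEN MEMBER...   (each MEMBER is name=ip)
render_etcd_conf() {
  name=$1
  ip=$2
  token=$3
  shift 3
  cluster=""
  for member in "$@"; do
    # Turn "master1=10.0.1.204" into "master1=http://10.0.1.204:2380"
    entry="${member%%=*}=http://${member#*=}:2380"
    cluster="${cluster:+$cluster,}$entry"
  done
  printf 'DATA_DIR=/data/etcd\n'
  printf 'HOST_NAME=%s\n' "$name"
  printf 'HOST_IP=%s\n' "$ip"
  printf 'CLUSTER=%s\n' "$cluster"
  printf 'CLUSTER_STATE=new\n'
  printf 'TOKEN=%s\n' "$token"
}

# Print the file for master1; redirect to /etc/etcd/etcd.conf on that host
render_etcd_conf master1 10.0.1.204 ea8cfe2bfe85b7e6c66fe190f9225838 \
  master1=10.0.1.204 master2=10.0.1.67 master3=10.0.1.236
```

Generating the CLUSTER line from a single member list avoids the most common mistake in this step, which is a typo that makes the line differ between nodes.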
- Edit the systemd unit file: /usr/lib/systemd/system/etcd.service (RHEL family) or /lib/systemd/system/etcd.service (Ubuntu family)
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
WorkingDirectory=/data/etcd
EnvironmentFile=-/etc/etcd/etcd.conf
User=etcd
# set GOMAXPROCS to number of processors
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /opt/etcd/etcd \
--data-dir ${DATA_DIR} \
--name \"${HOST_NAME}\" \
--initial-advertise-peer-urls http://${HOST_IP}:2380 \
--listen-peer-urls http://${HOST_IP}:2380 \
--advertise-client-urls http://${HOST_IP}:2379 \
--listen-client-urls http://${HOST_IP}:2379 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} \
--initial-cluster-token ${TOKEN}"
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
- Start
systemctl daemon-reload
systemctl start etcd
- Check the status
etcdctl endpoint health
Note: without arguments, etcdctl endpoint health connects to the local port 2379, that is, 127.0.0.1:2379. The cluster configured above does not listen on the loopback address, so the target has to be given with the --endpoints flag. Otherwise the command fails with:
127.0.0.1:2379 is unhealthy: failed to commit proposal: context deadline exceeded
At this stage the cluster serves plain HTTP, so pass an http:// endpoint:
etcdctl --endpoints=http://10.0.1.236:2379 endpoint health
Once the cluster is switched to https (section 4), the certificates must be supplied as well, for example:
etcdctl --endpoints=https://10.0.1.236:2379 --cacert=/etc/k8s/ssl/etcd-root-ca.pem --key=/etc/k8s/ssl/etcd-key.pem --cert=/etc/k8s/ssl/etcd.pem endpoint health
The state of etcd can also be checked in the logs:
journalctl -u etcd
By default etcd does not write to a log file of its own; its output goes to stdout and is collected by journald. The relevant flags from the etcd help text are:
Logging:
  --logger 'capnslog'
    Specify 'zap' for structured logging or 'capnslog'. [WARN] 'capnslog' will be deprecated in v3.5.
  --log-outputs 'default'
    Specify 'stdout' or 'stderr' to skip journald logging even when running under systemd, or list of comma separated output targets.
  --log-level 'info'
    Configures log level. Only supports debug, info, warn, error, panic, or fatal.
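To check every member in one call, `--endpoints` accepts a comma-separated list. The sketch below builds that list from the member IPs used in this section; the function name `etcd_endpoints` is hypothetical, and the client port 2379 is taken from the unit file above.

```shell
# Hypothetical helper: build a comma-separated client endpoint list.
# Usage: etcd_endpoints SCHEME IP...
etcd_endpoints() {
  scheme=$1
  shift
  endpoints=""
  for ip in "$@"; do
    endpoints="${endpoints:+$endpoints,}$scheme://$ip:2379"
  done
  echo "$endpoints"
}

# Against the plain-HTTP cluster from this section one could then run:
#   etcdctl --endpoints="$(etcd_endpoints http 10.0.1.204 10.0.1.67 10.0.1.236)" endpoint health
etcd_endpoints http 10.0.1.204 10.0.1.67 10.0.1.236
```

With all three endpoints listed, `endpoint health` reports one line per member, which makes it easy to spot a single node that failed to join.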
4. Configuring a secure etcd
4.1. Creating the certificates
See the reference on GitHub.
4.1.1. Preparing the environment
- Download the cfssl tools
curl -s -L -o /usr/local/bin/cfssl https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
curl -s -L -o /usr/local/bin/cfssljson https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
chmod +x /usr/local/bin/{cfssl,cfssljson}
- Test it
$ cfssl
No command is given.
Usage:
Available commands:
        serve
        version
        genkey
        gencrl
        ocsprefresh
        selfsign
        scan
        print-defaults
        revoke
        bundle
        sign
        gencert
        ocspdump
        ocspserve
        info
        certinfo
        ocspsign
Top-level flags:
  -allow_verification_with_non_compliant_keys
        Allow a SignatureVerifier to use keys which are technically non-compliant with RFC6962.
  -loglevel int
        Log level (0 = DEBUG, 5 = FATAL) (default 1)
- Create the working directory
$ mkdir -p /etc/kubernetes/pki/etcd
4.1.2. Creating the CA certificate
- Create the CA configuration files (from the defaults)
cd /etc/kubernetes/pki/etcd
cfssl print-defaults config > ca-config.json
cfssl print-defaults csr > ca-csr.json
- Change ca-config.json to
{
  "signing": {
    "default": {
      "expiry": "43800h"
    },
    "profiles": {
      "server": {
        "expiry": "43800h",
        "usages": ["signing", "key encipherment", "server auth"]
      },
      "client": {
        "expiry": "43800h",
        "usages": ["signing", "key encipherment", "client auth"]
      },
      "peer": {
        "expiry": "43800h",
        "usages": ["signing", "key encipherment", "server auth", "client auth"]
      }
    }
  }
}
- Change ca-csr.json to
{
  "CN": "My own CA",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "US",
      "L": "CA",
      "O": "My Company Name",
      "ST": "San Francisco",
      "OU": "Org Unit 1",
      "OU": "Org Unit 2"
    }
  ]
}
- Create the CA certificate
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
- This produces three files
ca-key.pem ca.csr ca.pem
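The `expiry` value in ca-config.json is given in hours, which is hard to read at a glance. As a small worked example (the function name `hours_to_years` is hypothetical, and a year is taken as 365 days), the value used above corresponds to five years:

```shell
# Hypothetical helper: convert an expiry in hours to whole years (365-day years).
hours_to_years() {
  echo $(( $1 / 24 / 365 ))
}

hours_to_years 43800   # prints 5
```

In other words, certificates signed with these profiles need to be renewed within five years of being issued.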
4.1.3. Creating the server certificate
- Generate the configuration file
cfssl print-defaults csr > server.json
- Edit the CN and hosts fields in server.json
{
  "CN": "etcd",
  "hosts": [
    "127.0.0.1",
    "10.0.1.204",
    "10.0.1.67",
    "10.0.1.236"
  ],
  "key": {
    "algo": "ecdsa",
    "size": 256
  },
  "names": [
    {
      "C": "US",
      "L": "CA",
      "ST": "San Francisco"
    }
  ]
}
- Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server.json | cfssljson -bare server
- Again three files (this is the certificate the server uses at startup)
server-key.pem server.csr server.pem
4.1.4. Creating the peer certificate (for communication between members)
- Generate the configuration file
cfssl print-defaults csr > members.json
- Edit the CN and hosts fields in members.json
{
  "CN": "members",
  "hosts": [
    "127.0.0.1",
    "10.0.1.204",
    "10.0.1.67",
    "10.0.1.236"
  ],
  "key": {
    "algo": "ecdsa",
    "size": 256
  },
  "names": [
    {
      "C": "US",
      "L": "CA",
      "ST": "San Francisco"
    }
  ]
}
- Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer members.json | cfssljson -bare members
- Three files
members-key.pem members.csr members.pem
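4.1.5. Creating the client certificate
- Generate the configuration file
cfssl print-defaults csr > client.json
- Edit the CN and hosts fields in client.json
Note: when etcd is set up by hand, the three etcd machines are usually run separately and managed as a database, so the hosts of the client certificate would be IP addresses outside the etcd cluster. In this exercise the three etcd machines also act as clients, so the addresses are still these three machines. If in doubt, either list the IP addresses and DNS names of all machines in hosts, or leave hosts empty (as in the example below) to avoid errors, although that is not the most secure choice.
{
  "CN": "client",
  "hosts": [""],
  "key": {
    "algo": "ecdsa",
    "size": 256
  },
  "names": [
    {
      "C": "US",
      "L": "CA",
      "ST": "San Francisco"
    }
  ]
}
- Generate the certificate
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client client.json | cfssljson -bare client
- This yields another set of files
client-key.pem client.csr client.pem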
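With four key pairs generated across several steps, it is easy to miss one before distributing them. The sketch below checks that every certificate and key named in the steps above is present; the function name `check_cert_files` is hypothetical, and the file names are the ones produced by the cfssljson commands in this section.

```shell
# Hypothetical helper: verify that all certificates and keys exist in a directory.
# Argument: the directory holding the generated files.
check_cert_files() {
  dir=$1
  missing=0
  for name in ca server members client; do
    for suffix in .pem -key.pem; do
      if [ ! -f "$dir/$name$suffix" ]; then
        echo "missing: $dir/$name$suffix" >&2
        missing=1
      fi
    done
  done
  return $missing
}
```

For example, `check_cert_files /etc/kubernetes/pki/etcd && echo "all certificates present"` can be run on each machine after the files have been copied.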
4.2. Configuring etcd to use the SSL certificates
Copy all the certificates generated above (every file under /etc/kubernetes/pki/etcd) to the other etcd machines.
Modify the unit file:
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
WorkingDirectory=/data/etcd
EnvironmentFile=-/etc/etcd/etcd.conf
User=etcd
# set GOMAXPROCS to number of processors
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /opt/etcd/etcd \
--data-dir ${DATA_DIR} \
--name ${HOST_NAME} \
--initial-advertise-peer-urls https://${HOST_IP}:2380 \
--listen-peer-urls https://${HOST_IP}:2380 \
--advertise-client-urls https://${HOST_IP}:2379 \
--listen-client-urls https://127.0.0.1:2379,https://${HOST_IP}:2379 \
--listen-metrics-urls=http://127.0.0.1:2381 \
--initial-cluster ${CLUSTER} \
--initial-cluster-state ${CLUSTER_STATE} \
--initial-cluster-token ${TOKEN} \
--client-cert-auth \
--trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem \
--cert-file=/etc/kubernetes/pki/etcd/server.pem \
--key-file=/etc/kubernetes/pki/etcd/server-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem \
--peer-cert-file=/etc/kubernetes/pki/etcd/members.pem \
--peer-key-file=/etc/kubernetes/pki/etcd/members-key.pem"
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
- Modify the configuration file /etc/etcd/etcd.conf
master1
DATA_DIR=/data/etcd
HOST_NAME=master1
HOST_IP=10.0.1.204
CLUSTER=master1=https://10.0.1.204:2380,master2=https://10.0.1.67:2380,master3=https://10.0.1.236:2380
CLUSTER_STATE=new
TOKEN=ea8cfe2bfe85b7e6c66fe190f9225838
master2
DATA_DIR=/data/etcd
HOST_NAME=master2
HOST_IP=10.0.1.67
CLUSTER=master1=https://10.0.1.204:2380,master2=https://10.0.1.67:2380,master3=https://10.0.1.236:2380
CLUSTER_STATE=new
TOKEN=ea8cfe2bfe85b7e6c66fe190f9225838
master3
DATA_DIR=/data/etcd
HOST_NAME=master3
HOST_IP=10.0.1.236
CLUSTER=master1=https://10.0.1.204:2380,master2=https://10.0.1.67:2380,master3=https://10.0.1.236:2380
CLUSTER_STATE=new
TOKEN=ea8cfe2bfe85b7e6c66fe190f9225838
- Fix the permissions
chmod 400 /etc/kubernetes/pki/etcd/*
chown -R etcd:adm /etc/kubernetes/pki/etcd/
- Restarting the cluster fails with the following error
error "tls: first record does not look like a TLS handshake"
This most likely happens because the existing data directory still records the members with their http:// peer URLs. Deleting the data files and restarting resolves it. Note that this discards everything stored in etcd, so it is only appropriate for a cluster that holds no data yet.
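After the restart, the cluster can be checked over https with the client certificate created in section 4.1.5. The following is a sketch rather than a definitive procedure: the function name `etcd_tls_health` is hypothetical, and it assumes the certificates live under /etc/kubernetes/pki/etcd as in this section.

```shell
# Hypothetical helper: run a health check against TLS-enabled endpoints.
# Argument: comma-separated https client endpoints.
etcd_tls_health() {
  pki_dir=/etc/kubernetes/pki/etcd
  etcdctl --endpoints="$1" \
    --cacert="$pki_dir/ca.pem" \
    --cert="$pki_dir/client.pem" \
    --key="$pki_dir/client-key.pem" \
    endpoint health
}
```

For example, `etcd_tls_health https://10.0.1.204:2379,https://10.0.1.67:2379,https://10.0.1.236:2379`. Because the key files were set to mode 400 and owned by the etcd user, the command has to be run as root or as that user.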