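Kubernetes Cluster Management Lecture Series (2): Installing etcd

Course objectives

  • Install a single-node etcd
  • Install an etcd cluster
  • Configure a secure etcd (with SSL certificates)

1. Environment

1.1. Software versions

Component         Version
Operating system  Most Linux distributions work (ubuntu/rhel/centos)
Kernel version    3.10 and 4.15
etcd              v3.4.9
golang            1.14.3

1.2. Hardware planning

For guidance on choosing machines, see the following excerpt from the etcd documentation: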

Here are a few example hardware setups on AWS and GCE environments. As mentioned before, but must be stressed regardless, administrators should test an etcd deployment with a simulated workload before putting it into production.

Note that these configurations assume these machines are totally dedicated to etcd. Running other applications along with etcd on these machines may cause resource contentions and lead to cluster instability.

Small cluster

A small cluster serves fewer than 100 clients, fewer than 200 of requests per second, and stores no more than 100MB of data.

Example application workload: A 50-node Kubernetes cluster

Provider  Type                           vCPUs  Memory (GB)  Max concurrent IOPS  Disk bandwidth (MB/s)
AWS       m4.large                       2      8            3600                 56.25
GCE       n1-standard-2 + 50GB PD SSD    2      7.5          1500                 25

Medium cluster

A medium cluster serves fewer than 500 clients, fewer than 1,000 of requests per second, and stores no more than 500MB of data.

Example application workload: A 250-node Kubernetes cluster

Provider  Type                           vCPUs  Memory (GB)  Max concurrent IOPS  Disk bandwidth (MB/s)
AWS       m4.xlarge                      4      16           6000                 93.75
GCE       n1-standard-4 + 150GB PD SSD   4      15           4500                 75

Large cluster

A large cluster serves fewer than 1,500 clients, fewer than 10,000 of requests per second, and stores no more than 1GB of data.

Example application workload: A 1,000-node Kubernetes cluster

Provider  Type                           vCPUs  Memory (GB)  Max concurrent IOPS  Disk bandwidth (MB/s)
AWS       m4.2xlarge                     8      32           8000                 125
GCE       n1-standard-8 + 250GB PD SSD   8      30           7500                 125

xLarge cluster

An xLarge cluster serves more than 1,500 clients, more than 10,000 of requests per second, and stores more than 1GB data.

Example application workload: A 3,000 node Kubernetes cluster

Provider  Type                           vCPUs  Memory (GB)  Max concurrent IOPS  Disk bandwidth (MB/s)
AWS       m4.4xlarge                     16     64           16,000               250
GCE       n1-standard-16 + 500GB PD SSD  16     60           15,000               250
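
The four size classes above can be turned into a quick lookup. The sketch below is only an illustration of the quoted thresholds: the function name etcd_size_class is made up for this example, and treating 1GB as 1024MB is an assumption.

```shell
#!/bin/sh
# Classify an etcd workload using the thresholds quoted above.
# Arguments: <clients> <requests_per_second> <data_size_mb>
# Assumption: "1GB" is treated as 1024 MB.
etcd_size_class() {
    clients=$1; rps=$2; data_mb=$3
    if [ "$clients" -lt 100 ] && [ "$rps" -lt 200 ] && [ "$data_mb" -le 100 ]; then
        echo small
    elif [ "$clients" -lt 500 ] && [ "$rps" -lt 1000 ] && [ "$data_mb" -le 500 ]; then
        echo medium
    elif [ "$clients" -lt 1500 ] && [ "$rps" -lt 10000 ] && [ "$data_mb" -le 1024 ]; then
        echo large
    else
        echo xlarge
    fi
}

etcd_size_class 50 150 80        # falls in the small range
etcd_size_class 400 900 300      # falls in the medium range
etcd_size_class 2000 12000 4096  # exceeds the large range
```

A workload must stay under all three limits of a class to qualify for it; exceeding any single limit moves it to the next class. The result is only a starting point, since the excerpt itself stresses testing with a simulated workload.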

2. Installing a single-node etcd

2.1. Installing etcd from the binary package

  • Download and extract

    A download guide is available on the etcd GitHub releases page, and v3.4.9 can also be downloaded from there directly.

# Version to download
ETCD_VER=v3.4.9
INSTALL_DIR=/opt

# Google storage is an alternative source; it is commented out because it is unreachable from mainland China
# GOOGLE_URL=https://storage.googleapis.com/etcd
GITHUB_URL=https://github.com/etcd-io/etcd/releases/download
DOWNLOAD_URL=${GITHUB_URL}

# Clean up earlier downloads
rm -f ${INSTALL_DIR}/etcd-${ETCD_VER}-linux-amd64.tar.gz
rm -rf ${INSTALL_DIR}/etcd && mkdir -p ${INSTALL_DIR}/etcd

# Download
curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o ${INSTALL_DIR}/etcd-${ETCD_VER}-linux-amd64.tar.gz
# Extract
tar xzvf ${INSTALL_DIR}/etcd-${ETCD_VER}-linux-amd64.tar.gz -C ${INSTALL_DIR}/etcd --strip-components=1
# Remove the tarball
rm -f ${INSTALL_DIR}/etcd-${ETCD_VER}-linux-amd64.tar.gz

# Test
${INSTALL_DIR}/etcd/etcd --version
${INSTALL_DIR}/etcd/etcdctl version
  • Start etcd
${INSTALL_DIR}/etcd/etcd
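
Once the single node is running, a write followed by a read is a simple way to confirm it works. The following is a minimal sketch, assuming etcdctl sits in /opt/etcd and etcd listens on its default client address 127.0.0.1:2379; the function name and the key smoke-test-key are hypothetical.

```shell
#!/bin/sh
# Smoke test for a single-node etcd on the default client port.
# ETCDCTL and ENDPOINT can be overridden; the defaults assume the /opt/etcd install.
ETCDCTL=${ETCDCTL:-/opt/etcd/etcdctl}
ENDPOINT=${ENDPOINT:-127.0.0.1:2379}

etcd_smoke_test() {
    key="smoke-test-key"   # hypothetical key, removed again at the end
    value="hello-etcd"
    # write the key, read it back, then clean up
    "$ETCDCTL" --endpoints="$ENDPOINT" put "$key" "$value" >/dev/null || return 1
    got=$("$ETCDCTL" --endpoints="$ENDPOINT" get "$key" --print-value-only) || return 1
    "$ETCDCTL" --endpoints="$ENDPOINT" del "$key" >/dev/null
    [ "$got" = "$value" ]
}

# Only run the check when etcdctl is actually installed.
if [ -x "$ETCDCTL" ]; then
    if etcd_smoke_test; then
        echo "etcd answered the write/read round trip"
    else
        echo "etcd smoke test failed" >&2
    fi
fi
```

The function succeeds only when the value read back equals the value written, so it catches both an unreachable server and a server that accepts connections but cannot commit.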

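2.2. Building etcd from source

  • See the official etcd documentation for build instructions.

  • Note: building from source requires a Go environment of version 1.14 or later. Download the Go binary package and configure GOROOT and GOPATH.

  • Note: upstream requires a Go version newer than 1.13 (that is, 1.14). Among current images only RHEL 8.2 and Amazon Linux 2 provide golang in their yum repositories, and that is version 1.13. In other words, as of 27 May 2020, building etcd from source means setting up the Go environment by hand.

  • Note: with the current version, creating a cluster with the source-built etcd fails with the error below. This was observed on Amazon Linux 2, so installing from the binary package is the safest option.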

    panic: runtime error: invalid memory address or nil pointer dereference
    
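  • Download golang
# Google is unreachable from mainland China, so fetch the binary package from the mirror below
# Work in /opt so that Go ends up in /opt/go, matching GOROOT below
cd /opt
wget https://studygolang.com/dl/golang/go1.14.3.linux-amd64.tar.gz
# Extract
tar xf go1.14.3.linux-amd64.tar.gz
# Verify
./go/bin/go version
go version go1.14.3 linux/amd64
  • Configure the Go environment
# The quoted 'EOF' keeps $PATH literal so it is expanded at login time, not while writing the file
cat << 'EOF' > /etc/profile.d/golang.sh
export GOROOT=/opt/go
export PATH=$PATH:/opt/go/bin
EOF

source /etc/profile
  • Build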
$ git clone https://github.com/etcd-io/etcd.git
$ cd etcd
$ go env -w GOPROXY=https://goproxy.cn,direct
$ go mod vendor
$ ./build
  • Verify
# Two new executables appear under bin/
ls bin/
etcd  etcdctl

# Check the versions
$ ./bin/etcd --version
etcd Version: 3.5.0-pre
Git SHA: 9b6c3e337
Go Version: go1.14.3
Go OS/Arch: linux/amd64

$ ./bin/etcdctl version
etcdctl version: 3.5.0-pre
API version: 3.5
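
Because the build depends on the Go version, it can help to check it before running ./build. The sketch below compares the installed version against the 1.14 minimum mentioned in the notes; version_ge is a helper written for this example and relies on GNU sort -V.

```shell
#!/bin/sh
# version_ge A B: succeed when version A >= version B (uses GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

MIN_GO=1.14   # minimum stated in the notes above

if command -v go >/dev/null 2>&1; then
    # "go version" prints e.g. "go version go1.14.3 linux/amd64"; take the third field
    current=$(go version | awk '{print $3}' | sed 's/^go//')
    if version_ge "$current" "$MIN_GO"; then
        echo "Go $current is new enough"
    else
        echo "Go $current is older than $MIN_GO" >&2
    fi
else
    echo "go not found in PATH" >&2
fi
```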

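3. Installing an etcd cluster

For a simple demo, the official documentation is enough. This section describes building an etcd cluster for a production environment.

3.1. Preparing the environment

  • Set up time synchronization; either ntpd or chrony works.

  • Create a dedicated filesystem for etcd. In public clouds, instances usually boot from an image, so data on the system disk is easily lost when the instance is terminated; the system disk is also slower and less stable than an attached storage volume.

    Note: wherever the data lives, it can be lost, so always keep backups! However lightweight etcd is, it is still a database; once the data is gone, everything is gone.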

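    $ lvcreate -n lv_etcd -L 10G vg_system
    $ mkfs.xfs /dev/mapper/vg_system-lv_etcd
    $ mkdir -p /data/etcd
    # Mount the new filesystem on the data directory (add an /etc/fstab entry to make it persistent)
    $ mount /dev/mapper/vg_system-lv_etcd /data/etcd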
    
  • Access control

    # Create the etcd user, used only to run the service
    $ useradd etcd
    # Make adm the primary group of etcd so that administrators in adm can inspect its files, and disable login
    $ usermod -g adm -s /usr/sbin/nologin etcd
    # Data directory permissions: the etcd user has full access,
    # the adm group can read and enter the directory (handy for inspection or backups; read-only also works),
    # everyone else has no access,
    # and root can access everything
    $ chmod 750 /data/etcd
    $ chown etcd:adm /data/etcd
    $ mkdir /etc/etcd
    $ chmod 750 /etc/etcd
    $ chown etcd:adm /etc/etcd
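
The permission comments above speak of read and execute access for the group. On a directory, the execute bit is what allows entering it, so it is worth checking what a given octal mode grants. The helper below is a small sketch written for this purpose; mode_to_rwx is not a standard tool.

```shell
#!/bin/sh
# Decode an octal mode into the rwx notation shown by "ls -l".
mode_to_rwx() {
    out=""
    # split the mode into single digits: owner, group, others
    for digit in $(printf '%s' "$1" | sed 's/./& /g'); do
        case "$digit" in
            0) out="${out}---" ;;
            1) out="${out}--x" ;;
            2) out="${out}-w-" ;;
            3) out="${out}-wx" ;;
            4) out="${out}r--" ;;
            5) out="${out}r-x" ;;
            6) out="${out}rw-" ;;
            7) out="${out}rwx" ;;
            *) echo "invalid digit: $digit" >&2; return 1 ;;
        esac
    done
    echo "$out"
}

mode_to_rwx 750   # group may read and enter the directory
mode_to_rwx 740   # group may list names but not enter the directory
```

With mode 740 the group gets r-- only, which on a directory is not enough to open the files inside it; 750 gives the group r-x.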
    

3.2. Installing etcd from the binary package

  • Download and extract

    # Version to download
    ETCD_VER=v3.4.9
    INSTALL_DIR=/opt
    
    # Google storage is an alternative source; it is commented out because it is unreachable from mainland China
    # GOOGLE_URL=https://storage.googleapis.com/etcd
    GITHUB_URL=https://github.com/etcd-io/etcd/releases/download
    DOWNLOAD_URL=${GITHUB_URL}
    
    # Clean up earlier downloads
    rm -f ${INSTALL_DIR}/etcd-${ETCD_VER}-linux-amd64.tar.gz
    rm -rf ${INSTALL_DIR}/etcd && mkdir -p ${INSTALL_DIR}/etcd
    
    # Download
    curl -L ${DOWNLOAD_URL}/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o ${INSTALL_DIR}/etcd-${ETCD_VER}-linux-amd64.tar.gz
    # Extract
    tar xzvf ${INSTALL_DIR}/etcd-${ETCD_VER}-linux-amd64.tar.gz -C ${INSTALL_DIR}/etcd --strip-components=1
    # Remove the tarball
    rm -f ${INSTALL_DIR}/etcd-${ETCD_VER}-linux-amd64.tar.gz
    
    # Test
    ${INSTALL_DIR}/etcd/etcd --version
    ${INSTALL_DIR}/etcd/etcdctl version
    
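  • Add etcd to PATH

    # The quoted 'EOF' keeps $PATH literal so it is expanded at login time, not while writing the file
    cat << 'EOF' > /etc/profile.d/etcd.sh
    export PATH=$PATH:/opt/etcd
    EOF
    
    source /etc/profile
    
  • Generate a longer token for safety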

    $ echo k8s-cluster|md5sum
    ea8cfe2bfe85b7e6c66fe190f9225838  -
    
  • Configuration file /etc/etcd/etcd.conf (one per node)

    master1

    DATA_DIR=/data/etcd
    HOST_NAME=master1
    HOST_IP=10.0.1.204
    CLUSTER=master1=http://10.0.1.204:2380,master2=http://10.0.1.67:2380,master3=http://10.0.1.236:2380
    CLUSTER_STATE=new
    TOKEN=ea8cfe2bfe85b7e6c66fe190f9225838
    

    master2

    DATA_DIR=/data/etcd
    HOST_NAME=master2
    HOST_IP=10.0.1.67
    CLUSTER=master1=http://10.0.1.204:2380,master2=http://10.0.1.67:2380,master3=http://10.0.1.236:2380
    CLUSTER_STATE=new
    TOKEN=ea8cfe2bfe85b7e6c66fe190f9225838
    

    master3

    DATA_DIR=/data/etcd
    HOST_NAME=master3
    HOST_IP=10.0.1.236
    CLUSTER=master1=http://10.0.1.204:2380,master2=http://10.0.1.67:2380,master3=http://10.0.1.236:2380
    CLUSTER_STATE=new
    TOKEN=ea8cfe2bfe85b7e6c66fe190f9225838
    
  • Create the systemd unit file /usr/lib/systemd/system/etcd.service (RHEL family) or /lib/systemd/system/etcd.service (Ubuntu family)

    [Unit]
    Description=Etcd Server
    After=network.target
    After=network-online.target
    Wants=network-online.target
    
    [Service]
    Type=notify
    WorkingDirectory=/data/etcd
    EnvironmentFile=-/etc/etcd/etcd.conf
    User=etcd
    # set GOMAXPROCS to number of processors
    ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /opt/etcd/etcd \
              --data-dir ${DATA_DIR} \
              --name \"${HOST_NAME}\" \
              --initial-advertise-peer-urls http://${HOST_IP}:2380 \
              --listen-peer-urls http://${HOST_IP}:2380 \
              --advertise-client-urls http://${HOST_IP}:2379 \
              --listen-client-urls http://${HOST_IP}:2379 \
              --initial-cluster ${CLUSTER} \
              --initial-cluster-state ${CLUSTER_STATE} \
              --initial-cluster-token ${TOKEN}"
    Restart=on-failure
    LimitNOFILE=65536
    
    [Install]
    WantedBy=multi-user.target
    
  • Start

    systemctl daemon-reload
    systemctl start etcd
    
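  • Check the status

    etcdctl endpoint health
    

    Note: without arguments, etcdctl endpoint health talks to the local port 2379, that is 127.0.0.1:2379. The cluster configured above does not listen on the loopback address, so the endpoint has to be given with the --endpoints flag. Otherwise the command fails with:

    127.0.0.1:2379 is unhealthy: failed to commit proposal: context deadline exceeded
    

    For the plain-http cluster configured above, passing the http URL is enough, for example etcdctl --endpoints=http://10.0.1.204:2379 endpoint health. Once the cluster is switched to https (section 4), --endpoints must use an https URL and the certificates have to be passed as well (adjust the paths to wherever the certificate files live):

    etcdctl --endpoints=https://10.0.1.236:2379 --cacert=/etc/k8s/ssl/etcd-root-ca.pem --key=/etc/k8s/ssl/etcd-key.pem  --cert=/etc/k8s/ssl/etcd.pem  endpoint health 
    

    Certificates have not been configured yet at this point. The logs are another way to check whether etcd is running normally:

    journalctl -u etcd
    

    etcd does not write to a log file of its own by default; under systemd its output is handled by journald. The --log-outputs flag in the help text below can change the targets: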

    Logging:
      --logger 'capnslog'
        Specify 'zap' for structured logging or 'capnslog'. [WARN] 'capnslog' will be deprecated in v3.5.
      --log-outputs 'default'
        Specify 'stdout' or 'stderr' to skip journald logging even when running under systemd, or list of comma separated output targets.
      --log-level 'info'
        Configures log level. Only supports debug, info, warn, error, panic, or fatal.
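
The three etcd.conf files in this section differ only in HOST_NAME and HOST_IP, so generating them from one member list avoids copy-and-paste mistakes. The following is a sketch under that assumption; render_etcd_conf is a helper invented for this example, and the values are the ones used above.

```shell
#!/bin/sh
# Render the content of /etc/etcd/etcd.conf for one member.
# Usage: render_etcd_conf <scheme> <token> <this_member_name> <name=ip>...
render_etcd_conf() {
    scheme=$1; token=$2; self=$3
    shift 3
    cluster=""; self_ip=""
    for member in "$@"; do
        name=${member%%=*}
        ip=${member#*=}
        # append "name=scheme://ip:2380", comma-separated
        cluster="${cluster}${cluster:+,}${name}=${scheme}://${ip}:2380"
        if [ "$name" = "$self" ]; then
            self_ip=$ip
        fi
    done
    if [ -z "$self_ip" ]; then
        echo "unknown member: $self" >&2
        return 1
    fi
    cat <<EOF
DATA_DIR=/data/etcd
HOST_NAME=${self}
HOST_IP=${self_ip}
CLUSTER=${cluster}
CLUSTER_STATE=new
TOKEN=${token}
EOF
}

# Print the file for master2; redirect to /etc/etcd/etcd.conf on that node.
render_etcd_conf http ea8cfe2bfe85b7e6c66fe190f9225838 master2 \
    master1=10.0.1.204 master2=10.0.1.67 master3=10.0.1.236
```

Passing https instead of http as the first argument produces the CLUSTER line needed after TLS is enabled.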
    

4. Configuring a secure etcd

4.1. Creating certificates

See GitHub for reference.

4.1.1. Preparing the environment

  • Download the cfssl tools

    curl -s -L -o /usr/local/bin/cfssl https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
    curl -s -L -o /usr/local/bin/cfssljson https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
    chmod +x /usr/local/bin/{cfssl,cfssljson}
    
  • Test it

    $ cfssl
    No command is given.
    Usage:
    Available commands:
      serve
      version
      genkey
      gencrl
      ocsprefresh
      selfsign
      scan
      print-defaults
      revoke
      bundle
      sign
      gencert
      ocspdump
      ocspserve
      info
      certinfo
      ocspsign
    Top-level flags:
      -allow_verification_with_non_compliant_keys
          Allow a SignatureVerifier to use keys which are technically non-compliant with RFC6962.
      -loglevel int
          Log level (0 = DEBUG, 5 = FATAL) (default 1)
    
  • Create the working directory

    $ mkdir -p /etc/kubernetes/pki/etcd
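
Both cfssl and cfssljson are needed in the following steps, so a quick presence check before continuing can save time. This is a minimal sketch; missing_tools is a helper written for this example.

```shell
#!/bin/sh
# Print the names of any required tools that are not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools cfssl cfssljson)
if [ -n "$missing" ]; then
    # unquoted on purpose: joins the names onto one line
    echo "not installed: $(echo $missing)" >&2
else
    echo "cfssl and cfssljson are both available"
fi
```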
    

4.1.2. Creating the CA certificate

  • Create the CA configuration files (generated from the defaults)

    cd /etc/kubernetes/pki/etcd
    cfssl print-defaults config > ca-config.json
    cfssl print-defaults csr > ca-csr.json
    
  • Edit ca-config.json

    {
        "signing": {
            "default": {
                "expiry": "43800h"
            },
            "profiles": {
                "server": {
                    "expiry": "43800h",
                    "usages": [
                        "signing",
                        "key encipherment",
                        "server auth"
                    ]
                },
                "client": {
                    "expiry": "43800h",
                    "usages": [
                        "signing",
                        "key encipherment",
                        "client auth"
                    ]
                },
                "peer": {
                    "expiry": "43800h",
                    "usages": [
                        "signing",
                        "key encipherment",
                        "server auth",
                        "client auth"
                    ]
                }
            }
        }
    }
    
  • Edit ca-csr.json

    {
        "CN": "My own CA",
        "key": {
            "algo": "rsa",
            "size": 2048
        },
        "names": [
            {
                "C": "US",
                "L": "San Francisco",
                "O": "My Company Name",
                "ST": "CA",
                "OU": "Org Unit 1"
            }
        ]
    }
    
  • Create the CA certificate

    cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
    
  • This yields three files

    ca-key.pem
    ca.csr
    ca.pem
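
The expiry values in ca-config.json are given in hours, which is hard to read at a glance. The two helpers below, written only for this example, convert such a value into days and 365-day years.

```shell
#!/bin/sh
# Convert a cfssl-style expiry such as "43800h" into whole days and 365-day years.
expiry_days()  { echo $(( ${1%h} / 24 )); }
expiry_years() { echo $(( ${1%h} / 24 / 365 )); }

expiry_days 43800h    # prints 1825
expiry_years 43800h   # prints 5
```

So the 43800h used for the profiles above corresponds to five years, after which the certificates have to be reissued.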
    

4.1.3. Creating the server certificate

  • Generate the configuration file

    cfssl print-defaults csr > server.json
    
  • Edit the CN and hosts sections of server.json

    {
        "CN": "etcd",
        "hosts": [
            "127.0.0.1",
            "10.0.1.204",
            "10.0.1.67",
            "10.0.1.236"
        ],
        "key": {
            "algo": "ecdsa",
            "size": 256
        },
        "names": [
            {
                "C": "US",
                "L": "San Francisco",
                "ST": "CA"
            }
        ]
    }
    
  • Generate the certificate

    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server.json | cfssljson -bare server
    
  • Again three files (this is the certificate the server uses at startup)

    server-key.pem
    server.csr
    server.pem
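
The certificate requests in this chapter share one shape and differ mainly in CN and hosts. As a sketch, the request file can be rendered from a host list instead of editing it by hand; render_csr_json is a helper invented for this example, and the names block uses placeholder values.

```shell
#!/bin/sh
# Render a cfssl certificate request such as server.json.
# Usage: render_csr_json <CN> <host>...
render_csr_json() {
    cn=$1; shift
    hosts=""
    for h in "$@"; do
        # build a comma-separated list of quoted hosts
        hosts="${hosts}${hosts:+, }\"${h}\""
    done
    cat <<EOF
{
    "CN": "${cn}",
    "hosts": [${hosts}],
    "key": { "algo": "ecdsa", "size": 256 },
    "names": [ { "C": "US", "L": "San Francisco", "ST": "CA" } ]
}
EOF
}

# Print the request for the three cluster nodes; redirect to server.json to use it.
render_csr_json etcd 127.0.0.1 10.0.1.204 10.0.1.67 10.0.1.236
```

Every address that clients or peers use to reach a node has to appear in hosts, otherwise TLS verification of that connection fails.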
    

4.1.4. Creating the peer certificate for communication between servers

  • Generate the configuration file

    cfssl print-defaults csr > members.json
    
  • Edit the CN and hosts sections of members.json

    {
        "CN": "members",
        "hosts": [
            "127.0.0.1",
            "10.0.1.204",
            "10.0.1.67",
            "10.0.1.236"
        ],
        "key": {
            "algo": "ecdsa",
            "size": 256
        },
        "names": [
            {
                "C": "US",
                "L": "San Francisco",
                "ST": "CA"
            }
        ]
    }
    
  • Generate the certificate

    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer members.json | cfssljson -bare members
    
  • Three files

    members-key.pem
    members.csr
    members.pem
    

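4.1.5. Creating the client certificate

  • Generate the configuration file

    cfssl print-defaults csr > client.json
    
  • Edit the CN and hosts sections of client.json

    Note: in general, when etcd is deployed by hand, the three etcd machines are set apart architecturally and managed as a database, so the client hosts would be IP addresses outside the etcd nodes. In this lab the three etcd machines themselves act as clients, so the addresses are still those three machines. If in doubt, list the IPs and DNS names of all machines in hosts, or leave hosts empty (as in the example below) to avoid errors, although that is not the most secure choice.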

    {
        "CN": "client",
        "hosts": [""],
        "key": {
            "algo": "ecdsa",
            "size": 256
        },
        "names": [
            {
                "C": "US",
                "L": "San Francisco",
                "ST": "CA"
            }
        ]
    }
    
  • Generate the certificate

    cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client client.json | cfssljson -bare client
    
  • This produces another set of certificates

    client-key.pem
    client.csr
    client.pem
    

4.2. Configuring etcd to use the SSL certificates

  • Copy all the certificates just generated (every file under /etc/kubernetes/pki/etcd) to the other etcd machines.

  • Update the unit file

[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
WorkingDirectory=/data/etcd
EnvironmentFile=-/etc/etcd/etcd.conf
User=etcd
# set GOMAXPROCS to number of processors
ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /opt/etcd/etcd \
          --data-dir ${DATA_DIR} \
          --name ${HOST_NAME} \
          --initial-advertise-peer-urls https://${HOST_IP}:2380 \
          --listen-peer-urls https://${HOST_IP}:2380 \
          --advertise-client-urls https://${HOST_IP}:2379 \
          --listen-client-urls https://127.0.0.1:2379,https://${HOST_IP}:2379 \
          --listen-metrics-urls=http://127.0.0.1:2381 \
          --initial-cluster ${CLUSTER} \
          --initial-cluster-state ${CLUSTER_STATE} \
          --initial-cluster-token ${TOKEN} \
          --client-cert-auth \
          --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem \
          --cert-file=/etc/kubernetes/pki/etcd/server.pem \
          --key-file=/etc/kubernetes/pki/etcd/server-key.pem \
          --peer-client-cert-auth \
          --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.pem \
          --peer-cert-file=/etc/kubernetes/pki/etcd/members.pem \
          --peer-key-file=/etc/kubernetes/pki/etcd/members-key.pem"
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
  • Update the configuration file /etc/etcd/etcd.conf

    master1

    DATA_DIR=/data/etcd
    HOST_NAME=master1
    HOST_IP=10.0.1.204
    CLUSTER=master1=https://10.0.1.204:2380,master2=https://10.0.1.67:2380,master3=https://10.0.1.236:2380
    CLUSTER_STATE=new
    TOKEN=ea8cfe2bfe85b7e6c66fe190f9225838
    

    master2

    DATA_DIR=/data/etcd
    HOST_NAME=master2
    HOST_IP=10.0.1.67
    CLUSTER=master1=https://10.0.1.204:2380,master2=https://10.0.1.67:2380,master3=https://10.0.1.236:2380
    CLUSTER_STATE=new
    TOKEN=ea8cfe2bfe85b7e6c66fe190f9225838
    

    master3

    DATA_DIR=/data/etcd
    HOST_NAME=master3
    HOST_IP=10.0.1.236
    CLUSTER=master1=https://10.0.1.204:2380,master2=https://10.0.1.67:2380,master3=https://10.0.1.236:2380
    CLUSTER_STATE=new
    TOKEN=ea8cfe2bfe85b7e6c66fe190f9225838
    
  • Fix the permissions

    chmod 400 /etc/kubernetes/pki/etcd/*
    chown -R etcd:adm /etc/kubernetes/pki/etcd/
    
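  • Restarting the cluster at this point fails with

    error "tls: first record does not look like a TLS handshake"
    
  • Delete the data files and start again. This is most likely needed because the existing data directory still holds the membership bootstrapped with http peer URLs, so the members keep contacting each other over plain http. Deleting the data wipes everything stored in etcd, which is only acceptable on a fresh cluster like this one.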
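
With TLS enabled, every etcdctl call needs the endpoint and the three certificate flags. etcdctl also reads these settings from ETCDCTL_* environment variables, so they can be set once per shell. The sketch below prints the export lines; etcdctl_tls_env is a helper written for this example and assumes the client certificate from section 4.1.5 in /etc/kubernetes/pki/etcd.

```shell
#!/bin/sh
# Print export lines that point etcdctl at a TLS-enabled endpoint.
# Usage: etcdctl_tls_env <cert_dir> <host_ip>
etcdctl_tls_env() {
    cert_dir=$1; host_ip=$2
    cat <<EOF
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://${host_ip}:2379
export ETCDCTL_CACERT=${cert_dir}/ca.pem
export ETCDCTL_CERT=${cert_dir}/client.pem
export ETCDCTL_KEY=${cert_dir}/client-key.pem
EOF
}

etcdctl_tls_env /etc/kubernetes/pki/etcd 10.0.1.204
```

Running eval "$(etcdctl_tls_env /etc/kubernetes/pki/etcd 10.0.1.204)" in a shell and then etcdctl endpoint health should check the node without any extra flags. Since the key files are mode 400 and owned by etcd, the command has to run as a user that can read them.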
