Installing a Near-Production OpenShift 4 Cluster

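OpenShift is a PaaS developed by Red Hat, and using it requires a paid subscription. Its community edition is OKD. The two are installed in almost the same way and differ only in the operating system and the upper-layer application software. This article covers installing OKD.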
Cluster environment

[Figure: cluster host configuration]

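Notes:
The cluster hosts are ordinary PCs: real physical machines, not virtual machines.
Hosts in every role should have at least 16 GB of memory, especially the "Worker Host" machines, because deploying "openshift-logging" alone consumes a considerable amount of memory. The more memory, the better.
If you plan to deploy "Storage" (such as Ceph) on the "Worker Host" machines, install the required non-system disks before building the cluster; otherwise you will have to shut the hosts down later to add them.
Building the cluster downloads many images from Quay.io. If your network is slow, installation will take a very long time, so consider setting up a registry mirror or otherwise improving your network.
The installation walkthrough below has six parts: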
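I. DHCP and DNS
DHCP and DNS must be configured before the cluster is installed.
1. DHCP
The cluster hosts install their operating system via PXE and obtain their network settings from DHCP. Configuring DHCP comes down to the two points below (I used the DHCP service built into Windows Server 2008):
(1) Bind each host's IP address to its MAC address, which makes the DNS configuration easier.
(2) Configure the PXE-related options.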
[Figure: DHCP configuration]

The boot file used here is "lpxelinux.0"; the reason is explained later.
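2. DNS
I had originally planned to use the DNS service built into Windows Server 2008 as well, but a CoreDNS container happened to be available, so I used that instead. Its zone file is as follows: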

$ORIGIN okd-infra.wumi.ai.     ; designates the start of this zone file in the namespace
$TTL 1h                  ; default expiration time of all resource records without their own TTL value
okd-infra.wumi.ai.  IN  SOA   ns.okd-infra.wumi.ai. host-1.example.xyz. ( 2007120710 1d 2h 4w 1h )
okd-infra.wumi.ai.  IN  NS    ns                    ; ns.okd-infra.wumi.ai is a nameserver for okd-infra.wumi.ai
okd-infra.wumi.ai.  IN  A     10.1.95.9             ; IPv4 address for okd-infra.wumi.ai
ns            IN  A     10.1.95.9             ; IPv4 address for ns.okd-infra.wumi.ai

bootstrap     IN  A     10.1.99.7 
master-1      IN  A     10.1.99.11           
master-2      IN  A     10.1.99.3           
master-3      IN  A     10.1.99.8           
worker-1      IN  A     10.1.99.14           
worker-2      IN  A     10.1.99.15           
worker-3      IN  A     10.1.99.16           

etcd-0        IN  A     10.1.99.11          
etcd-1        IN  A     10.1.99.3          
etcd-2        IN  A     10.1.99.8          
_etcd-server-ssl._tcp 86400 IN SRV 0 10 2380 etcd-0 
_etcd-server-ssl._tcp 86400 IN SRV 0 10 2380 etcd-1 
_etcd-server-ssl._tcp 86400 IN SRV 0 10 2380 etcd-2 

api              IN  A     10.1.95.9          ; host-1 haproxy
api-int        IN  A     10.1.95.9          ; host-1 haproxy
*.apps        IN  A     10.1.95.9          ; host-1 haproxy
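
Before moving on, it can help to confirm that every name in the zone resolves. The following is a minimal sketch, not part of the original setup: the function names are hypothetical, "test.apps" is an arbitrary label chosen to exercise the wildcard record, and the second function assumes that dig is installed.

```shell
# Print every FQDN the zone above should resolve, one per line.
# $1 is the cluster domain (<metadata.name>.<baseDomain>).
okd_dns_names() {
    domain=$1
    for host in api api-int bootstrap master-1 master-2 master-3 \
                worker-1 worker-2 worker-3 etcd-0 etcd-1 etcd-2 \
                test.apps; do
        printf '%s.%s\n' "$host" "$domain"
    done
}

# Query each name against the DNS server in $2 and report the ones
# that return no A record. Assumes dig is available.
check_okd_dns() {
    okd_dns_names "$1" | while read -r name; do
        addr=$(dig +short "@$2" "$name" A)
        [ -n "$addr" ] || echo "MISSING: $name"
    done
}
```

With the addresses from the zone file, a call would look like "check_okd_dns okd-infra.wumi.ai 10.1.95.9"; no output means every name resolved.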

II. HAProxy
HAProxy load-balances access to the API server and the Ingress. Here is the configuration file:
/etc/haproxy/haproxy.cfg

defaults
    mode                    tcp
    option                  dontlognull
    timeout connect      10s
    timeout client          1m
    timeout server          1m
    
    
#---------------------------------------------------------------------
frontend openshift-api-server
    bind    10.1.95.9:6443
    default_backend api-backend
    mode tcp
#---------------------------------------------------------------------
backend api-backend
    balance source
    mode        tcp
#    server bootstrap   10.1.99.7:6443  check port 6443
    server  master-1    10.1.99.11:6443  check port 6443
    server  master-2    10.1.99.3:6443  check port 6443
    server  master-3    10.1.99.8:6443  check port 6443


#---------------------------------------------------------------------
frontend machine-config-server
    bind    10.1.95.9:22623
    default_backend machine-config-server
    mode tcp
#---------------------------------------------------------------------
backend machine-config-server
    balance source
    mode        tcp
#    server bootstrap   10.1.99.7:22623  check port 22623
    server  master-1    10.1.99.11:22623  check port 22623
    server  master-2    10.1.99.3:22623  check port 22623
    server  master-3    10.1.99.8:22623  check port 22623


#---------------------------------------------------------------------
frontend ingress-http
    bind    10.1.95.9:80
    default_backend ingress-http
    mode tcp
#---------------------------------------------------------------------
backend ingress-http
    balance source
    mode        tcp
    server  worker-1    10.1.99.14:80  check port 80
    server  worker-2    10.1.99.15:80  check port 80
    server  worker-3    10.1.99.16:80  check port 80


#---------------------------------------------------------------------
frontend ingress-https
    bind    10.1.95.9:443
    default_backend ingress-https
    mode tcp
#---------------------------------------------------------------------
backend ingress-https
    balance source
    mode        tcp
    server  worker-1    10.1.99.14:443  check port 443
    server  worker-2    10.1.99.15:443  check port 443
    server  worker-3    10.1.99.16:443  check port 443


#---------------------------------------------------------------------
listen  admin_stats  # web admin page
    bind 0.0.0.0:8081  
    mode http
    log 127.0.0.1 local0 err
    stats refresh 10s
    stats uri /haproxy
    stats realm welcome login\ Haproxy
    stats hide-version
    stats admin if TRUE

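In the initial configuration, do not comment out the bootstrap lines in "backend machine-config-server" and "backend api-backend" (the listing above shows the final state, with those lines commented out). During installation, once the command "./openshift-install --dir=<installation_directory> wait-for bootstrap-complete --log-level=info" reports that the bootstrap can be removed, comment out the bootstrap lines.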
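
Commenting out the bootstrap servers can be scripted so that both backends change together. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the usage lines assume a systemd-managed HAProxy.

```shell
# Read an haproxy.cfg on stdin and write it to stdout with every
# "server bootstrap ..." line commented out. All other lines,
# including lines that are already commented, pass through unchanged.
comment_out_bootstrap() {
    sed -E 's/^([[:space:]]*)(server[[:space:]]+bootstrap[[:space:]])/\1#\2/'
}

# Possible usage (assumes systemd; validate before reloading):
#   comment_out_bootstrap < /etc/haproxy/haproxy.cfg > /tmp/haproxy.cfg.new
#   haproxy -c -f /tmp/haproxy.cfg.new \
#     && sudo cp /tmp/haproxy.cfg.new /etc/haproxy/haproxy.cfg \
#     && sudo systemctl reload haproxy
```

Running "haproxy -c -f" on the new file first keeps a typo from taking the load balancer down in the middle of the installation.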
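III. Download the required software and prepare the installation configuration file
1. Download the required software
(1) Download the cluster installer, openshift-install, from "https://github.com/openshift/okd/releases". This tool helps deploy OpenShift 4 clusters on public clouds and on-premises infrastructure.
(2) Install the cluster management tool, oc: download the latest version from "https://mirror.openshift.com/pub/openshift-v4/clients/oc/latest/". oc connects to and manages the cluster from the command line.
2. Customize the installation configuration file
Installing an OpenShift 4 cluster is completely different from installing OpenShift 3. Before installing, you need to write a customized installation configuration file. Below is a sample (the file must be named install-config.yaml):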

apiVersion: v1
baseDomain: wumi.ai
compute:
- hyperthreading: Enabled
  name: worker
  replicas: 0   # must be 0 when deploying on self-managed physical machines
controlPlane:
  hyperthreading: Enabled
  name: master
  replicas: 3
metadata:
  name: okd-infra
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
    none: {}
fips: false
pullSecret: 'pullsecret obtained from redhat'
sshKey: 'sshkey that is created by ssh-keygen command'

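Three hosts in this cluster act as "master"; they mainly run the API server, the etcd cluster, and so on.
"pullSecret": obtained from the Red Hat website. The images needed to deploy the cluster are stored on Quay.io, and this secret is used to authenticate and pull them.
"sshKey": an SSH public key, which can be generated with the "ssh-keygen" command. From a host that holds the corresponding private key, you can ssh into the cluster servers without entering a password, which is convenient for debugging the cluster.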
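IV. Generate the Kubernetes manifests and the Ignition configuration files
1. Generate the Kubernetes manifests
Create a directory named "config-install", copy the "install-config.yaml" written in the previous step into it, and then run the following command: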

./openshift-install create manifests --dir=config-install

The installer then generates the manifests in the "config-install" directory (install-config.yaml is consumed in the process).
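
Because install-config.yaml is consumed, it is worth keeping a copy before running the installer so the manifests can be regenerated later. This is a minimal sketch; the function name, the timestamped file name, and the default backup location ($HOME) are all assumptions.

```shell
# Copy <install dir>/install-config.yaml to a timestamped backup.
# $1: install directory, $2: backup directory (defaults to $HOME).
# Prints the path of the backup on success.
backup_install_config() {
    src="$1/install-config.yaml"
    dest="${2:-$HOME}/install-config.yaml.$(date +%Y%m%d%H%M%S).bak"
    cp -p "$src" "$dest" && printf '%s\n' "$dest"
}
```

For example, "backup_install_config config-install" would be run just before "./openshift-install create manifests --dir=config-install".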
We do not intend to run user pods on the masters, so edit "config-install/manifests/cluster-scheduler-02-config.yml", set the "mastersSchedulable" parameter to "false", then save and exit.
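
The same change can be made without opening an editor. This is a minimal sketch, assuming the generated manifest contains the literal text "mastersSchedulable: true"; the function name is hypothetical.

```shell
# Rewrite "mastersSchedulable: true" to "mastersSchedulable: false"
# in the manifest given as $1. A temp file is used instead of
# "sed -i" so the function behaves the same on GNU and BSD sed.
set_masters_unschedulable() {
    file=$1
    tmp=$(mktemp) || return 1
    sed 's/mastersSchedulable:[[:space:]]*true/mastersSchedulable: false/' "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

It would be called as "set_masters_unschedulable config-install/manifests/cluster-scheduler-02-config.yml"; a quick grep for "mastersSchedulable" afterwards confirms the result.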
2. Generate the Ignition configuration files, which customize CoreOS (every host in an OpenShift 4 cluster must run CoreOS)

 ./openshift-install create ignition-configs --dir=config-install

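The installer then generates the Ignition files in the "config-install" directory (the manifests are consumed in the process).
V. Set up the PXE installation environment
With the Ignition files and the OS image files in hand, set up the PXE installation environment. The Ignition files, kernel, initrd, and so on are served to the cluster hosts over HTTP, so an HTTP server and a TFTP server need to be configured first.
1. TFTP server
Configuring TFTP involves two things:
The PXE boot file must be "lpxelinux.0", because only then can HTTP be used;
the pxelinux.cfg configuration: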

# D-I config version 2.0
# search path for the c32 support libraries (libcom32, libutil etc.)
path debian-installer/amd64/boot-screens/
include debian-installer/amd64/boot-screens/menu.cfg
default debian-installer/amd64/boot-screens/vesamenu.c32
prompt 0
timeout 0

label fedora-coreos-bootstrap
  KERNEL http://10.1.95.10:8000/fedora-coreos-32.20200923.3.0-live-kernel-x86_64
  APPEND ip=dhcp initrd=http://10.1.95.10:8000/fedora-coreos-32.20200923.3.0-live-initramfs.x86_64.img \
  console=tty0 console=ttyS0 coreos.inst.install_dev=/dev/sda \
  coreos.inst.ignition_url=http://10.1.95.10:8000/bootstrap.ign \
  coreos.live.rootfs_url=http://10.1.95.10:8000/fedora-coreos-32.20200923.3.0-live-rootfs.x86_64.img

label fedora-coreos-master
  KERNEL http://10.1.95.10:8000/fedora-coreos-32.20200923.3.0-live-kernel-x86_64
  APPEND ip=dhcp initrd=http://10.1.95.10:8000/fedora-coreos-32.20200923.3.0-live-initramfs.x86_64.img \
  console=tty0 console=ttyS0 coreos.inst.install_dev=/dev/sda \
  coreos.inst.ignition_url=http://10.1.95.10:8000/master.ign \
  coreos.live.rootfs_url=http://10.1.95.10:8000/fedora-coreos-32.20200923.3.0-live-rootfs.x86_64.img

label fedora-coreos-worker
  KERNEL http://10.1.95.10:8000/fedora-coreos-32.20200923.3.0-live-kernel-x86_64
  APPEND ip=dhcp initrd=http://10.1.95.10:8000/fedora-coreos-32.20200923.3.0-live-initramfs.x86_64.img \
  console=tty0 console=ttyS0 coreos.inst.install_dev=/dev/sda \
  coreos.inst.ignition_url=http://10.1.95.10:8000/worker.ign \
  coreos.live.rootfs_url=http://10.1.95.10:8000/fedora-coreos-32.20200923.3.0-live-rootfs.x86_64.img
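
The three stanzas differ only in the role name, so they can be generated rather than copied by hand. This is a minimal sketch: the function name and argument order are hypothetical, and it writes APPEND as a single line instead of using backslash continuations.

```shell
# Print one pxelinux "label" stanza for a node role.
# $1: role (bootstrap, master, or worker)
# $2: base URL of the HTTP server
# $3: Fedora CoreOS version string
# $4: install disk (defaults to /dev/sda)
pxe_label() {
    role=$1 base=$2 ver=$3 disk=${4:-/dev/sda}
    img="$base/fedora-coreos-$ver-live"
    printf 'label fedora-coreos-%s\n' "$role"
    printf '  KERNEL %s-kernel-x86_64\n' "$img"
    printf '  APPEND ip=dhcp initrd=%s-initramfs.x86_64.img ' "$img"
    printf 'console=tty0 console=ttyS0 coreos.inst.install_dev=%s ' "$disk"
    printf 'coreos.inst.ignition_url=%s/%s.ign ' "$base" "$role"
    printf 'coreos.live.rootfs_url=%s-rootfs.x86_64.img\n\n' "$img"
}

# Possible usage, with the values from the configuration above:
#   for role in bootstrap master worker; do
#       pxe_label "$role" http://10.1.95.10:8000 32.20200923.3.0
#   done
```

Generating the stanzas from one template keeps the kernel, initramfs, and rootfs versions in step when the images are upgraded.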

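2. HTTP server
I used nginx for the HTTP service. There is nothing worth showing in the nginx configuration file: just put the required image files in the "/var/www/html" directory. The cluster hosts request these files during the PXE installation stage: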

aneirin@vm-1:/var/www/html$ ls -lh
total 732M
-rwxrwxrwx 1 root root 297K Oct 16 15:32 bootstrap.ign
-rwxrwxrwx 1 root root  70M Oct 15 10:44 fedora-coreos-32.20200923.3.0-live-initramfs.x86_64.img
-rwxrwxrwx 1 root root  12M Oct 15 10:44 fedora-coreos-32.20200923.3.0-live-kernel-x86_64
-rwxrwxrwx 1 root root 651M Oct 15 10:45 fedora-coreos-32.20200923.3.0-live-rootfs.x86_64.img
-rwxrwxrwx 1 root root  11K Sep  5  2019 index.html    (file shipped with nginx)
-rwxrwxrwx 1 root root  612 Apr 22 11:50 index.nginx-debian.html    (file shipped with nginx)
-rwxrwxrwx 1 root root 1.9K Oct 16 15:32 master.ign
-rwxrwxrwx 1 root root 1.9K Oct 16 15:32 worker.ign
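
Copying the Ignition files into the web root can be scripted as below. This is a minimal sketch with a hypothetical function name. It sets mode 644 rather than the 777 seen in the listing, on the assumption that nginx only needs read access to the files.

```shell
# Copy bootstrap.ign, master.ign, and worker.ign from the install
# directory ($1) to the web root ($2) and make them world-readable.
publish_ignition() {
    src=$1 dest=$2
    for role in bootstrap master worker; do
        cp "$src/$role.ign" "$dest/$role.ign" &&
            chmod 644 "$dest/$role.ign" || return 1
    done
}
```

With the paths used in this article the call would be "publish_ignition config-install /var/www/html", typically run with sudo. The Ignition files embed cluster certificates, so it seems prudent to remove them from the web server once the installation has finished.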

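VI. Install the cluster
Once the PXE environment is ready, install the operating system on the cluster hosts. Install the seven hosts in order (bootstrap -> master -> worker). There is no need to wait for one host to finish before starting the next; installing them all at the same time also works.
Use the command "./openshift-install --dir=<installation_directory> wait-for bootstrap-complete --log-level=info" to follow the bootstrap process. When it prompts you to remove the bootstrap, remove the bootstrap entries from the HAProxy configuration file; the bootstrap host's job is then done.
1. Configure the login credentials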

export KUBECONFIG=<installation_directory>/auth/kubeconfig

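You can put this line in "~/.bashrc" so that the KUBECONFIG environment variable is set in every new shell session.
2. Connect to the cluster and approve CSRs
After the previous step, you can connect to the cluster as the user "system:admin", which has super-administrator privileges over the cluster (I recommend disabling this account once the cluster is installed, because it is a security risk). Some objects generate CSRs that must be approved before component installation can continue: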

oc get csr                                # list the CSRs that need approval
oc adm certificate approve <csr_name>     # approve the specified CSR

After the approvals, the output looks like this:

[Figure: Approving CSR]
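
With seven hosts there are several CSRs, so approving them one at a time gets tedious. This is a minimal sketch with hypothetical function names; it assumes the last column of "oc get csr --no-headers" is the condition, which may differ between versions.

```shell
# Read "oc get csr --no-headers" output on stdin and print the names
# of the requests whose last column (the condition) is "Pending".
pending_csrs() {
    awk '$NF == "Pending" { print $1 }'
}

# Approve every pending CSR; do nothing if there are none.
approve_pending_csrs() {
    names=$(oc get csr --no-headers | pending_csrs)
    # $names is left unquoted on purpose: one argument per CSR name.
    [ -z "$names" ] || oc adm certificate approve $names
}
```

New requests tend to appear after the first batch is approved, so "approve_pending_csrs" may need to be run more than once until "oc get csr" shows nothing pending.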

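3. Wait for the clusteroperators to finish installing
The OKD infrastructure components depend heavily on the various "clusteroperators"; wait until every entry in the "AVAILABLE" column is "True".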

aneirin@host-1:~$ oc get clusteroperators
NAME                                       VERSION                         AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.5.0-0.okd-2020-10-03-012432   True        False         False      3h22m
cloud-credential                           4.5.0-0.okd-2020-10-03-012432   True        False         False      3h55m
cluster-autoscaler                         4.5.0-0.okd-2020-10-03-012432   True        False         False      3h29m
config-operator                            4.5.0-0.okd-2020-10-03-012432   True        False         False      3h29m
console                                    4.5.0-0.okd-2020-10-03-012432   True        False         False      3h24m
csi-snapshot-controller                    4.5.0-0.okd-2020-10-03-012432   True        False         False      3h33m
dns                                        4.5.0-0.okd-2020-10-03-012432   True        False         False      3h41m
etcd                                       4.5.0-0.okd-2020-10-03-012432   True        False         False      3h42m
image-registry                             4.5.0-0.okd-2020-10-03-012432   True        False         False      3h36m
ingress                                    4.5.0-0.okd-2020-10-03-012432   True        False         False      3h27m
insights                                   4.5.0-0.okd-2020-10-03-012432   True        False         False      3h36m
kube-apiserver                             4.5.0-0.okd-2020-10-03-012432   True        False         False      3h42m
kube-controller-manager                    4.5.0-0.okd-2020-10-03-012432   True        False         False      3h42m
kube-scheduler                             4.5.0-0.okd-2020-10-03-012432   True        False         False      3h40m
kube-storage-version-migrator              4.5.0-0.okd-2020-10-03-012432   True        False         False      3h27m
machine-api                                4.5.0-0.okd-2020-10-03-012432   True        False         False      3h35m
machine-approver                           4.5.0-0.okd-2020-10-03-012432   True        False         False      3h40m
machine-config                             4.5.0-0.okd-2020-10-03-012432   True        False         False      3h28m
marketplace                                4.5.0-0.okd-2020-10-03-012432   True        False         False      3h35m
monitoring                                 4.5.0-0.okd-2020-10-03-012432   True        False         False      3h25m
network                                    4.5.0-0.okd-2020-10-03-012432   True        False         False      3h44m
node-tuning                                4.5.0-0.okd-2020-10-03-012432   True        False         False      3h44m
openshift-apiserver                        4.5.0-0.okd-2020-10-03-012432   True        False         False      3h27m
openshift-controller-manager               4.5.0-0.okd-2020-10-03-012432   True        False         False      3h34m
openshift-samples                          4.5.0-0.okd-2020-10-03-012432   True        False         False      3h24m
operator-lifecycle-manager                 4.5.0-0.okd-2020-10-03-012432   True        False         False      3h43m
operator-lifecycle-manager-catalog         4.5.0-0.okd-2020-10-03-012432   True        False         False      3h43m
operator-lifecycle-manager-packageserver   4.5.0-0.okd-2020-10-03-012432   True        False         False      3h34m
service-ca                                 4.5.0-0.okd-2020-10-03-012432   True        False         False      3h44m
storage                                    4.5.0-0.okd-2020-10-03-012432   True        False         False      3h33m
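
Rather than rerunning the command by hand, the listing can be polled until nothing is left to wait for. This is a minimal sketch with hypothetical function names. It assumes the six-column layout shown above and treats any row with fewer columns (for example, no VERSION reported yet) as not ready.

```shell
# Read "oc get clusteroperators --no-headers" output on stdin and
# print the operators whose AVAILABLE column (third field) is not
# "True". Rows with fewer than six fields are also reported.
unavailable_operators() {
    awk 'NF > 0 && (NF < 6 || $3 != "True") { print $1 }'
}

# Poll every $1 seconds (default 30) until all operators are available.
wait_for_operators() {
    interval=${1:-30}
    while :; do
        if listing=$(oc get clusteroperators --no-headers) && [ -n "$listing" ]; then
            pending=$(printf '%s\n' "$listing" | unavailable_operators)
            [ -n "$pending" ] || return 0
            printf 'still waiting for:\n%s\n' "$pending"
        fi
        sleep "$interval"
    done
}
```

The loop only retries when the API is unreachable or returns nothing, so it can be started before the control plane is fully up.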

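4. Configure storage for the image-registry
When OKD 4 is deployed outside a public cloud, the image-registry has no ready-made storage to use. In a non-production environment, "emptyDir" can serve as temporary storage (images are lost when the registry restarts, so do not use this in production), which makes the in-cluster image registry usable. The command is: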

oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"storage":{"emptyDir":{}}}}'

All done:

aneirin@host-1:~$ ./openshift-install --dir=config-install wait-for install-complete
INFO Waiting up to 30m0s for the cluster at https://api.okd-infra.wumi.ai:6443 to initialize... 
INFO Waiting up to 10m0s for the openshift-console route to be created... 
INFO Install complete!                            
INFO To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/home/aneirin/okd4/config-install/auth/kubeconfig' 
INFO Access the OpenShift web-console here: https://console-openshift-console.apps.okd-infra.wumi.ai 
INFO Login to the console with user: "kubeadmin", and password: "CaEJY-myzAi-R7Wtj-XXXX" 
INFO Time elapsed: 1s

This article is only the first step of the long march of using OpenShift 4. Much work remains, such as monitoring, logging, and storage. Stay tuned!
