Linux HA Cluster Principles and Configuration - 01

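On CentOS 7, a basic high-availability cluster can be built from three components: corosync (cluster messaging and membership, i.e. the heartbeat layer), pacemaker (resource management), and pcs (the configuration tool). This article describes how these tools work and how to configure them.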

1. Lab environment

  • Virtualization software:

  • Lab virtual machines:

    • iscsi-disks: 192.168.56.20, provides shared storage over iSCSI; 1 CPU and 1 GB of memory by default.
    • ha-host1: 192.168.56.31; 1 CPU and 1 GB of memory by default.
    • ha-host2: 192.168.56.32; 1 CPU and 1 GB of memory by default.
    • ha-host3: 192.168.56.33; 1 CPU and 1 GB of memory by default.
  • Installation and management network: 192.168.56.0/24. This is a VirtualBox Host-Only network, which lets the physical host and the VirtualBox VMs reach each other.

2. Clone the projects and start the virtual machines

$ git clone https://github.com/lprincewhn/iscsi.git
$ cd iscsi
$ vagrant up iscsi-disks
$ cd ..
$ git clone https://github.com/lprincewhn/linuxha.git
$ cd linuxha
$ vagrant up

Once the virtual machines are up, you can log in with either of the following accounts (user/password):

  • root/vagrant
  • vagrant/vagrant

3. Create the pcs cluster

Step 1 Install the packages

On all three cluster hosts, install the corosync, pacemaker, and pcs packages, then start the pcsd service and enable it at boot.

# yum -y install corosync pacemaker pcs
# systemctl start pcsd && systemctl enable pcsd
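
Because the same commands have to run on every host, they can be wrapped in a loop. The following is a minimal sketch, not part of the original setup: it assumes root SSH access to the three hosts from the machine running it, and the function name `install_cluster_packages` is hypothetical.

```shell
# Hosts from the lab environment above.
NODES="ha-host1 ha-host2 ha-host3"

# Hypothetical helper: run the Step 1 commands on every cluster host over SSH.
# Assumes the caller can log in as root on each host.
install_cluster_packages() {
    for node in $NODES; do
        ssh "root@${node}" \
            'yum -y install corosync pacemaker pcs && systemctl start pcsd && systemctl enable pcsd'
    done
}
```

Calling `install_cluster_packages` runs the installation on one host after another and stops being useful as soon as a host is unreachable, so check the output of each host before moving on.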

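Step 2 Set the hacluster user's password

On all three cluster hosts, set the password of the hacluster user, using the same password on every host. This account is used only for communication between cluster hosts and cannot log in to the system, so the password is needed only once, when the hosts authenticate to each other.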

[root@ha-host1 ~]# passwd hacluster
Changing password for user hacluster.
New password:
BAD PASSWORD: The password is shorter than 8 characters
Retype new password:
passwd: all authentication tokens updated successfully.
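
For scripted setups the interactive prompt can be avoided. The sketch below relies on the `--stdin` option of `passwd`, which is available on Red Hat-family systems such as CentOS 7; the function name is hypothetical and the password is whatever the caller supplies.

```shell
# Hypothetical helper: set the hacluster password without a prompt.
# `passwd --stdin` reads the new password from standard input
# (Red Hat-family systems only). Must be run as root.
set_hacluster_password() {
    # $1: the new password
    printf '%s\n' "$1" | passwd --stdin hacluster
}
```

Run it with the same password on each of the three hosts. Keep in mind that a password typed as a command argument may end up in the shell history.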

Step 3 Authenticate the cluster hosts to each other

[root@ha-host1 ~]# pcs cluster auth ha-host1 ha-host2 ha-host3
Username: hacluster
Password:
ha-host1: Authorized
ha-host2: Authorized
ha-host3: Authorized
[root@ha-host1 ~]#
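
The username and password prompts can also be answered on the command line. This is a sketch that assumes the pcs 0.9 series shipped with CentOS 7, where `pcs cluster auth` accepts `-u` and `-p`; the function name is hypothetical.

```shell
# Hypothetical helper: authenticate the given nodes as hacluster without prompts.
# Assumes pcs 0.9.x (CentOS 7), where `pcs cluster auth` takes -u and -p.
auth_cluster_nodes() {
    # $1: hacluster password; remaining arguments: node names
    password="$1"
    shift
    pcs cluster auth "$@" -u hacluster -p "$password"
}
```

For example, `auth_cluster_nodes <password> ha-host1 ha-host2 ha-host3` performs the same authentication as the interactive session above. A password passed with `-p` is visible in the process list while the command runs.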

Step 4 Create the cluster

[root@ha-host1 ~]# pcs cluster setup --name linuxha ha-host1 ha-host2 ha-host3
Destroying cluster on nodes: ha-host1, ha-host2, ha-host3...
ha-host3: Stopping Cluster (pacemaker)...
ha-host1: Stopping Cluster (pacemaker)...
ha-host2: Stopping Cluster (pacemaker)...
ha-host2: Successfully destroyed cluster
ha-host1: Successfully destroyed cluster
ha-host3: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'ha-host1', 'ha-host2', 'ha-host3'
ha-host2: successful distribution of the file 'pacemaker_remote authkey'
ha-host1: successful distribution of the file 'pacemaker_remote authkey'
ha-host3: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
ha-host1: Succeeded
ha-host2: Succeeded
ha-host3: Succeeded

Synchronizing pcsd certificates on nodes ha-host1, ha-host2, ha-host3...
ha-host1: Success
ha-host2: Success
ha-host3: Success
Restarting pcsd on the nodes in order to reload the certificates...
ha-host1: Success
ha-host2: Success
ha-host3: Success
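
The "Sending cluster config files" step above distributes a corosync configuration to every node. To look at what was generated, something like the following can be used; it is a sketch assuming pcs 0.9.x, and the function name is hypothetical.

```shell
# Hypothetical helper: inspect the corosync side of the new cluster.
show_corosync_state() {
    pcs cluster corosync    # print the corosync.conf used by the local node
    pcs status corosync     # show cluster membership as seen by corosync
}
```

Membership information is only meaningful once the cluster has been started, which is the next step.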

Step 5 Start the cluster and enable it at boot

[root@ha-host1 ~]# pcs cluster start --all
ha-host1: Starting Cluster...
ha-host2: Starting Cluster...
ha-host3: Starting Cluster...
[root@ha-host1 ~]# pcs cluster enable --all
ha-host1: Cluster Enabled
ha-host2: Cluster Enabled
ha-host3: Cluster Enabled

Step 6 Check the cluster status

[root@ha-host1 ~]# pcs status
Cluster name: linuxha
WARNING: no stonith devices and stonith-enabled is not false
Stack: corosync
Current DC: ha-host3 (version 1.1.16-12.el7_4.8-94ff4df) - partition with quorum
Last updated: Mon Apr 23 03:39:50 2018
Last change: Mon Apr 23 03:38:08 2018 by hacluster via crmd on ha-host3

3 nodes configured
0 resources configured

Online: [ ha-host1 ha-host2 ha-host3 ]

No resources


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

The pcs status output shows that:

  1. The current designated coordinator (Current DC, where DC stands for Designated Co-ordinator) is ha-host3. This node issues the instructions that make the cluster nodes start or stop each resource according to its definition, which is stored in the CIB (Cluster Information Base).

  2. All three cluster hosts, ha-host1, ha-host2, and ha-host3, are online, but no resources are defined yet.

  3. There is a warning about stonith devices:

    WARNING: no stonith devices and stonith-enabled is not false

    Stonith devices are covered later. Because none has been created yet, set the stonith-enabled property to false for now. The warning disappears once stonith is disabled.

    [root@ha-host1 ~]# pcs property set stonith-enabled=false
    [root@ha-host1 ~]# pcs status
    Cluster name: linuxha
    Stack: corosync
    Current DC: ha-host3 (version 1.1.16-12.el7_4.8-94ff4df) - partition with quorum
    Last updated: Mon Apr 23 03:40:17 2018
    Last change: Mon Apr 23 03:40:15 2018 by root via cibadmin on ha-host1
    
    3 nodes configured
    0 resources configured
    
    Online: [ ha-host1 ha-host2 ha-host3 ]
    
    No resources
    
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
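
Besides watching for the warning in pcs status, the property and the overall configuration can be checked directly. This is a sketch assuming pcs 0.9.x and the `crm_verify` tool that ships with pacemaker; the function name is hypothetical.

```shell
# Hypothetical helper: confirm the stonith setting and validate the configuration.
check_stonith_setting() {
    pcs property show stonith-enabled   # print the current value of the property
    crm_verify -L -V                    # validate the live cluster configuration
}
```

Disabling stonith is acceptable for this lab only; it should be turned back on once a stonith device has been configured.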
    

4. Create a resource

Create the simplest kind of resource, a virtual IP:

[root@ha-host1 ~]# pcs resource create vip ocf:heartbeat:IPaddr2 ip=192.168.56.24 cidr_netmask=24 op monitor interval=30s

Check the cluster status:

[root@ha-host1 ~]# pcs status
Cluster name: linuxha
Stack: corosync
Current DC: ha-host3 (version 1.1.16-12.el7_4.8-94ff4df) - partition with quorum
Last updated: Mon Apr 23 03:41:46 2018
Last change: Mon Apr 23 03:41:40 2018 by root via cibadmin on ha-host1

3 nodes configured
1 resource configured

Online: [ ha-host1 ha-host2 ha-host3 ]

Full list of resources:

 vip    (ocf::heartbeat:IPaddr2):       Started ha-host1

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

The output shows the newly created resource vip, and the IP is currently assigned to ha-host1.

Inspecting the network interfaces of ha-host1 shows that the new IP 192.168.56.24 has been added to eth1:

[root@ha-host1 ~]# ip a
...
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:55:a1:19 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.21/24 brd 192.168.56.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet 192.168.56.24/24 brd 192.168.56.255 scope global secondary eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe55:a119/64 scope link
       valid_lft forever preferred_lft forever
[root@ha-host1 ~]# ssh 192.168.56.24
Last login: Wed Apr 18 05:38:11 2018 from 192.168.56.21
[root@ha-host1 ~]# exit
logout
Connection to 192.168.56.24 closed.
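
The same check can be scripted instead of reading the `ip a` output by eye. A minimal sketch, with a hypothetical function name, that tests whether the local node currently holds the virtual IP:

```shell
# The virtual IP created in the previous step.
VIP="192.168.56.24"

# Hypothetical helper: succeeds (exit status 0) when the local node
# has $VIP configured on any interface, and fails otherwise.
node_holds_vip() {
    # -o prints one address per line; -F makes grep treat the dots literally.
    ip -o -4 addr show | grep -qF "inet ${VIP}/"
}
```

For example, `node_holds_vip && echo "vip is here"` can be run on each host to find out where the address lives.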

5. Trigger a failover

Bring down the eth1 interface on ha-host1; the IP 192.168.56.24 then moves to eth1 on ha-host2.

[root@ha-host1 ~]# ifconfig eth1 down
[root@ha-host2 ~]# pcs status
Cluster name: linuxha
Stack: corosync
Current DC: ha-host3 (version 1.1.16-12.el7_4.8-94ff4df) - partition with quorum
Last updated: Mon Apr 23 03:43:36 2018
Last change: Mon Apr 23 03:41:39 2018 by root via cibadmin on ha-host1

3 nodes configured
1 resource configured

Online: [ ha-host2 ha-host3 ]
OFFLINE: [ ha-host1 ]

Full list of resources:

 vip    (ocf::heartbeat:IPaddr2):       Started ha-host2

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@ha-host2 ~]# ip a
...
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 08:00:27:bc:94:42 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.22/24 brd 192.168.56.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet 192.168.56.24/24 brd 192.168.56.255 scope global secondary eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:febc:9442/64 scope link
       valid_lft forever preferred_lft forever

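The pcs status output above shows that ha-host1 has gone OFFLINE. This is because eth1 is also the interface ha-host1 uses for cluster communication. In a real deployment, a virtual IP resource usually carries service traffic and should sit on a network separate from the one used for cluster communication.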
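
A failover can also be triggered without cutting the cluster network, by putting a node into standby so that it stays in the cluster but may not run resources. This is a sketch assuming pcs 0.9.x, where the commands are `pcs cluster standby` and `pcs cluster unstandby`; the function names are hypothetical.

```shell
# Hypothetical helper: move all resources off a node while keeping it in the cluster.
drain_node() {
    # $1: node name, e.g. ha-host1
    pcs cluster standby "$1"
    pcs status resources       # check where the resources are running now
}

# Hypothetical helper: allow the node to host resources again.
restore_node() {
    # $1: node name
    pcs cluster unstandby "$1"
}
```

After `drain_node ha-host1` the node remains online in pcs status, unlike in the interface test above. Whether vip moves back after `restore_node ha-host1` depends on the cluster's placement settings.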

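6. Resource definitions in Pacemaker

A resource type in Pacemaker is identified by a standard, a provider (used only when the standard is ocf), and an agent, in the following format:
<standard>:[<provider>:]<agent>

The pcs resource list command lists the resource types that are currently supported: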

[root@ha-host1 ~]# pcs resource list
...
ocf:heartbeat:iface-vlan - Manages VLAN network interfaces.
ocf:heartbeat:IPaddr - Manages virtual IPv4 and IPv6 addresses (Linux specific
                       version)
ocf:heartbeat:IPaddr2 - Manages virtual IPv4 and IPv6 addresses (Linux specific
                        version)
ocf:heartbeat:IPsrcaddr - Manages the preferred source address for outgoing IP
                          packets
ocf:heartbeat:iSCSILogicalUnit - Manages iSCSI Logical Units (LUs)
ocf:heartbeat:iSCSITarget - iSCSI target export agent
ocf:heartbeat:LVM - Controls the availability of an LVM Volume Group
ocf:heartbeat:MailTo - Notifies recipients by email in the event of resource
                       takeover
ocf:heartbeat:mysql - Manages a MySQL database instance
ocf:heartbeat:nagios - Nagios resource agent
...

The output above includes ocf:heartbeat:IPaddr2, the type of the vip resource created earlier.
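
To make the naming scheme concrete, a type string such as ocf:heartbeat:IPaddr2 can be split into its parts with plain shell parameter expansion. This is an illustrative sketch with a hypothetical function name; it follows the rule that only the ocf standard carries a provider.

```shell
# Hypothetical helper: split a resource type into standard, provider, and agent.
parse_resource_type() {
    # $1: a resource type such as ocf:heartbeat:IPaddr2 or systemd:httpd
    standard="${1%%:*}"        # text before the first colon
    rest="${1#*:}"             # text after the first colon
    if [ "$standard" = "ocf" ]; then
        provider="${rest%%:*}" # ocf types carry a provider before the agent
        agent="${rest#*:}"
    else
        provider=""            # other standards have no provider field
        agent="$rest"
    fi
    echo "standard=$standard provider=$provider agent=$agent"
}
```

`parse_resource_type ocf:heartbeat:IPaddr2` prints `standard=ocf provider=heartbeat agent=IPaddr2`, while a non-ocf type such as `systemd:httpd` yields an empty provider.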

Next, use the pcs resource describe command to see which parameters this resource type accepts:

[root@ha-host1 ~]# pcs resource describe ocf:heartbeat:IPaddr2  
ocf:heartbeat:IPaddr2 - Manages virtual IPv4 and IPv6 addresses (Linux specific
                        version)

This Linux-specific resource manages IP alias IP addresses.
It can add an IP alias, or remove one.
In addition, it can implement Cluster Alias IP functionality
if invoked as a clone resource.

If used as a clone, you should explicitly set clone-node-max >= 2,
and/or clone-max < number of nodes. In case of node failure,
clone instances need to be re-allocated on surviving nodes.
This would not be possible if there is already an instance on those nodes,
and clone-node-max=1 (which is the default).

Resource options:
  ip (required): The IPv4 (dotted quad notation) or IPv6 address (colon
                 hexadecimal notation) example IPv4 "192.168.1.1". example IPv6
                 "2001:db8:DC28:0:0:FC57:D4C8:1FFF".
  nic: The base network interface on which the IP address will be brought
       online. If left empty, the script will try and determine this from the
       routing table. Do NOT specify an alias interface in the form eth0:1 or
       anything here; rather, specify the base interface only. If you want a
       label, see the iflabel parameter. Prerequisite: There must be at least
       one static IP address, which is not managed by the cluster, assigned to
       the network interface. If you can not assign any static IP address on the
       interface, modify this kernel parameter: sysctl -w
       net.ipv4.conf.all.promote_secondaries=1 # (or per device)
  cidr_netmask: The netmask for the interface in CIDR format (e.g., 24 and not
                255.255.255.0) If unspecified, the script will also try to
                determine this from the routing table.
  broadcast: Broadcast address associated with the IP. If left empty, the script
             will determine this from the netmask.
  iflabel: You can specify an additional label for your IP address here. This
           label is appended to your interface name. A label can be specified in
           nic parameter but it is deprecated. If a label is specified in nic
           name, this parameter has no effect.
  lvs_support: Enable support for LVS Direct Routing configurations. In case a
               IP address is stopped, only move it to the loopback device to
               allow the local node to continue to service requests, but no
               longer advertise it on the network. Notes for IPv6: It is not
               necessary to enable this option on IPv6. Instead, enable
               'lvs_ipv6_addrlabel' option for LVS-DR usage on IPv6.
  lvs_ipv6_addrlabel: Enable adding IPv6 address label so IPv6 traffic
                      originating from the address's interface does not use this
                      address as the source. This is necessary for LVS-DR health
                      checks to realservers to work. Without it, the most
                      recently added IPv6 address (probably the address added by
                      IPaddr2) will be used as the source address for IPv6
                      traffic from that interface and since that address exists
                      on loopback on the realservers, the realserver response to
                      pings/connections will never leave its loopback. See
                      RFC3484 for the detail of the source address selection.
                      See also 'lvs_ipv6_addrlabel_value' parameter.
  lvs_ipv6_addrlabel_value: Specify IPv6 address label value used when
                            'lvs_ipv6_addrlabel' is enabled. The value should be
                            an unused label in the policy table which is shown
                            by 'ip addrlabel list' command. You would rarely
                            need to change this parameter.
  mac: Set the interface MAC address explicitly. Currently only used in case of
       the Cluster IP Alias. Leave empty to chose automatically.
  clusterip_hash: Specify the hashing algorithm used for the Cluster IP
                  functionality.
  unique_clone_address: If true, add the clone ID to the supplied value of IP to
                        create a unique address to manage
  arp_interval: Specify the interval between unsolicited ARP packets in
                milliseconds.
  arp_count: Number of unsolicited ARP packets to send at resource
             initialization.
  arp_count_refresh: Number of unsolicited ARP packets to send during resource
                     monitoring. Doing so helps mitigate issues of stuck ARP
                     caches resulting from split-brain situations.
  arp_bg: Whether or not to send the ARP packets in the background.
  arp_mac: MAC address to send the ARP packets to. You really shouldn't be
           touching this.
  arp_sender: The program to send ARP packets with on start. For infiniband
              interfaces, default is ipoibarping. If ipoibarping is not
              available, set this to send_arp.
  flush_routes: Flush the routing table on stop. This is for applications which
                use the cluster IP address and which run on the same physical
                host that the IP address lives on. The Linux kernel may force
                that application to take a shortcut to the local loopback
                interface, instead of the interface the address is really bound
                to. Under those circumstances, an application may, somewhat
                unexpectedly, continue to use connections for some time even
                after the IP address is deconfigured. Set this parameter in
                order to immediately disable said shortcut when the IP address
                goes away.
  run_arping: Whether or not to run arping for IPv4 collision detection check.
  preferred_lft: For IPv6, set the preferred lifetime of the IP address. This
                 can be used to ensure that the created IP address will not be
                 used as a source address for routing. Expects a value as
                 specified in section 5.5.4 of RFC 4862.

Default operations:
  start: interval=0s timeout=20s
  stop: interval=0s timeout=20s
  monitor: interval=10s timeout=20s

The resource parameters (ip, cidr_netmask) and the operation parameter (the interval of monitor) used earlier to create the vip resource can all be found in the output above.
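
Other parameters from the list can be set on the existing resource in the same key=value form. The sketch below assumes pcs 0.9.x and the vip resource created earlier; it sets the `nic` parameter described above, and the function name is hypothetical.

```shell
# Hypothetical helper: bind the vip resource to an explicit base interface.
pin_vip_to_interface() {
    # $1: base interface name, e.g. eth1 (not an alias such as eth0:1)
    pcs resource update vip nic="$1"
    pcs resource show vip      # print the resource's parameters and operations
}
```

For example, `pin_vip_to_interface eth1` makes the interface choice explicit instead of leaving it to the routing table lookup.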
