Configuring a Two-Node Chassis Cluster on Junos Firewalls

Concepts

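  • Chassis cluster: cluster-id (one chassis cluster can contain many redundancy groups)
  • Node: node ID
  • Redundancy group
    • Three factors decide which node is primary for a redundancy group: the priority configured for each node, the node ID (when priorities tie, the node with the lower ID, node 0, wins), and the order in which the nodes come up. If the lower-priority node comes up first, it becomes primary for the redundancy group, and it stays primary unless preempt is enabled.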
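The three factors above can be sketched as a small model. This is an illustrative sketch only, not Junos's actual election code: the `Node` class, the `elect_primary` function, and the idea of grouping nodes into "waves" that come up together are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: int   # 0 or 1
    priority: int  # priority configured for this redundancy group


def elect_primary(up_order, preempt=False):
    """Return the node_id that ends up primary.

    up_order is a list of "waves"; nodes in the same wave come up together
    (hypothetical modelling choice for this sketch).
    """
    primary = None
    for wave in up_order:
        # Within a wave: higher priority wins, lower node ID breaks a tie.
        best = min(wave, key=lambda n: (-n.priority, n.node_id))
        if primary is None:
            # The first node(s) to come up supply the primary.
            primary = best
        elif preempt and best.priority > primary.priority:
            # A later, higher-priority node takes over only with preempt.
            primary = best
    return primary.node_id


low = Node(node_id=0, priority=50)
high = Node(node_id=1, priority=200)
print(elect_primary([[low], [high]]))                # low came up first and stays primary
print(elect_primary([[low], [high]], preempt=True))  # high preempts
```

With equal priorities and both nodes coming up together, the sketch returns node 0, matching the tie-break rule described above.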

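Summary

Once a device joins a cluster, it becomes a node of that cluster. Apart from the node-specific settings and the management IP addresses, nodes in the same cluster share the same configuration.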

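Chassis Cluster Overview

Control plane

  • Synchronizes configuration and kernel state between the nodes
  • The nodes are connected through their control ports (check which port serves as the control port)
  • Runs in active/passive mode: the two nodes back each other up, one as primary and one as secondary; if the primary fails, the secondary takes over traffic processing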

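Data plane

  • The nodes are connected through their fabric ports to form a single, unified data plane (check which port serves as the fabric port)
  • Synchronizes session information for the traffic flowing through each node, so that established sessions are not dropped on failover
  • The data plane software runs in active/active mode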

Cluster node states

  • hold
  • primary
  • secondary-hold
  • secondary
  • ineligible
  • disabled

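Things to check before configuring a chassis cluster

  • Hardware and software must match on both nodes
  • Set a root-authentication password on each node first, and use the same password on both
  • The management and control ports must carry no configuration; otherwise clustering can fail because the port is in use and the nodes cannot communicate. If you are not sure which ports these are, restore the factory defaults first (in configuration mode: load factory-default), then run ( run show configuration | display set | match interface ) to check whether any interface statements remain, and remove any that do with the delete command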
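The cleanup in the last item can be scripted offline. The sketch below is a hypothetical helper (the name `interface_delete_commands`, its `ports` parameter, and the sample configuration with its addresses are all made up for illustration). It reads text saved from `show configuration | display set` and prints one `delete interfaces <name>` command per configured interface. It only looks at the `set interfaces ...` hierarchy, so lines such as security-zone interface references still need a manual check, and the generated commands should be reviewed before being pasted into a device.

```python
def interface_delete_commands(display_set_output, ports=None):
    """Build 'delete interfaces <name>' commands from display-set text.

    ports: optional collection of interface names to restrict the result to.
    """
    names = []
    for line in display_set_output.splitlines():
        words = line.split()
        # Only 'set interfaces <name> ...' lines are considered.
        if len(words) >= 3 and words[0] == "set" and words[1] == "interfaces":
            name = words[2]
            if ports is not None and name not in ports:
                continue
            if name not in names:
                names.append(name)
    return [f"delete interfaces {name}" for name in names]


# Hypothetical sample configuration, for illustration only.
sample_config = """\
set system host-name srx
set interfaces ge-0/0/0 unit 0 family inet address 192.168.1.1/24
set interfaces ge-0/0/1 unit 0 family inet address 10.0.0.1/24
set interfaces ge-0/0/0 description mgmt
set security zones security-zone trust interfaces ge-0/0/1.0
"""

for command in interface_delete_commands(sample_config):
    print(command)
```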

The following error is caused by a missing root password:

root# commit 
[edit]
  'system'
    Missing mandatory statement: 'root-authentication'
error: commit failed: (missing statements)
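The message says that root authentication is not configured: at the first login to Junos the root password is empty. Set a root password, then commit again: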

[edit]
root# set system root-authentication plain-text-password   
New password:
Retype new password:

[edit]
root# commit 
commit complete

[edit]

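Configuration steps

  • First go through "Things to check before configuring a chassis cluster" above and make sure every item is satisfied

1. Set root-authentication

There is no need to set a host name at this step.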

  • srx-a
root# set system root-authentication plain-text-password   
New password:
Retype new password:

[edit]
root# commit 
commit complete

  • srx-b
root# set system root-authentication plain-text-password 
New password:
Retype new password:

[edit]
root# commit 
commit complete

2. Set up the chassis cluster

Run the following commands in operational (CLI) mode on each node. Note that the node value differs between the two.

srx-a

root> set chassis cluster cluster-id 1 node 0 reboot 

srx-b

root> set chassis cluster cluster-id 1 node 1 reboot 

Verifying the cluster

  • show chassis cluster status

Abnormal case

If the control port is in use, or the nodes cannot synchronize for some other reason, you will see the following.

  • Node srx-a
root> show chassis cluster status   
Monitor Failure codes:
    CS  Cold Sync monitoring        FL  Fabric Connection monitoring
    GR  GRES monitoring             HW  Hardware monitoring
    IF  Interface monitoring        IP  IP monitoring
    LB  Loopback monitoring         MB  Mbuf monitoring
    NH  Nexthop monitoring          NP  NPC monitoring              
    SP  SPU monitoring              SM  Schedule monitoring
    CF  Config Sync monitoring
 
Cluster ID: 1
Node   Priority Status         Preempt Manual   Monitor-failures

Redundancy group: 0 , Failover count: 1
node0  1        primary        no      no       None           
node1  0        lost           n/a     n/a      n/a            

{primary:node0}
  • Node srx-b likewise cannot see its peer's state
root> show chassis cluster status 
Monitor Failure codes:
    CS  Cold Sync monitoring        FL  Fabric Connection monitoring
    GR  GRES monitoring             HW  Hardware monitoring
    IF  Interface monitoring        IP  IP monitoring
    LB  Loopback monitoring         MB  Mbuf monitoring
    NH  Nexthop monitoring          NP  NPC monitoring              
    SP  SPU monitoring              SM  Schedule monitoring
    CF  Config Sync monitoring
 
Cluster ID: 1
Node   Priority Status         Preempt Manual   Monitor-failures

Redundancy group: 0 , Failover count: 1
node0  0        lost           n/a     n/a      n/a            
node1  1        primary        no      no       None           

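Look at the status of both nodes. A "lost" status like this usually means the control port is in use, so the two nodes cannot synchronize; deleting the configuration on the control port resolves it.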
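Spotting a "lost" peer can also be done from captured output. The following is a minimal sketch, assuming the text layout shown in the transcripts above; the function names `parse_cluster_status` and `lost_nodes` are hypothetical, and other Junos releases may format the output differently.

```python
def parse_cluster_status(output):
    """Return {redundancy_group: {node_name: status}} from captured text."""
    groups = {}
    current = None
    for line in output.splitlines():
        words = line.split()
        if line.startswith("Redundancy group:"):
            # e.g. "Redundancy group: 0 , Failover count: 1"
            current = int(words[2].rstrip(","))
            groups[current] = {}
        elif (current is not None and len(words) >= 3
              and words[0].startswith("node") and words[0][4:].isdigit()):
            # e.g. "node1  0  lost  n/a  n/a  n/a" -> third column is the status
            groups[current][words[0]] = words[2]
    return groups


def lost_nodes(groups):
    """Return the sorted names of nodes reported as 'lost' in any group."""
    return sorted({node
                   for nodes in groups.values()
                   for node, status in nodes.items()
                   if status == "lost"})


# Output copied from the srx-a transcript above.
sample_status = """\
Cluster ID: 1
Node   Priority Status         Preempt Manual   Monitor-failures

Redundancy group: 0 , Failover count: 1
node0  1        primary        no      no       None
node1  0        lost           n/a     n/a      n/a

{primary:node0}
"""

print(lost_nodes(parse_cluster_status(sample_status)))
```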

Normal case

  • Node node0
root> show chassis cluster status 
Monitor Failure codes:
    CS  Cold Sync monitoring        FL  Fabric Connection monitoring
    GR  GRES monitoring             HW  Hardware monitoring
    IF  Interface monitoring        IP  IP monitoring
    LB  Loopback monitoring         MB  Mbuf monitoring
    NH  Nexthop monitoring          NP  NPC monitoring              
    SP  SPU monitoring              SM  Schedule monitoring
    CF  Config Sync monitoring
 
Cluster ID: 1
Node   Priority Status         Preempt Manual   Monitor-failures

Redundancy group: 0 , Failover count: 1
node0  1        primary        no      no       None           
node1  1        secondary      no      no       None           

{primary:node0}

  • Node node1
root@srx-b> show chassis cluster status        
Monitor Failure codes:
    CS  Cold Sync monitoring        FL  Fabric Connection monitoring
    GR  GRES monitoring             HW  Hardware monitoring
    IF  Interface monitoring        IP  IP monitoring
    LB  Loopback monitoring         MB  Mbuf monitoring
    NH  Nexthop monitoring          NP  NPC monitoring              
    SP  SPU monitoring              SM  Schedule monitoring
    CF  Config Sync monitoring
 
Cluster ID: 1
Node   Priority Status         Preempt Manual   Monitor-failures

Redundancy group: 0 , Failover count: 0
node0  1        primary        no      no       None           
node1  1        secondary      no      no       None           

{secondary:node1}

Testing failover

  • Reboot the primary; the backup takes over immediately


  • After node0 finishes rebooting, it does not automatically become primary again; it goes to "hold" and then to "secondary"

root@srx-b> show chassis cluster status    
Monitor Failure codes:
    CS  Cold Sync monitoring        FL  Fabric Connection monitoring
    GR  GRES monitoring             HW  Hardware monitoring
    IF  Interface monitoring        IP  IP monitoring
    LB  Loopback monitoring         MB  Mbuf monitoring
    NH  Nexthop monitoring          NP  NPC monitoring              
    SP  SPU monitoring              SM  Schedule monitoring
    CF  Config Sync monitoring
 
Cluster ID: 1
Node   Priority Status         Preempt Manual   Monitor-failures

Redundancy group: 0 , Failover count: 1
node0  1        hold           no      no       None           
node1  1        primary        no      no       None 

root@srx-b> show chassis cluster status    
Monitor Failure codes:
    CS  Cold Sync monitoring        FL  Fabric Connection monitoring
    GR  GRES monitoring             HW  Hardware monitoring
    IF  Interface monitoring        IP  IP monitoring
    LB  Loopback monitoring         MB  Mbuf monitoring
    NH  Nexthop monitoring          NP  NPC monitoring              
    SP  SPU monitoring              SM  Schedule monitoring
    CF  Config Sync monitoring
 
Cluster ID: 1
Node   Priority Status         Preempt Manual   Monitor-failures

Redundancy group: 0 , Failover count: 1
node0  1        secondary      no      no       None           
node1  1        primary        no      no       None 

Manual failover

  • In CLI mode, the request chassis cluster failover command needs both the redundancy-group to fail over and the node that should become primary
root> request chassis cluster failover ?  
Possible completions:
  node                 Node identifier of the new primary (0..1)
  redundancy-group     Redundancy-group identifier (0..63)
  reset                Undo the previous failover command

root> request chassis cluster failover node 0 redundancy-group 0  
node0:
--------------------------------------------------------------------------
Initiated manual failover for redundancy group 0

{secondary:node0}
root> show chassis cluster status  
Monitor Failure codes:
    CS  Cold Sync monitoring        FL  Fabric Connection monitoring
    GR  GRES monitoring             HW  Hardware monitoring
    IF  Interface monitoring        IP  IP monitoring
    LB  Loopback monitoring         MB  Mbuf monitoring
    NH  Nexthop monitoring          NP  NPC monitoring              
    SP  SPU monitoring              SM  Schedule monitoring
    CF  Config Sync monitoring
 
Cluster ID: 1
Node   Priority Status         Preempt Manual   Monitor-failures

Redundancy group: 0 , Failover count: 1
node0  255      primary        no      yes      None           
node1  1        secondary-hold no      yes      None           

Note that node0's priority is now 255. The command below resets it to 1 (in general, after a failover, reset restores the originally configured primary and secondary priorities).

root> request chassis cluster failover reset redundancy-group 0 
node0:
--------------------------------------------------------------------------
Successfully reset manual failover for redundancy group 0

node1:
--------------------------------------------------------------------------
No reset required for redundancy group 0.

{primary:node0}
root> show chassis cluster status  
Monitor Failure codes:
    CS  Cold Sync monitoring        FL  Fabric Connection monitoring
    GR  GRES monitoring             HW  Hardware monitoring
    IF  Interface monitoring        IP  IP monitoring
    LB  Loopback monitoring         MB  Mbuf monitoring
    NH  Nexthop monitoring          NP  NPC monitoring              
    SP  SPU monitoring              SM  Schedule monitoring
    CF  Config Sync monitoring
 
Cluster ID: 1
Node   Priority Status         Preempt Manual   Monitor-failures

Redundancy group: 0 , Failover count: 1
node0  1        primary        no      no       None           
node1  1        secondary      no      no       None           

{primary:node0}

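How to remove the cluster configuration

There are two ways, both run in operational (CLI) mode.

  • set chassis cluster disable reboot: disables clustering directly
  • set chassis cluster cluster-id 0 node 1 reboot: a cluster-id of 0 also disables clustering; it takes effect after the reboot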

Problems encountered during configuration

Problem 1

root@srx-a# set groups node0 system host-name srx-A  

{primary:node0}[edit]
root@srx-a# set groups node1 system host-name srx-B 

{primary:node0}[edit]
root@srx-a# commit 
[edit interfaces]
  'ge-0/0/0'
     HA management port cannot be configured
error: configuration check-out failed

{primary:node0}[edit]

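Solution:
When clustering is enabled, ge-0/0/0 becomes fxp0 (the management interface) and ge-0/0/1 becomes fxp1 (the control link), so ge-0/0/0 can no longer carry interface configuration. Delete the configuration on ge-0/0/0 and commit again.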

https://kb.juniper.net/InfoCenter/index?page=content&id=KB15356

Problem 2

root@srx-a# commit 
[edit security zones security-zone untrust]
  'interfaces ge-0/0/0.0'
    Interface ge-0/0/0.0 must be configured under interfaces
error: configuration check-out failed

The cause is that interface ge-0/0/0.0 does not exist. Check the output of show interfaces to see whether ge-0/0/0 unit 0 is configured there.
