NameNode HA異常調(diào)查

【起因】22:31:04分被通知集群異常,無法正常處理調(diào)度任務(wù);遠(yuǎn)程登錄查看,發(fā)現(xiàn)HDFS NameNode發(fā)生腦裂;

NameNode節(jié)點:name21.hadoop,name22.hadoop
hadoop版本:2.7.1

【事故報告】

1. 【時間】=> 22:31:04
【HA狀態(tài)】=> name21.hadoop:Active, name22.hadoop:Standyby
【日志信息】=> name21.hadoop:/var/log/hadoop/hdfs/gc.log-xxxxxxxxxx

此時,Active NameNode觸發(fā)一次異常JVM GC(正常情況下GC耗時在2s以下),此次耗時32.46s;

2018-02-11T22:31:04.707+0800: 801030.875: [GC2018-02-11T22:31:04.707+0800: 801030.875: [ParNew (promotion failed): 1550355K->1544189K(1710528K), 0.8542820 secs]
2018-02-11T22:31:05.562+0800: 801031.729: [CMS: 12100861K->3636381K(13303808K), 31.6119450 secs] 13650016K->3636381K(15014336K), [CMS Perm : 55251K->55008K(131072K)], 32.4666280 secs] [Times: user=33.04 sys=0.00, **real=32.46 secs**]

2.【時間】=> 22:31:37
【HA狀態(tài)】=> name21.hadoop:Active, name22.hadoop:Standyby
【日志信息】=> name21.hadoop:/var/log/hadoop/hdfs/hadoop-hdfs-namenode-name21.hadoop.log

此時,由于之前GC時間過長,導(dǎo)致active NameNode向JournalNodes進(jìn)行數(shù)據(jù)通信時,超時導(dǎo)致無法響應(yīng)(超時時間默認(rèn)值為20s,GC耗時32s);

2018-02-11 22:31:37,201 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(398)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [xxx.xx.xx.xx:8485, xxx.xx.xx.xx:8485, xxx.xx.xx.xx:8485], stream=QuorumOutputStream starting at txid 3108262584)) java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond. 
2018-02-11 22:31:37,208 WARN util.JvmPauseMonitor (JvmPauseMonitor.java:run(191)) - Detected pause in JVM or host machine (eg GC): pause of approximately 32158ms GC pool 'ParNew' had collection(s): count=1 time=855ms GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=31612ms 
2018-02-11 22:31:37,211 INFO namenode.FSNamesystem (FSNamesystem.java:rollEditLog(5937)) - Roll Edit Log from 172.21.54.22 
2018-02-11 22:31:37,211 INFO namenode.FSEditLog (FSEditLog.java:rollEditLog(1202)) - Rolling edit logs 2018-02-11 22:31:37,211 INFO namenode.FSEditLog (FSEditLog.java:endCurrentLogSegment(1258)) - Ending log segment 3108262584 
2018-02-11 22:31:37,228 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1

至此,Active Namenode節(jié)點進(jìn)入異常狀態(tài)(進(jìn)程無法結(jié)束,程序所起8021端口不存在);

3.【時間】=> 22:32:22
【HA狀態(tài)】=> name21.hadoop:Active, name22.hadoop:Standyby
【日志信息】=> name21.hadoop:/var/log/hadoop/hdfs/hadoop-hdfs-zkfc-name21.hadoop.log

此時, name21.hadoop上的ZKFailoverController無法獲取到本機(jī)Namenode的狀態(tài)[無法將Active狀態(tài)修改為Standby],此后每隔45s反復(fù)重試;同時,與zookeeper的session連接關(guān)閉;

2018-02-11 22:32:22,267 INFO ha.HealthMonitor (HealthMonitor.java:enterState(249)) - Entering state SERVICE_NOT_RESPONDING 
2018-02-11 22:32:22,267 INFO ha.ZKFailoverController (ZKFailoverController.java:setLastHealthState(852)) - Local service NameNode at name21.hadoop/xxx.xx.xx.xx:8021 entered state: SERVICE_NOT_RESPONDING 
2018-02-11 22:32:22,267 INFO ha.ZKFailoverController (ZKFailoverController.java:recheckElectability(768)) - Quitting master election for NameNode at name21.hadoop2/ xxx.xx.xx.xx:8021 and marking that fencing is necessary 
2018-02-11 22:32:22,267 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:quitElection(401)) - Yielding from election 
2018-02-11 22:32:22,268 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:terminateConnection(830)) - Terminating ZK connection for elector id=660690290 appData=0a06636261734e4112036e6e311a0e6e616d6532312e6861646f6f703220d53e28d33e cb=Elector callbacks for NameNode at name21.hadoop2/172.21.54.21:8021 
2018-02-11 22:32:22,270 INFO zookeeper.ZooKeeper (ZooKeeper.java:close(684)) - Session: 0x15f5451238a08b3 closed 
2018-02-11 22:32:22,271 WARN ha.ActiveStandbyElector (ActiveStandbyElector.java:isStaleClient(1066)) - Ignoring stale result from old client with sessionId 0x15f5451238a08b3 
2018-02-11 22:32:22,271 INFO zookeeper.ClientCnxn (ClientCnxn.java:run(524)) - EventThread shut down  

4.【時間】=> 22:32:22
【HA狀態(tài)】=> name21.hadoop:Active, name22.hadoop:Standyby
【日志信息】=> name22.hadoop:/var/log/hadoop/hdfs/hadoop-hdfs-zkfc-name22.hadoop.log

與此同時, name22.hadoop的namenode與zookeeper建立session連接, ZKFailoverController 嘗試連接name21.hadoop上的Namenode,將狀態(tài)從Active修改為Standby, 報錯連接超時;

2018-02-11 22:32:22,286 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:fenceOldActive(931)) - Checking for any old active which needs to be fenced... 
2018-02-11 22:32:22,290 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:fenceOldActive(952)) - Old node exists: 0a06636261734e4112036e6e311a0e6e616d6532312e6861646f6f703220d53e28d33e
2018-02-11 22:32:22,294 INFO ha.ZKFailoverController (ZKFailoverController.java:doFence(513)) - Should fence: NameNode at name21.hadoop/ xxx.xx.xx.xx:8021 
2018-02-11 22:32:27,304 WARN ha.FailoverController (FailoverController.java:tryGracefulFence(178)) - Unable to gracefully make NameNode at name21.hadoop/ xxx.xx.xx.xx:8021 standby (unable to connect) java.net.SocketTimeoutException: Call From name22.hadoop/ xxx.xx.xx.xx to name21.hadoop:8021 failed on socket timeout exception: java.net.SocketTimeoutException: 5000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/ xxx.xx.xx.xx :56511remote=name21.hadoop/ xxx.xx.xx.xx :8021];  
......
Caused by: **java.net.SocketTimeoutException: 5000 millis timeout** while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/ xxx.xx.xx.xx :56511 remote=name21.hadoop/ xxx.xx.xx.xx :8021] 
......

啟動Fence服務(wù)(集群并未設(shè)置,直接返回ture),并將name22.hadoop Namenode的狀態(tài)從Standby變?yōu)锳ctive;

2018-02-11 22:32:27,323 INFO ha.NodeFencer (NodeFencer.java:fence(91)) - ====== Beginning Service Fencing Process... ====== 
2018-02-11 22:32:27,323 INFO ha.NodeFencer (NodeFencer.java:fence(94)) - Trying method 1/1: org.apache.hadoop.ha.ShellCommandFencer(/bin/true) 
2018-02-11 22:32:27,339 INFO ha.ShellCommandFencer (ShellCommandFencer.java:tryFence(99)) - Launched fencing command '/bin/true' with pid 48839 
2018-02-11 22:32:27,359 INFO ha.NodeFencer (NodeFencer.java:fence(98)) - ====== Fencing successful by method org.apache.hadoop.ha.ShellCommandFencer(/bin/true) ====== 
2018-02-11 22:32:27,360 INFO ha.ActiveStandbyElector (ActiveStandbyElector.java:writeBreadCrumbNode(878)) - Writing znode /hadoop-ha/cbasNA/ActiveBreadCrumb to indicate that the local node is the most recent active... 
2018-02-11 22:32:27,365 INFO ha.ZKFailoverController (ZKFailoverController.java:becomeActive(380)) - Trying to make NameNode at name22.hadoop/ xxx.xx.xx.xx :8021 active... 
2018-02-11 22:32:46,337 INFO ha.ZKFailoverController (ZKFailoverController.java:becomeActive(387)) - Successfully transitioned NameNode at name22.hadoop/ xxx.xx.xx.xx :8021 to active state

5.【時間】=> 22:32:46
【HA狀態(tài)】=> name21.hadoop:Active, name22.hadoop:Active
至此,集群的NameNode出現(xiàn)雙Active狀態(tài),發(fā)生腦裂;

6.【恢復(fù)措施】
后面運(yùn)維介入時,嘗試重啟namenode,發(fā)現(xiàn)一直卡在35%進(jìn)度,無法在執(zhí)行下去;

原因:運(yùn)維介入時,首先嘗試重啟name22.hadoop上的namenode,此時重啟程序檢測到雙active,程序會反復(fù)獲取狀態(tài),直到active只有一個;

NameNode HA states: active_namenodes = [('nn1', 'name21.hadoop:50070'), ('nn2', 'name22.hadoop:50070')], standby_namenodes = [], unknown_namenodes = []

此時出現(xiàn)的情況是:
name21.hadoop的namenode異常(8021端口連接超時),無法將自身狀態(tài)從active修改為standby;
name22.hadoop的namenode已經(jīng)被ZKFailoverController選舉為active,選舉理由也是name21.hadoop的8021端口連接異常;

因此,需要重啟name21.hadoop上的namenode(恢復(fù)8021端口),重啟時會檢測name22.hadoop的namenode狀態(tài)(檢測name22.hadoop的8021端口),將自身從active切換為standby;name22.hadoop無需重啟;

7.【預(yù)防措施】
a. 優(yōu)化GC參數(shù),添加namenode的GC監(jiān)控,找出GC異常的原因并修復(fù);
b. 提高journalNode的寫入超時時間,目前默認(rèn)值是20s,對于生產(chǎn)環(huán)境有必要提高到60s;
c. 啟動Fence服務(wù);
d. 集群namenode節(jié)點的狀態(tài)監(jiān)控;

博客主頁:http://www.itdecent.cn/u/e97bb429f278

最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時請結(jié)合常識與多方信息審慎甄別。
平臺聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點,簡書系信息發(fā)布平臺,僅提供信息存儲服務(wù)。

相關(guān)閱讀更多精彩內(nèi)容

  • 第七章家教困惑 166. 孩子成才還是需要天分的吧? 如果我們認(rèn)為佳佳沒有天分,不讀經(jīng)會怎么樣?有天分,人家就讀得...
    俊媽李利閱讀 452評論 0 1
  • 致良知線上學(xué)習(xí)每日功課 姓名:吳蕙伶 編號:3組 日期:2017年3月24日 今日學(xué)習(xí)課文:《教條示龍場諸生》 課...
    BoBoShi閱讀 1,005評論 0 2
  • chapter 1 我從遠(yuǎn)方歸來,一路狂奔,帶著滿身異鄉(xiāng)的塵埃,為趕赴一場花的約會。 車程不遠(yuǎn)不近,隨身攜帶的書讓...
    跟你要范閱讀 314評論 0 0
  • 親愛的,如果你對自己的現(xiàn)狀不滿意,想改變成自己理想中的樣子,可以從以下這三件事情開始 1換一個新的造型 你理想中的...
    融冰的美食日記閱讀 654評論 0 2
  • 大火 燒吧 放肆的燒吧 就算你把月亮熔成太陽 就算你把夜晚化成白天 就算你把海洋蒸成干地 就算你把沙石熱成塵埃 就...
    同行之人閱讀 497評論 2 5

友情鏈接更多精彩內(nèi)容