Suppose we have three virtual machines with the hostnames master, slave1, and slave2.
The Hadoop HA cluster is deployed across these three machines as follows:
| master | slave1 | slave2 |
|---|---|---|
| Zookeeper | Zookeeper | Zookeeper |
| JournalNode | JournalNode | JournalNode |
| NodeManager | NodeManager | NodeManager |
| DataNode | DataNode | DataNode |
| NameNode | NameNode | none |
| zkfc | zkfc | none |
| ResourceManager | none | ResourceManager |
The cluster is started as follows:
- Start Zookeeper on each of the three machines:
[root@master ~]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.11/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@slave1 ~]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.11/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@slave2 ~]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.11/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
- Once Zookeeper is running, check its status on each of the three machines with the following command:
[root@master ~]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.11/bin/../conf/zoo.cfg
Mode: follower
[root@slave1 ~]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.11/bin/../conf/zoo.cfg
Mode: leader
[root@slave2 ~]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.11/bin/../conf/zoo.cfg
Mode: follower
- Start HDFS on master:
[root@master ~]# start-dfs.sh
- Start YARN on master:
[root@master ~]# start-yarn.sh
- Start the second ResourceManager separately on slave2:
[root@slave2 ~]# yarn-daemon.sh start resourcemanager
- Finally, run jps on each of the three machines to confirm that all daemons have started:
[root@master ~]# jps
35872 DataNode
28868 QuorumPeerMain
37799 NodeManager
36216 JournalNode
35641 NameNode
36521 DFSZKFailoverController
37625 ResourceManager
40138 Jps
[root@slave1 ~]# jps
35266 JournalNode
35028 DataNode
28213 QuorumPeerMain
38773 Jps
36650 NodeManager
34843 NameNode
35517 DFSZKFailoverController
[root@slave2 ~]# jps
28689 QuorumPeerMain
38953 Jps
36874 NodeManager
38812 ResourceManager
35614 JournalNode
35343 DataNode
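The jps check above can be automated. The following is a minimal sketch, not part of the original setup: the function names `expected_daemons` and `missing_daemons` are hypothetical, and the expected process names per host are taken from the deployment table and the jps listings shown here.

```shell
#!/bin/sh
# Expected JVM process names per host, taken from the deployment table
# and the jps listings in this walkthrough.
expected_daemons() {
  common="QuorumPeerMain JournalNode NodeManager DataNode"
  case "$1" in
    master) echo "$common NameNode DFSZKFailoverController ResourceManager" ;;
    slave1) echo "$common NameNode DFSZKFailoverController" ;;
    slave2) echo "$common ResourceManager" ;;
    *) echo "unknown host: $1" >&2; return 1 ;;
  esac
}

# Read jps output on stdin and print every expected daemon that is missing.
# Prints nothing when all expected daemons for the host are present.
missing_daemons() {
  host="$1"
  # Second column of jps output is the process name.
  running="$(awk '{print $2}')"
  for d in $(expected_daemons "$host"); do
    # -x forces a whole-line match so that names cannot match as substrings.
    printf '%s\n' "$running" | grep -qx "$d" || echo "$d"
  done
}
```

On each machine, something like `jps | missing_daemons master` (with the matching hostname) would then list whatever failed to start, and print nothing if the host is healthy.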
If one of the NameNode processes dies, restart it on its own by running the following command on the affected machine:
hadoop-daemon.sh start namenode
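To avoid issuing the start command when a NameNode is already running, the restart can be guarded with a jps check. This is only a sketch under that assumption; the function names `namenode_running` and `ensure_namenode` are hypothetical helpers, while `jps` and `hadoop-daemon.sh start namenode` are the commands used in this walkthrough.

```shell
#!/bin/sh
# Succeed if the jps output on stdin lists a NameNode process.
namenode_running() {
  awk '$2 == "NameNode" { found = 1 } END { exit !found }'
}

# Start the NameNode on this host only when it is not already running.
ensure_namenode() {
  if jps | namenode_running; then
    echo "NameNode already running"
  else
    hadoop-daemon.sh start namenode
  fi
}
```

After a restart, whether the NameNode is active or standby can be checked with `hdfs haadmin -getServiceState <id>`, where `<id>` is the NameNode ID defined in your own hdfs-site.xml (not shown in this walkthrough).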
The Hadoop HA cluster is stopped as follows:
- Stop HDFS on master:
[root@master ~]# stop-dfs.sh
- Stop YARN on master:
[root@master ~]# stop-yarn.sh
- Stop the ResourceManager separately on slave2:
[root@slave2 ~]# yarn-daemon.sh stop resourcemanager
- Stop Zookeeper on each of the three machines:
[root@master ~]# zkServer.sh stop
[root@slave1 ~]# zkServer.sh stop
[root@slave2 ~]# zkServer.sh stop
- Finally, run jps on each of the three machines to confirm that the related processes have stopped:
[root@master ~]# jps
45307 Jps
[root@slave1 ~]# jps
40713 Jps
[root@slave2 ~]# jps
42108 Jps
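The final check can also be scripted. The sketch below assumes that a cleanly stopped host shows only the `Jps` process itself, as in the listings above; the function name `leftover_processes` is a hypothetical helper.

```shell
#!/bin/sh
# Read jps output on stdin and print the name of every process other than
# Jps itself. Prints nothing when the host has been stopped cleanly.
leftover_processes() {
  awk 'NF >= 2 && $2 != "Jps" { print $2 }'
}
```

Running `jps | leftover_processes` on each machine would then print any daemon that is still alive. Note that it reports every remaining JVM process, including ones unrelated to Hadoop or Zookeeper.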