Hadoop Single-Node Deployment (Part 2): HBase

HBase

Apache HBase™ is the Hadoop database: a distributed, scalable, big data store.

Use Apache HBase™ when you need random, real-time read/write access to big data. The project's goal is to host very large tables (billions of rows × millions of columns) on commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable (the distributed storage system for structured data described by Chang et al.). Just as Bigtable builds on the distributed storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of HDFS.

Features

  • Linear and modular scalability.
  • Strictly consistent reads and writes.
  • Automatic and configurable sharding of tables.
  • Automatic failover support between RegionServers.
  • Convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables.
  • Easy-to-use Java API for client access.
  • Block cache and Bloom filters for real-time queries.
  • Query predicate push down via server-side filters.
  • Thrift gateway and a REST-ful Web service that supports XML, Protobuf, and binary data encoding options.
  • Extensible JRuby-based (JIRB) shell.
  • Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia, or via JMX.

Prerequisites

https://hbase.apache.org/book.html#basic.prerequisites

[root@hadoop opt]# useradd hbase
[root@hadoop opt]# chown -R hbase:hbase hbase-2.2.5*
[root@hadoop opt]# ll
total 0
drwxr-xr-x. 11 hive   hive   221 Oct 22 22:24 apache-hive-3.1.2-bin
drwxr-xr-x. 12 hadoop hadoop 188 Oct 21 06:23 hadoop-3.1.4
drwxr-xr-x.  6 hbase  hbase  170 Oct 23 01:14 hbase-2.2.5
drwxr-xr-x.  5 hbase  hbase  149 Oct 23 01:14 hbase-2.2.5-client
[root@hadoop opt]# su - hbase
[hbase@hadoop ~]$ 

Java

Edit conf/hbase-env.sh and set JAVA_HOME:

# The java implementation to use.  Java 1.8+ required.
# export JAVA_HOME=/usr/java/jdk1.8.0/
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_202

ssh

HBase uses the ssh command and utilities extensively to communicate between cluster nodes. Every server in the cluster must be running ssh so that the Hadoop and HBase daemons can be managed. You must be able to log in to all nodes, including localhost, from the Master and from any backup Master using a key rather than a password.
On Linux or Unix systems, see "Procedure: Configure Passwordless SSH Access".
On OS X, see SSH: Setting up Remote Desktop and Enabling Self-Login.

DNS

HBase uses the local hostname to report its own address.

NTP

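The clocks on cluster nodes should be synchronized. A small amount of skew is acceptable, but larger skew causes erratic and unpredictable behavior. If you see unexplained problems in the cluster, check time synchronization first. Run an NTP service, or another time-synchronization mechanism, on every node, and have all nodes synchronize against the same time source. See Basic NTP Configuration for setup details.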

Limits on Number of Files and Processes (ulimit)

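Apache HBase is a database, so it needs to be able to open a large number of files at once. Many Linux distributions limit the number of files a single user may open, for example to 1024 (or 256 on older versions of OS X). To check the limit, log in as the user that runs HBase and run ulimit -n. See the Troubleshooting section for the problems caused by a limit that is too low. You may also see errors such as the following: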

2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Exception increateBlockOutputStream java.io.EOFException
2010-04-06 03:04:37,542 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-6935524980745310745_1391901

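Raise the limit to at least 10,000, or preferably 10,240, because the value is usually expressed in multiples of 1024. Each ColumnFamily has at least one StoreFile, and possibly more than six if the region is under load. The number of open files depends on the number of ColumnFamilies and the number of regions. The following is a rough formula for the number of files a RegionServer has open:

(ColumnFamilies per region) x (StoreFiles per ColumnFamily) x (regions per RegionServer)

For example, assume a schema with 3 ColumnFamilies per region, 3 StoreFiles per ColumnFamily, and 100 regions per RegionServer. The JVM will then open 3 * 3 * 100 = 900 files, not counting open JAR files, configuration files, and so on. Opening a file does not take many resources, so the risk of letting a user open many files is small.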
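The arithmetic in the example above (3 * 3 * 100) can be sketched as a small shell function. The name estimate_open_files is made up for this illustration and is not part of HBase; the result is only the rough StoreFile count, not a full accounting of file descriptors.

```shell
# Rough estimate of the StoreFiles a RegionServer may hold open.
# Arguments: ColumnFamilies per region, StoreFiles per ColumnFamily,
#            regions per RegionServer.
# Hypothetical helper for illustration; not an HBase tool.
estimate_open_files() {
  local cfs_per_region=$1 storefiles_per_cf=$2 regions_per_rs=$3
  echo $(( cfs_per_region * storefiles_per_cf * regions_per_rs ))
}

# The example from the text: 3 ColumnFamilies, 3 StoreFiles each, 100 regions.
estimate_open_files 3 3 100
```

Comparing the result with the output of ulimit -n gives a quick sense of how much headroom the configured limit leaves.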

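Another related setting is the number of processes a user is allowed to run at once. On Linux and Unix, this is set with the ulimit -u command. Do not confuse it with the nproc command, which reports the number of processing units available to the current process. Under load, a ulimit -u that is too low can cause OutOfMemoryError exceptions.

The maximum number of file descriptors and processes for the user that runs HBase is an operating system setting, not an HBase setting. It is important to make sure the limits are changed for the user that actually runs HBase. To see which user started HBase, and that user's ulimit configuration, look at the first line of the HBase log for that instance.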
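A minimal sketch of such a check, meant to be run as the user that starts HBase (hbase in this walkthrough). The helper name check_limit is hypothetical, and the threshold 10240 is simply the figure suggested above for open files; the text gives no minimum for processes, so that value is only printed.

```shell
# Compare a limit against a recommended minimum.
# Prints "ok" when the limit is "unlimited" or >= the minimum, else "low".
# check_limit is a hypothetical helper, not part of HBase.
check_limit() {
  local current=$1 minimum=$2
  if [ "$current" = "unlimited" ] || [ "$current" -ge "$minimum" ]; then
    echo "ok"
  else
    echo "low"
  fi
}

# Open-file limit of the current user, judged against 10240.
echo "open files (ulimit -n): $(ulimit -n) -> $(check_limit "$(ulimit -n)" 10240)"
# Process limit of the current user (bash syntax), shown for reference only.
echo "max user processes (ulimit -u): $(ulimit -u)"
```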

Example 1. ulimit Settings on Ubuntu

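To configure ulimit settings on Ubuntu, edit /etc/security/limits.conf, a space-delimited file with four columns. See the man page for limits.conf for details. In the following example, the first line sets both the soft and hard limits on the number of open files (nofile) to 32768 for the user hadoop, and the second line sets the number of processes (nproc) for the same user to 32000. Substitute the user that runs HBase on your system.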

hadoop  -       nofile  32768
hadoop  -       nproc   32000

These settings take effect only if the Pluggable Authentication Module (PAM) environment is told to use them. To configure PAM to apply these limits, make sure /etc/pam.d/common-session contains the following line:

session required  pam_limits.so

Linux Shell

All HBase scripts rely on the GNU Bash shell.

Windows

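Running production systems on Windows machines is not recommended.

HBase has two run modes: standalone and distributed. Out of the box, HBase runs in standalone mode. Whichever mode you use, you need to configure HBase by editing the files in the conf directory. At a minimum, you must edit conf/hbase-env.sh to tell HBase which java to use. In this file you can also set HBase environment variables such as the heap size and other JVM options, the location of log files, and so on. Set JAVA_HOME to the root of your Java installation.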

Single node

Standalone
This is the default mode, and it is described in the quickstart section. In standalone mode, HBase does not use HDFS; it uses the local filesystem instead, and it runs all HBase daemons and a local ZooKeeper in the same JVM. ZooKeeper binds to a well-known port so that clients can connect to HBase.

Standalone HBase over HDFS
A common variation on standalone HBase runs all daemons in one JVM but persists data to HDFS rather than to the local filesystem.

To configure this standalone variant, edit hbase-site.xml and set hbase.rootdir to point at a directory in your HDFS instance, but set hbase.cluster.distributed to false. For example:

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://namenode.example.org:8020/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>false</value>
  </property>
</configuration>
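To double-check what a given hbase-site.xml actually sets, a small helper along these lines can read a property back. This is a sketch only: get_property is a made-up name, and the awk program assumes the simple layout shown above, with each <name> and <value> element on its own line. It is not a general XML parser.

```shell
# Print the <value> that follows a given <name> in a Hadoop-style XML file.
# Usage: get_property FILE PROPERTY_NAME
# Assumes <name> and <value> each sit on a single line, as in the example above.
# get_property is a hypothetical helper, not part of HBase.
get_property() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, "")      # drop everything up to the opening tag
      sub(/<\/value>.*/, "")    # drop the closing tag and anything after it
      print
      exit
    }
  ' "$1"
}
```

For the configuration above, get_property conf/hbase-site.xml hbase.cluster.distributed prints false, which is what keeps this variant in standalone mode.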

https://hbase.apache.org/book.html#distributed
Distributed mode can be subdivided into pseudo-distributed, where all daemons run on a single server, and fully-distributed, where the daemons are spread across the nodes of the cluster. The pseudo-distributed vs. fully-distributed naming comes from Hadoop.

Pseudo-distributed mode can run against the local filesystem or against HDFS. Fully-distributed mode can only run on HDFS. See the Hadoop documentation for how to set up HDFS. A recommended walk-through for setting up HDFS on Hadoop 2 is http://www.alexjf.net/blog/distributed-systems/hadoop-yarn-installation-definitive-guide.

Pseudo-distributed

https://hbase.apache.org/book.html#pseudo
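Pseudo-distributed mode is simply fully-distributed mode run on a single host. Use this HBase configuration for testing and prototyping only; do not use it for production or for performance evaluation.

After trying the standalone mode above, we now reconfigure HBase to run in pseudo-distributed mode. Pseudo-distributed means that HBase still runs entirely on one server, but each HBase daemon (HMaster, HRegionServer, and ZooKeeper) runs as a separate process, whereas in standalone mode all daemons run in a single JVM process. By default, unless you configure the hbase.rootdir property, your data is stored under /tmp/. Here we configure HBase to store its data in HDFS. You can also skip the HDFS configuration and keep the data on the local filesystem.

  1. Create the HDFS directory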
[hadoop@hadoop hadoop-3.1.4]$ bin/hdfs dfs -mkdir /apps
[hadoop@hadoop hadoop-3.1.4]$ bin/hdfs dfs -chmod 777 /apps
[hadoop@hadoop hadoop-3.1.4]$ bin/hdfs dfs -ls /
Found 2 items
drwxrwxrwx   - hadoop supergroup          0 2020-10-27 00:08 /apps
drwxr-xr-x   - hadoop supergroup          0 2020-10-26 21:46 /user
  2. Configure HBase
    Edit hbase-site.xml:
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://10.0.31.65:9000/apps/hbase</value>
  </property>
  <property>
    <name>hbase.unsafe.stream.capability.enforce</name>
    <value>false</value>
  </property>

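You do not need to create the hbase.rootdir directory itself (/apps/hbase here) in HDFS; HBase creates it automatically. If you create it yourself, HBase will treat the startup as a migration.

Remove any existing hbase.tmp.dir and hbase.unsafe.stream.capability.enforce settings left over from the standalone configuration. (The hbase-site.xml shown above nevertheless keeps hbase.unsafe.stream.capability.enforce set to false.)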

  3. Start HBase
    Use the bin/start-hbase.sh command to start HBase. If your system is configured correctly, the jps command shows the HMaster and HRegionServer processes running.
[hbase@hadoop hbase-2.2.5]$ bin/start-hbase.sh 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop-3.1.4/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hbase-2.2.5/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop-3.1.4/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hbase-2.2.5/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
localhost: running zookeeper, logging to /opt/hbase-2.2.5/bin/../logs/hbase-hbase-zookeeper-hadoop.out
running master, logging to /opt/hbase-2.2.5/bin/../logs/hbase-hbase-master-hadoop.out
: regionserver running as process 6509. Stop it first.   

[hbase@hadoop hbase-2.2.5]$ jps 
9507 HQuorumPeer
9574 HMaster
6509 HRegionServer
9935 Jps
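The jps check can also be scripted. The sketch below reads jps output on standard input and reports, for each of the three daemons seen in the listing above, whether it is present. The function name check_hbase_daemons is invented for this example.

```shell
# Report which of the expected pseudo-distributed daemons appear in jps output.
# Reads "PID Name" lines on stdin and prints one "Name: running|missing" line
# per daemon. check_hbase_daemons is a hypothetical helper, not an HBase command.
check_hbase_daemons() {
  local jps_output daemon
  jps_output=$(cat)
  for daemon in HQuorumPeer HMaster HRegionServer; do
    # awk exits 0 only if some line has the daemon name as its second field.
    if printf '%s\n' "$jps_output" |
        awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }'; then
      echo "$daemon: running"
    else
      echo "$daemon: missing"
    fi
  done
}
```

Typical use is jps | check_hbase_daemons, run as the hbase user so that jps can see the HBase JVMs.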

  4. Check the HBase directory created in HDFS
[hadoop@hadoop hadoop-3.1.4]$ bin/hdfs dfs -ls /
Found 2 items
drwxrwxrwx   - hadoop supergroup          0 2020-10-27 00:19 /apps
drwxr-xr-x   - hadoop supergroup          0 2020-10-26 21:46 /user
[hadoop@hadoop hadoop-3.1.4]$ bin/hdfs dfs -ls /apps
Found 1 items
drwxr-xr-x   - hbase supergroup          0 2020-10-27 01:12 /apps/hbase
[hadoop@hadoop hadoop-3.1.4]$ bin/hdfs dfs -ls /apps/hbase
Found 12 items
drwxr-xr-x   - hbase supergroup          0 2020-10-27 00:19 /apps/hbase/.hbck
drwxr-xr-x   - hbase supergroup          0 2020-10-27 01:12 /apps/hbase/.tmp
drwxr-xr-x   - hbase supergroup          0 2020-10-27 01:12 /apps/hbase/MasterProcWALs
drwxr-xr-x   - hbase supergroup          0 2020-10-27 01:12 /apps/hbase/WALs
drwxr-xr-x   - hbase supergroup          0 2020-10-27 00:19 /apps/hbase/archive
drwxr-xr-x   - hbase supergroup          0 2020-10-27 00:19 /apps/hbase/corrupt
drwxr-xr-x   - hbase supergroup          0 2020-10-27 01:12 /apps/hbase/data
-rw-r--r--   1 hbase supergroup         42 2020-10-27 00:19 /apps/hbase/hbase.id
-rw-r--r--   1 hbase supergroup          7 2020-10-27 00:19 /apps/hbase/hbase.version
drwxr-xr-x   - hbase supergroup          0 2020-10-27 00:19 /apps/hbase/mobdir
drwxr-xr-x   - hbase supergroup          0 2020-10-27 01:12 /apps/hbase/oldWALs
drwx--x--x   - hbase supergroup          0 2020-10-27 00:19 /apps/hbase/staging
[hadoop@hadoop hadoop-3.1.4]$ 

  5. Create a table and populate it with data
    You can use the HBase shell to create a table, populate it with data, and scan and get values from it, following the same procedure as in the shell exercises.
[hbase@hadoop hbase-2.2.5]$ bin/hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop-3.1.4/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hbase-2.2.5/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.2.5, rf76a601273e834267b55c0cda12474590283fd4c, Thu May 21 18:34:40 CST 2020
Took 0.0042 seconds                                                                                                                                                                                                                                                           
hbase(main):001:0> crate 'test','cf'
NoMethodError: undefined method `crate' for main:Object
Did you mean?  create

hbase(main):002:0> create 'test','cf'
Created table test
Took 2.9205 seconds                                                                                                                                                                                                                                                           
=> Hbase::Table - test
hbase(main):003:0> list 'test'
TABLE                                                                                                                                                                                                                                                                         
test                                                                                                                                                                                                                                                                          
1 row(s)
Took 0.0533 seconds                                                                                                                                                                                                                                                           
=> ["test"]
hbase(main):004:0> describe 'test'
Table test is ENABLED                                                                                                                                                                                                                                                         
test                                                                                                                                                                                                                                                                          
COLUMN FAMILIES DESCRIPTION                                                                                                                                                                                                                                                   
{NAME => 'cf', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER 
=> 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}                                                                     

1 row(s)

QUOTAS                                                                                                                                                                                                                                                                        
0 row(s)
Took 0.3261 seconds                                                                                                                                                                                                                                                           
hbase(main):005:0> put 'test', 'row1', 'cf:a', 'value1'
Took 0.0904 seconds                                                                                                                                                                                                                                                           
hbase(main):006:0> put 'test', 'row2', 'cf:b', 'value2'
Took 0.0208 seconds                                                                                                                                                                                                                                                           
hbase(main):007:0> put 'test', 'row3', 'cf:c', 'value3'
Took 0.0090 seconds                                                                                                                                                                                                                                                           
hbase(main):008:0> scan 'test'
ROW                                                                  COLUMN+CELL                                                                                                                                                                                              
 row1                                                                column=cf:a, timestamp=1603732800927, value=value1                                                                                                                                                       
 row2                                                                column=cf:b, timestamp=1603732808455, value=value2                                                                                                                                                       
 row3                                                                column=cf:c, timestamp=1603732815117, value=value3                                                                                                                                                       
3 row(s)
Took 0.0355 seconds                                                                                                                                                                                                                                                           
hbase(main):009:0> get 'test', 'row1'
COLUMN                                                               CELL                                                                                                                                                                                                     
 cf:a                                                                timestamp=1603732800927, value=value1                                                                                                                                                                    
1 row(s)
Took 0.0210 seconds                                                                                                                                                                                                                                                           
hbase(main):010:0> drop 'test'

ERROR: Table test is enabled. Disable it first.

For usage try 'help "drop"'

Took 0.0652 seconds                                                                                                                                                                                                                                                           
hbase(main):011:0> disable 'test'
Took 1.3450 seconds                                                                                                                                                                                                                                                           
hbase(main):012:0> enable 'test'
Took 0.7684 seconds                                                                                                                                                                                                                                                           
hbase(main):013:0> disable 'test'
Took 0.7596 seconds                                                                                                                                                                                                                                                           
hbase(main):014:0> drop 'test'
Took 0.4805 seconds                                                                                                                                                                                                                                                           
hbase(main):015:0> exit
[hbase@hadoop hbase-2.2.5]$ 
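The same sequence can be replayed without typing it interactively. As a sketch, the function below only prints the shell commands used in the session above for a given table and column family; hbase_smoke_script is a hypothetical name. Note that it disables the table before dropping it, which is what the "Disable it first" error in the session requires.

```shell
# Emit the HBase shell commands from the session above for a given table.
# Usage: hbase_smoke_script TABLE COLUMN_FAMILY
# hbase_smoke_script is a hypothetical helper; it only prints text.
hbase_smoke_script() {
  local table=$1 family=$2
  cat <<EOF
create '$table', '$family'
put '$table', 'row1', '$family:a', 'value1'
scan '$table'
get '$table', 'row1'
disable '$table'
drop '$table'
EOF
}

# Preview the commands for the table used in this walkthrough.
hbase_smoke_script test cf
```

To run the commands rather than preview them, the output can be piped into the shell, for example hbase_smoke_script test cf | bin/hbase shell -n. The -n (non-interactive) option is assumed here from the HBase reference guide and was not exercised in this walkthrough, so check it against your HBase version.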

  6. Start and stop a backup HBase Master (HMaster) server

Running multiple HMaster instances on the same hardware makes no sense in production, just as running a pseudo-distributed cluster makes no sense in production.

This step is for testing and learning purposes only.

The HMaster server controls the HBase cluster. You can start up to 9 backup HMaster servers, which makes 10 HMasters in total, counting the primary. Use local-master-backup.sh to start a backup HMaster. For each backup master you want to start, add a parameter giving its port offset relative to the primary master. Each HMaster uses two ports (16000 and 16010 by default). The offset is added to both ports, so with an offset of 2 the backup HMaster uses ports 16002 and 16012. The following command starts 3 backup servers using ports 16002/16012, 16003/16013, and 16005/16015.

$ ./bin/local-master-backup.sh start 2 3 5

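To kill a backup master without killing the entire cluster, you first need to find its process ID (PID). The PID is stored in a file with a name like /tmp/hbase-USER-X-master.pid, and the PID is the only content of the file. You can then kill that PID with kill -9. The following command kills the master with port offset 1 and leaves the rest of the cluster running: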

$ cat /tmp/hbase-testuser-1-master.pid |xargs kill -9
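The port-offset arithmetic for backup masters can be sketched as follows. The function local_ports is a hypothetical helper that only mirrors the addition described above; it does not start or inspect any HBase process.

```shell
# Print the two ports a local instance uses for each given offset.
# Usage: local_ports BASE_PORT BASE_INFO_PORT OFFSET...
# local_ports is a hypothetical helper that only does the arithmetic.
local_ports() {
  local base_port=$1 base_info_port=$2 offset
  shift 2
  for offset in "$@"; do
    echo "offset $offset: $(( base_port + offset ))/$(( base_info_port + offset ))"
  done
}

# Backup masters started with "local-master-backup.sh start 2 3 5",
# using the default HMaster ports 16000 and 16010.
local_ports 16000 16010 2 3 5
```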
  7. Start and stop additional RegionServers
    The HRegionServer manages the data in its StoreFiles as directed by the HMaster. Generally, one HRegionServer runs on each node in the cluster. Running multiple HRegionServers on the same system can be useful for testing in pseudo-distributed mode. The local-regionservers.sh command lets you run multiple RegionServers. It works much like local-master-backup.sh: each parameter you provide is the port offset for one instance. Each RegionServer requires two ports, 16020 and 16030 by default. Since HBase version 1.1.0, the HMaster no longer uses the region server ports, which leaves 10 ports (16020 to 16029 and 16030 to 16039) available for RegionServers. To support more RegionServers, set the environment variables HBASE_RS_BASE_PORT and HBASE_RS_INFO_BASE_PORT to appropriate values before running local-regionservers.sh. With base ports of 16200 and 16300, for example, one server can support 99 additional RegionServers. The following command starts four additional RegionServers on sequential ports starting at 16022/16032 (base ports 16020/16030 plus 2).
$ ./bin/local-regionservers.sh start 2 3 4 5

To stop a RegionServer manually, run local-regionservers.sh with the stop parameter and the port offset of the server to stop.

$ ./bin/local-regionservers.sh stop 3
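To see which ports a set of RegionServer offsets would occupy, including the effect of the two base-port environment variables, a sketch like the following may help. The variable and function names (rs_base_port, rs_info_base_port, rs_ports) are invented for this example; only HBASE_RS_BASE_PORT and HBASE_RS_INFO_BASE_PORT come from the text.

```shell
# Default RegionServer base ports from the text; the two environment
# variables override them when set.
rs_base_port=${HBASE_RS_BASE_PORT:-16020}
rs_info_base_port=${HBASE_RS_INFO_BASE_PORT:-16030}

# Print the port pair an additional RegionServer would use for one offset.
# rs_ports is a hypothetical helper that only does the arithmetic.
rs_ports() {
  echo "$(( rs_base_port + $1 ))/$(( rs_info_base_port + $1 ))"
}

# The four extra RegionServers started with offsets 2 3 4 5.
for offset in 2 3 4 5; do
  echo "offset $offset -> $(rs_ports "$offset")"
done
```

With the base ports set to 16200 and 16300, offset 99 maps to 16299/16399, which is consistent with the figure of 99 additional RegionServers quoted above.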
  8. Stop HBase
    Use the bin/stop-hbase.sh command to stop HBase.

Fully distributed

https://hbase.apache.org/book.html#fully_dist
By default, HBase runs in stand-alone mode. Both stand-alone mode and pseudo-distributed mode are provided for the purposes of small-scale testing. For a production environment, distributed mode is advised. In distributed mode, multiple instances of HBase daemons run on multiple servers in the cluster.

Just as in pseudo-distributed mode, a fully distributed configuration requires that you set the hbase.cluster.distributed property to true. Typically, the hbase.rootdir is configured to point to a highly-available HDFS filesystem.

In addition, the cluster is configured so that multiple cluster nodes enlist as RegionServers, ZooKeeper QuorumPeers, and backup HMaster servers. These configuration basics are all demonstrated in quickstart-fully-distributed.

Distributed RegionServers

Typically, your cluster will contain multiple RegionServers all running on different servers, as well as primary and backup Master and ZooKeeper daemons. The conf/regionservers file on the master server contains a list of hosts whose RegionServers are associated with this cluster. Each host is on a separate line. All hosts listed in this file will have their RegionServer processes started and stopped when the master server starts or stops.

ZooKeeper and HBase

See the ZooKeeper section for ZooKeeper setup instructions for HBase.

Example Distributed HBase Cluster
This is a bare-bones conf/hbase-site.xml for a distributed HBase cluster. A cluster that is used for real-world work would contain more custom configuration parameters. Most HBase configuration directives have default values, which are used unless the value is overridden in the hbase-site.xml. See "Configuration Files" for more information.

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://namenode.example.org:8020/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>node-a.example.com,node-b.example.com,node-c.example.com</value>
  </property>
</configuration>

This is an example conf/regionservers file, which contains a list of nodes that should run a RegionServer in the cluster. These nodes need HBase installed and they need to use the same contents of the conf/ directory as the Master server.

node-a.example.com
node-b.example.com
node-c.example.com

This is an example conf/backup-masters file, which contains a list of each node that should run a backup Master instance. The backup Master instances will sit idle unless the main Master becomes unavailable.

node-b.example.com
node-c.example.com

Distributed HBase Quickstart

See quickstart-fully-distributed for a walk-through of a simple three-node cluster configuration with multiple ZooKeeper, backup HMaster, and RegionServer instances.

Procedure: HDFS Client Configuration

  1. Of note, if you have made HDFS client configuration changes on your Hadoop cluster, such as configuration directives for HDFS clients, as opposed to server-side configurations, you must use one of the following methods to enable HBase to see and use these configuration changes:

    1. Add a pointer to your HADOOP_CONF_DIR to the HBASE_CLASSPATH environment variable in hbase-env.sh, or

    2. Add a copy of hdfs-site.xml (or hadoop-site.xml) or, better, symlinks to it, under ${HBASE_HOME}/conf, or

    3. If you have only a small set of HDFS client configurations, add them to hbase-site.xml.

An example of such an HDFS client configuration is dfs.replication. If for example, you want to run with a replication factor of 5, HBase will create files with the default of 3 unless you do the above to make the configuration available to HBase.
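The first method could look like the following lines in conf/hbase-env.sh. This is a sketch: the fallback path /opt/hadoop-3.1.4/etc/hadoop is an assumption based on the Hadoop install directory used earlier in this walkthrough, so substitute the directory that holds your own hdfs-site.xml.

```shell
# Assumed location of the Hadoop client configuration; adjust to your install.
HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/opt/hadoop-3.1.4/etc/hadoop}"

# Append the Hadoop conf dir to any existing HBASE_CLASSPATH rather than
# overwriting it, so HBase picks up HDFS client settings such as dfs.replication.
export HBASE_CLASSPATH="${HBASE_CLASSPATH:+$HBASE_CLASSPATH:}$HADOOP_CONF_DIR"
```

HBase needs a restart after hbase-env.sh changes before the new classpath takes effect.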
