Using HBase

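HBase is a column-oriented distributed database built on top of HDFS. Use it when you need real-time random access to very large datasets. This article covers the basics of HBase: installation and configuration, deployment and startup, creating tables, and using the Java API.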

I. Installation

1.1 Download HBase
http://www.apache.org/dyn/closer.cgi/hbase/
Download the version you need.

1.2 Install HBase
Extract the downloaded archive into the installation directory.

tar -zxvf <archive> -C <install-dir>

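1.3 Configure HBase
HBase has two run modes: standalone and distributed. Distributed mode is further divided into pseudo-distributed mode (all daemons run on a single node) and fully distributed mode (the daemons run across a cluster of physical servers).
Distributed mode depends on the Hadoop Distributed File System (HDFS), so make sure a suitable, working HDFS cluster is available before you configure anything. Here we use fully distributed mode.
1) Go to the HBase installation directory and edit conf/hbase-site.xml as follows: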

<property>
    <name>hbase.rootdir</name> 
    <value>hdfs://probd/hbase</value> 
    <description>probd is the cluster name of hdfs.</description> 
</property>

<property>
    <name>hbase.cluster.distributed</name> 
    <value>true</value> 
 </property>

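This adds the hbase.cluster.distributed property, set to true, and the hbase.rootdir property, set to the HDFS address that clients use to reach the NameNode (here the cluster name probd). HBase writes its data under the /hbase directory in HDFS.
2) Configure the region servers. Edit conf/regionservers, which lists every host that runs the HRegionServer daemon, one host per line (similar to the slaves file in Hadoop):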

probd01
probd02
probd03
probd04

When the HBase cluster starts or stops, the scripts go through the hosts listed in this file one by one.

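3) Configure ZooKeeper. Distributed HBase depends on a ZooKeeper cluster, and every node and client must be able to reach it. By default HBase manages a single-node ZooKeeper itself, so the start and stop scripts start and stop ZooKeeper as part of HBase. Alternatively, you can manage the ZooKeeper cluster independently of HBase and simply tell HBase which cluster to use. We take the second approach here and use an existing ZooKeeper cluster. To do so, set HBASE_MANAGES_ZK to false in conf/hbase-env.sh: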

export HBASE_MANAGES_ZK=false

Then edit hbase-site.xml to set the ZooKeeper connection addresses and client port:

        <property>
                <name>hbase.zookeeper.quorum</name> 
                <value>probd01:2181,probd02:2181,probd03:2181</value> 
        </property> 
        <property>
                <name>hbase.zookeeper.property.dataDir</name> 
                <value>/probd/zookeeper-3.4.6/data/data</value>
        </property>
        <property> 
                <name>hbase.zookeeper.property.clientPort</name>
                <value>2181</value>  
        </property>

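Replace the ZooKeeper addresses and port with the actual values for your own cluster.

Once this is done, copy the conf directory to the other nodes in the cluster so that every node has the same configuration.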

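1.4 About the configuration files
This subsection involves no further setup steps; it only describes the three files touched during configuration above.
1) hbase-site.xml
In Hadoop, HDFS-specific settings go into hdfs-site.xml. Similarly, HBase-specific settings go into conf/hbase-site.xml. The available parameters are listed in the source file hbase-default.xml under src/main/resources in the HBase tree, and the doc directory contains an HTML version of the same information. Not every parameter appears in hbase-default.xml, however: some rarely used ones exist only in the source code, so the only way to learn what they do is to read the source.
When a process starts, the server reads hbase-default.xml first and then hbase-site.xml; values in hbase-site.xml override those in hbase-default.xml.
2) hbase-env.sh
This file sets HBase environment variables, for example the JVM startup options for the HBase daemons, such as Java heap size and garbage collection policy. It can also set the HBase configuration directory, the log directory, SSH options, the directory for process pid files, and so on.
3) regionservers
This plain-text file lists the host names of all region servers, one per line. The HBase operations scripts iterate over the lines in order to start every region server process.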

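1.5 Deploy and run HBase
With HBase configured, the next step is to deploy it on the cluster. Before starting HBase, make sure HDFS is up and working. Because HBase is not managing ZooKeeper here (HBASE_MANAGES_ZK=false), ZooKeeper must also be running already; HBase starts ZooKeeper as part of its own startup only when it manages ZooKeeper itself. Then go to the bin directory and run the startup script: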

start-hbase.sh

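HBase writes its runtime logs to the logs subdirectory of the installation directory. Next we look at how to create a table, add data, scan the inserted data, disable the table, and drop it.

II. Basic HBase Operations

First, start the interactive HBase shell:

$HBASE_HOME/bin/hbase shell

Now create a simple table and add a few rows: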

hbase(main):002:0> create 'testtable','colfam1'
0 row(s) in 0.2930 seconds
hbase(main):003:0> list 'testtable'
TABLE
testtable
1 row(s) in 0.0520 seconds
hbase(main):004:0> put 'testtable','myrow-1','colfam1:q1','value-1'
0 row(s) in 0.1020 seconds
hbase(main):005:0> put 'testtable','myrow-2','colfam1:q2','value-2'
0 row(s) in 0.0410 seconds
hbase(main):006:0> put 'testtable','myrow-2','colfam1:q3','value-3'
0 row(s) in 0.0380 seconds

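With a single command we created a table with one column family, and the list command confirms that the table exists. We then stored some data, using two different row keys, myrow-1 and myrow-2, to add it to two different rows. The column family colfam1 alone does not make a column: an arbitrary qualifier must be appended to form an actual column, such as colfam1:q1, colfam1:q2, and colfam1:q3.
Next, check that the new data can be retrieved, using the scan command: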

hbase(main):007:0> scan 'testtable'
ROW          COLUMN+CELL
  myrow-1    column=colfam1:q1, timestamp=1559206303975, value=value-1
  myrow-2    column=colfam1:q2, timestamp=1559206304013, value=value-2
  myrow-2    column=colfam1:q3, timestamp=1559206304069, value=value-3

2 row(s) in 0.1400 seconds

HBase prints data in a cell-oriented way, one line per column. As expected, myrow-2 is printed twice, once for each of its columns, and each line also shows the actual value stored in that column.
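The cell-oriented output follows from the data model: a table maps a row key to a sorted set of family:qualifier columns, each holding a value. The sketch below models that layout with nested TreeMaps in plain Java. It is only an illustration under that assumption and does not use the HBase client; the class ToyTable and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy stand-in for an HBase table: row key -> (family:qualifier -> value).
// It models the layout only; it is not the HBase client API.
class ToyTable {
    private final TreeMap<String, TreeMap<String, String>> rows = new TreeMap<>();

    // Mirrors the shell's: put 'table', 'row', 'family:qualifier', 'value'
    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Mirrors the shell's scan: one output line per cell, rows in key order.
    List<String> scan() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, TreeMap<String, String>> row : rows.entrySet()) {
            for (Map.Entry<String, String> cell : row.getValue().entrySet()) {
                lines.add(row.getKey() + " column=" + cell.getKey()
                        + ", value=" + cell.getValue());
            }
        }
        return lines;
    }

    // Number of distinct row keys, which is what "N row(s)" counts.
    int rowCount() {
        return rows.size();
    }

    public static void main(String[] args) {
        ToyTable t = new ToyTable();
        t.put("myrow-1", "colfam1:q1", "value-1");
        t.put("myrow-2", "colfam1:q2", "value-2");
        t.put("myrow-2", "colfam1:q3", "value-3");
        t.scan().forEach(System.out::println); // three lines, one per cell
        System.out.println(t.rowCount() + " row(s)"); // 2 row(s)
    }
}
```

Three cells are printed but only two rows are counted, which matches the shell session: the row count refers to row keys, not to output lines.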
To fetch a single row, use the get command:

hbase(main):008:0>  get 'testtable','myrow-1'
COLUMN       CELL
 colfam1:q1  timestamp=1559206752410, value=value-1

1 row(s) in 0.0480 seconds

Deleting data is another basic operation. Here we delete one specific cell and then check that the data is really gone:

hbase(main):009:0>  delete 'testtable','myrow-2','colfam1:q2'
0 row(s) in 0.0390 seconds

hbase(main):010:0> scan 'testtable'
ROW          COLUMN+CELL
  myrow-1    column=colfam1:q1, timestamp=1559207110853, value=value-1
  myrow-2    column=colfam1:q3, timestamp=1559207110868, value=value-3

2 row(s) in 0.0620 seconds

The delete command did remove the cell from the table. Next, we drop the table:

hbase(main):011:0> disable 'testtable'
0 row(s) in 2.1250 seconds

hbase(main):012:0> drop 'testtable'
0 row(s) in 2878 seconds

A table must be disabled before it can be dropped.
Finally, type exit to close the shell and return to the command line:

hbase(main):013:0> exit
$ _

Last, run the stop-hbase.sh script to shut HBase down:

$HBASE_HOME/bin/stop-hbase.sh
stopping hbase..........

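Once the script is running, it prints a message saying that the cluster is stopping, followed by a "." character at regular intervals. The dots only show that the script is still running; they are not a progress indicator and carry no hidden information. Shutdown takes a few minutes, and possibly longer on a cluster with many machines. Always confirm that HBase has shut down cleanly before shutting down the Hadoop cluster.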

III. Using the HBase API

3.1 HBaseConfiguration: every HBase client uses this object, which represents the HBase configuration. It has two constructors:

/**
   * Instantiating HBaseConfiguration() is deprecated. Please use
   * HBaseConfiguration#create() to construct a plain Configuration
   * @deprecated Please use create() instead.
   */
  @Deprecated
  public HBaseConfiguration() {
    //TODO:replace with private constructor, HBaseConfiguration should not extend Configuration
    super();
    addHbaseResources(this);
    LOG.warn("instantiating HBaseConfiguration() is deprecated. Please use"
        + " HBaseConfiguration#create() to construct a plain Configuration");
  }

  /**
   * Instantiating HBaseConfiguration() is deprecated. Please use
   * HBaseConfiguration#create(conf) to construct a plain Configuration
   * @deprecated Please user create(conf) instead.
   */
  @Deprecated
  public HBaseConfiguration(final Configuration c) {
    //TODO:replace with private constructor
    this();
    merge(this, c);
  }

Both constructors are deprecated in recent versions and are replaced by two static create methods:

    /**
   * Creates a Configuration with HBase resources
   * @return a Configuration with HBase resources
   */
  public static Configuration create() {
    Configuration conf = new Configuration();
    // In case HBaseConfiguration is loaded from a different classloader than
    // Configuration, conf needs to be set with appropriate class loader to resolve
    // HBase resources.
    conf.setClassLoader(HBaseConfiguration.class.getClassLoader());
    return addHbaseResources(conf);
  }

  /**
   * @param that Configuration to clone.
   * @return a Configuration created with the hbase-*.xml files plus
   * the given configuration.
   */
  public static Configuration create(final Configuration that) {
    Configuration conf = create();
    merge(conf, that);
    return conf;
  }

Like the default constructor, create() loads the configuration from hbase-default.xml and hbase-site.xml through addHbaseResources:

    public static Configuration addHbaseResources(Configuration conf) {
        conf.addResource("hbase-default.xml");
        conf.addResource("hbase-site.xml");
        checkDefaultsVersion(conf);
        return conf;
    }
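The order of the two addResource calls is what makes hbase-site.xml win over hbase-default.xml. The following is a minimal sketch of that layering in plain Java, using java.util.Properties as a stand-in for Hadoop's Configuration. The class LayeredConfig and the method layer are hypothetical names, and the sample values are chosen for illustration only.

```java
import java.util.Properties;

// Sketch of "defaults first, site file second": later values override earlier ones.
// java.util.Properties stands in for Hadoop's Configuration here.
class LayeredConfig {
    static Properties layer(Properties defaults, Properties site) {
        Properties merged = new Properties();
        merged.putAll(defaults); // plays the role of hbase-default.xml
        merged.putAll(site);     // plays the role of hbase-site.xml, wins on conflicts
        return merged;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("hbase.cluster.distributed", "false");
        defaults.setProperty("hbase.zookeeper.property.clientPort", "2181");

        Properties site = new Properties();
        site.setProperty("hbase.cluster.distributed", "true");

        Properties conf = layer(defaults, site);
        // Overridden by the site layer
        System.out.println(conf.getProperty("hbase.cluster.distributed")); // true
        // Not set in the site layer, so the default survives
        System.out.println(conf.getProperty("hbase.zookeeper.property.clientPort")); // 2181
    }
}
```

Only keys that appear in the site layer change; everything else keeps its default, which is why hbase-site.xml normally contains just the handful of properties that differ from the defaults.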

If these two files are not on the classpath, you have to set the configuration yourself:

Configuration conf = new Configuration();
conf.set("hbase.zookeeper.quorum", "zkServer");
conf.set("hbase.zookeeper.property.clientPort", "2181");
// Use the static factory rather than the deprecated constructor
Configuration hbaseConf = HBaseConfiguration.create(conf);

3.2 Create a table
Before creating a table, connect to HBase:

Configuration conf = HBaseConfiguration.create();
// Create the connection to the (distributed) HBase cluster
Connection conn = ConnectionFactory.createConnection(conf);

Once connected, obtain an Admin instance from the Connection (in HBase 1.x and 2.x the Admin interface is implemented by HBaseAdmin):

// Get the Admin instance
Admin admin = conn.getAdmin();

Tables are created through the Admin object, which handles table META information. It provides these createTable methods:

public void createTable(TableDescriptor desc)
public void createTable(TableDescriptor desc, byte [] startKey,byte [] endKey, int numRegions)
public void createTable(final TableDescriptor desc, byte [][] splitKeys)

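Now let's create a table with three column families:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptor;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;

Configuration conf = HBaseConfiguration.create();
// Create the connection to the (distributed) HBase cluster
Connection conn = ConnectionFactory.createConnection(conf);
// The admin object is used for table management, such as creating tables
Admin admin = conn.getAdmin();

// Table name
TableName tableName = TableName.valueOf("user");
// Column families
// ColumnFamilyDescriptor represents the schema of a column family
ColumnFamilyDescriptor family1 = ColumnFamilyDescriptorBuilder.of("info1");
ColumnFamilyDescriptor family2 = ColumnFamilyDescriptorBuilder.of("info2");
ColumnFamilyDescriptor family3 = ColumnFamilyDescriptorBuilder.of("info3");

// TableDescriptor represents the schema of the table
TableDescriptor tableDesc = TableDescriptorBuilder.newBuilder(tableName)
       .addColumnFamily(family1)
       .addColumnFamily(family2)
       .addColumnFamily(family3)
       .build();

// Create the table
admin.createTable(tableDesc);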

3.3 Insert and update data

Table table = conn.getTable(TableName.valueOf("user"));
// The constructor argument is the row key, which is required
Put put = new Put(Bytes.toBytes("zhangsan_123"));
// Arguments, in order: column family, column qualifier, value
put.addColumn(Bytes.toBytes("info2"),Bytes.toBytes("name"),Bytes.toBytes("lisi"));
put.addColumn(Bytes.toBytes("info2"),Bytes.toBytes("age"),Bytes.toBytes(22 ));
put.addColumn(Bytes.toBytes("info2"),Bytes.toBytes("sex"),Bytes.toBytes("男"));
put.addColumn(Bytes.toBytes("info2"),Bytes.toBytes("address"),Bytes.toBytes("天堂" ));
table.put(put);
// table.put(List<Put>); // pass a List to write several Puts in one call

3.4 Query data

Get a single row

String rowKey = "zhangsan_123";
Get get = new Get(Bytes.toBytes(rowKey));
Result result = table.get(get);
byte[] address = result.getValue(Bytes.toBytes("info2"), Bytes.toBytes("address")); 
byte[] name = result.getValue(Bytes.toBytes("info2"), Bytes.toBytes("name")); 
byte[] sex = result.getValue(Bytes.toBytes("info2"), Bytes.toBytes("sex"));
byte[] age = result.getValue(Bytes.toBytes("info2"), Bytes.toBytes("age")); 
System.out.print(Bytes.toString(name) + ",");
System.out.print(Bytes.toString(sex) + ",");
System.out.print(Bytes.toString(address) + ",");
System.out.print(Bytes.toInt(age) + ",");
System.out.println();
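The age column is read back with Bytes.toInt rather than Bytes.toString because Bytes.toBytes(22) stored the number as a 4-byte big-endian integer, not as text. The sketch below reproduces that encoding with java.nio.ByteBuffer so it runs without the HBase client; the class IntCellEncoding and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Reproduces the 4-byte big-endian int layout with the JDK only.
class IntCellEncoding {
    // ByteBuffer is big-endian by default, so no byte order needs to be set.
    static byte[] encodeInt(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    static int decodeInt(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt();
    }

    public static void main(String[] args) {
        byte[] age = encodeInt(22);
        System.out.println(Arrays.toString(age)); // [0, 0, 0, 22]
        System.out.println(decodeInt(age));       // 22
        // The text "22" is a different byte sequence, so the two encodings
        // cannot be read back interchangeably.
        System.out.println(Arrays.toString("22".getBytes(StandardCharsets.UTF_8))); // [50, 50]
    }
}
```

The practical consequence is that a cell must be decoded with the same type it was written with.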

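Full table scan
Use an unrestricted full table scan with care. A scan can carry many kinds of filters, though, and with them it becomes very useful.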

Scan scan = new Scan();
ResultScanner resultScanner = table.getScanner(scan);
printResult(resultScanner);

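Range scan
Range scans are also very useful. Because row keys are sorted lexicographically, we can look up a given user's data for a given time span, for example user A's data from yesterday to today. A scan can also restrict what is returned, such as a single column family or specific columns, which speeds up the query.

Scan scan = new Scan();
scan.withStartRow(Bytes.toBytes("zhangsan_1232")); // start row (inclusive)
scan.withStopRow(Bytes.toBytes("zhangsan_12352")); // stop row (exclusive)
scan.addFamily(Bytes.toBytes("info2")); // return only this column family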
ResultScanner resultScanner = table.getScanner(scan);
printResult(resultScanner);
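To see which row keys such a range covers, it helps to model the table as a map sorted by unsigned byte order, with the start row inclusive and the stop row exclusive. The sketch below does this in plain Java (Java 9 or later, for Arrays.compareUnsigned). The class RangeScanSketch and the sample row keys are assumptions made for illustration, not part of the HBase API.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Models a range scan over row keys sorted as unsigned bytes.
class RangeScanSketch {
    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Returns the row keys in [startRow, stopRow), in sorted order.
    static List<String> scan(List<String> rowKeys, String startRow, String stopRow) {
        Comparator<byte[]> byteOrder = Arrays::compareUnsigned;
        TreeMap<byte[], String> table = new TreeMap<>(byteOrder);
        for (String key : rowKeys) {
            table.put(bytes(key), key);
        }
        return new ArrayList<>(
                table.subMap(bytes(startRow), true, bytes(stopRow), false).values());
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "zhangsan_124", "zhangsan_12352", "zhangsan_1235",
                "zhangsan_1234", "zhangsan_1232", "zhangsan_123");
        // Prints [zhangsan_1232, zhangsan_1234, zhangsan_1235]
        System.out.println(scan(keys, "zhangsan_1232", "zhangsan_12352"));
    }
}
```

Note that the comparison is byte by byte, not numeric: zhangsan_1234 sorts before zhangsan_12352 because '4' is smaller than '5' at the first differing position, and the stop row itself is not returned.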

Column value filter
The column value filter is another very common operation. It provides a query conditioned on a column's value, similar to where field = 'xxx' in SQL. This greatly extends the ways HBase can be queried; if lookups were possible only by row key, HBase would be unsuitable for many scenarios.

Scan scan = new Scan();
/*
* First argument:  column family
* Second argument: column qualifier
* Third argument:  an enum value
*              CompareOp.EQUAL  equal
*              CompareOp.LESS  less than
*              CompareOp.LESS_OR_EQUAL  less than or equal
*              CompareOp.NOT_EQUAL  not equal
*              CompareOp.GREATER_OR_EQUAL  greater than or equal
*              CompareOp.GREATER  greater than
* Fourth argument: the value to compare against
*/
SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter(Bytes.toBytes("info2"), Bytes.toBytes("name"), CompareFilter.CompareOp.GREATER_OR_EQUAL, Bytes.toBytes("zhangsan8"));
// This setting matters. When a row does not contain the column at all: with true the row is filtered out; with false (the default) the row is returned.
singleColumnValueFilter.setFilterIfMissing(true);
scan.setFilter(singleColumnValueFilter);

ResultScanner resultScanner = table.getScanner(scan);
printResult(resultScanner);
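The effect of setFilterIfMissing is easiest to see on a row that lacks the column altogether. The sketch below models the decision in plain Java; it is an illustration of the semantics described above, not the filter's real implementation. The class MissingColumnSketch and the method keep are hypothetical names, and String.compareTo stands in for the byte comparison (equivalent for ASCII values like these).

```java
import java.util.Collections;
import java.util.Map;

// Models "column >= minValue" with the filterIfMissing switch.
class MissingColumnSketch {
    // Returns true when the row would be returned to the client.
    static boolean keep(Map<String, String> row, String column,
                        String minValue, boolean filterIfMissing) {
        String value = row.get(column);
        if (value == null) {
            // Column absent: drop the row only when filterIfMissing is true.
            return !filterIfMissing;
        }
        return value.compareTo(minValue) >= 0; // GREATER_OR_EQUAL
    }

    public static void main(String[] args) {
        Map<String, String> match = Collections.singletonMap("info2:name", "zhangsan9");
        Map<String, String> noMatch = Collections.singletonMap("info2:name", "zhangsan1");
        Map<String, String> missing = Collections.singletonMap("info2:age", "22");

        System.out.println(keep(match, "info2:name", "zhangsan8", true));    // true
        System.out.println(keep(noMatch, "info2:name", "zhangsan8", true));  // false
        System.out.println(keep(missing, "info2:name", "zhangsan8", true));  // false
        System.out.println(keep(missing, "info2:name", "zhangsan8", false)); // true
    }
}
```

With the default of false, rows that simply do not have the column slip through the filter, which is rarely what a "where name >= ..." style query intends.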

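Column prefix filter
The column prefix filter ignores cell values; it only matches column names (qualifiers). For example, given two columns name1 and name2, the prefix name returns all data in both columns. This is similar to scan.addColumn, except that it matches across all column families.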

ColumnPrefixFilter columnPrefixFilter = new ColumnPrefixFilter(Bytes.toBytes("name"));
Scan scan = new Scan();
scan.setFilter(columnPrefixFilter);

ResultScanner resultScanner = table.getScanner(scan);
printResult(resultScanner);

Row key regex
Regular-expression matching on row keys is another frequently used HBase query.

// Match row keys that start with the given text
Filter rowKeyFilter = new RowFilter(CompareFilter.CompareOp.EQUAL, new RegexStringComparator("^zhangsan_1239"));
Scan scan = new Scan();
scan.setFilter(rowKeyFilter);

ResultScanner resultScanner = table.getScanner(scan);
printResult(resultScanner);

Combining filters
Sometimes we want to scan a given row key range and also require that address be Shenzhen, or add even more conditions. In that case several filters have to be combined.

/*
* Note the Operator argument, an enum with two values:
*   Operator.MUST_PASS_ALL   every filter must pass, i.e. AND (&&)
*   Operator.MUST_PASS_ONE   any one filter passing is enough, i.e. OR (||)
*/
FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL);
// Row key regex filter
Filter rowKeyFilter = new RowFilter(CompareFilter.CompareOp.EQUAL, new RegexStringComparator("^zhangsan_1239"));
// Column value filter
SingleColumnValueFilter singleColumnValueFilter = new SingleColumnValueFilter(Bytes.toBytes("info2"), Bytes.toBytes("name"), CompareFilter.CompareOp.EQUAL, Bytes.toBytes("zhangsan9"));
// Add both filters to the filterList
filterList.addFilter(rowKeyFilter);
filterList.addFilter(singleColumnValueFilter);

Scan scan = new Scan();
scan.setFilter(filterList);

ResultScanner resultScanner = table.getScanner(scan);
printResult(resultScanner);
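The two Operator values behave like logical AND and OR over the individual filters. The sketch below models that with java.util.function.Predicate in plain Java. The class FilterListSketch, the Row type, and the accept helper are hypothetical names introduced for illustration, and the two predicates only imitate the row key regex filter and the column value filter used above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Models FilterList semantics: MUST_PASS_ALL is AND, MUST_PASS_ONE is OR.
class FilterListSketch {
    enum Operator { MUST_PASS_ALL, MUST_PASS_ONE }

    // Minimal row: a row key plus the value of one "name" column.
    static final class Row {
        final String key;
        final String name;
        Row(String key, String name) {
            this.key = key;
            this.name = name;
        }
    }

    static <T> Predicate<T> combine(Operator op, List<Predicate<T>> filters) {
        return row -> op == Operator.MUST_PASS_ALL
                ? filters.stream().allMatch(f -> f.test(row))
                : filters.stream().anyMatch(f -> f.test(row));
    }

    // Applies a row key regex filter and a column value filter under the given operator.
    static boolean accept(Operator op, String rowKey, String name) {
        Pattern keyPattern = Pattern.compile("^zhangsan_1239");
        Predicate<Row> rowKeyFilter = r -> keyPattern.matcher(r.key).find();
        Predicate<Row> nameFilter = r -> "zhangsan9".equals(r.name);
        return combine(op, Arrays.asList(rowKeyFilter, nameFilter)).test(new Row(rowKey, name));
    }

    public static void main(String[] args) {
        System.out.println(accept(Operator.MUST_PASS_ALL, "zhangsan_12390", "zhangsan9")); // true
        System.out.println(accept(Operator.MUST_PASS_ALL, "zhangsan_12390", "lisi"));      // false
        System.out.println(accept(Operator.MUST_PASS_ONE, "zhangsan_12390", "lisi"));      // true
        System.out.println(accept(Operator.MUST_PASS_ONE, "wangwu_1", "lisi"));            // false
    }
}
```

A row whose key matches but whose name does not is rejected under MUST_PASS_ALL and accepted under MUST_PASS_ONE, which is the difference to keep in mind when choosing the operator.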

Finally, the printResult method:

private void printResult(ResultScanner resultScanner) {
  for (Result result : resultScanner) {
      // Read from info2, the column family the sample data was written to
      byte[] address = result.getValue(Bytes.toBytes("info2"), Bytes.toBytes("address"));
      byte[] name = result.getValue(Bytes.toBytes("info2"), Bytes.toBytes("name"));
      byte[] sex = result.getValue(Bytes.toBytes("info2"), Bytes.toBytes("sex"));
      byte[] age = result.getValue(Bytes.toBytes("info2"), Bytes.toBytes("age"));
      byte[] rowKey = result.getRow();
      System.out.print(Bytes.toString(rowKey) + ",");
      System.out.print(Bytes.toString(name) + ",");
      System.out.print(Bytes.toString(sex) + ",");
      System.out.print(Bytes.toString(address) + ",");
      System.out.print((age == null ? null : Bytes.toInt(age)) + ",");
      System.out.println();
   }
}

3.5 Delete data
Data is deleted with the Delete object. We can delete a whole row, a column family within a row, or a specific column within a column family:

// Delete an entire row
Delete deleteRow = new Delete(Bytes.toBytes("zhangsan_1235"));

Delete deleteCol = new Delete(Bytes.toBytes("zhangsan_1235"));
// Delete a whole column family of this row
deleteCol.addFamily(Bytes.toBytes("info1"));
// Delete a single column (the latest version of that cell)
deleteCol.addColumn(Bytes.toBytes("info1"),Bytes.toBytes("name"));

table.delete(deleteCol);

table.delete(deleteRow);

// table.delete(List<Delete>); // pass a List to apply several Deletes in one call

3.6 Delete a table

// A table must be disabled before it can be deleted
TableName tableName = TableName.valueOf("user");
admin.disableTable(tableName);
admin.deleteTable(tableName);
