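The previous post, "Installing Hive on macOS", covered installing hive on a Mac. After it went up, a reader asked: hive queries are too slow, can they get anywhere close to real-time retrieval? As you know, hive offers HQL, a convenient SQL-like language, but underneath it still runs offline computation through MapReduce (see the figure below). There are tools in the industry such as impala, but they still cannot match a relational database. In that situation you can integrate Elasticsearch (ES) with hive to meet the requirement. I hope this post is helpful.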

Installing elasticsearch
1. brew install elasticsearch
2. Start elasticsearch by running: elasticsearch
3. Visit 127.0.0.1:9200
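
If you would rather check step 3 from the terminal than from a browser, something like the following should work. It is only a sketch: the `ES_URL` variable and the `es_is_up` helper are names made up for this example, and it assumes `curl` is installed.

```shell
# Address taken from step 3 above; override ES_URL if your node listens elsewhere.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Returns 0 only if the URL answers with a successful HTTP status within 5 seconds.
# (es_is_up is an illustrative helper, not part of elasticsearch.)
es_is_up() {
  curl -s -f --max-time 5 -o /dev/null "$1" 2>/dev/null
}

if es_is_up "$ES_URL"; then
  # A running node replies with a small JSON document (cluster name, version, ...).
  curl -s "$ES_URL"
else
  echo "no response from $ES_URL; is elasticsearch running?"
fi
```

If the second branch fires, the usual suspects are that elasticsearch has not finished starting yet or that it is bound to a different port.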

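Integrating hive with ES:
1. Download the elasticsearch-hadoop jar that matches your ES version from http://jcenter.bintray.com/org/elasticsearch/elasticsearch-hadoop/
2. Start hive with the jar on its auxiliary jar path:
hive -hiveconf hive.aux.jars.path=file:///usr/local/Cellar/hive/2.1.1/libexec/lib/elasticsearch-hadoop-5.5.2.jar
Or add the following to hive-site.xml:
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///usr/local/Cellar/hive/2.1.1/libexec/lib/elasticsearch-hadoop-5.5.2.jar</value>
</property>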
3. Create the table (stored in ES through the storage handler)
CREATE EXTERNAL TABLE employee (id INT, name STRING) STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler' TBLPROPERTIES('es.resource' = 'employees/list','es.index.auto.create' = 'true','es.nodes' = 'localhost','es.port' = '9200','es.mapping.id' = 'id','es.write.operation'='upsert');
4. Create the source table
CREATE TABLE employee_source (id INT, name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
5. Load the data
LOAD DATA LOCAL INPATH '/Users/yinxiaokai/Documents/employee_source.log' OVERWRITE INTO TABLE employee_source;
Contents of employee_source.log:
1,jim
2,kate
3,tom
4,mike
6. Insert the source table's data into the test table
INSERT OVERWRITE TABLE employee SELECT s.id, s.name FROM employee_source s;
7. Verify:
http://localhost:9200/employees/list/_search
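
The same check can be narrowed to a single record from the command line. The sketch below uses the URI search form (`?q=field:value`) against the `employees/list` resource from step 3; the `ES_BASE` variable and the `search_url` helper are names invented for this example, and `curl` is assumed to be available.

```shell
# Base address matches es.nodes / es.port from the CREATE EXTERNAL TABLE statement.
ES_BASE="${ES_BASE:-http://localhost:9200}"

# Build a URI-search URL for one field:value match.
# (search_url is an illustrative helper; index and type come from es.resource.)
search_url() {
  printf '%s/employees/list/_search?q=%s:%s&pretty' "$ES_BASE" "$1" "$2"
}

# If step 6 succeeded, the hits should include the document whose name is jim.
curl -s --max-time 5 "$(search_url name jim)" || echo "could not reach $ES_BASE"
```

Swapping `name jim` for `id 3` should look up tom in the same way.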
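Additional note:
The raw ES output is not very convenient to read, so I recommend installing the head plugin.
1. git clone git://github.com/mobz/elasticsearch-head.git
2. cd elasticsearch-head
3. brew install node
4. npm -g install grunt
5. Edit the ES yml configuration file and add: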
http.cors.enabled: true
http.cors.allow-origin: "*"
6. grunt server
7. Visit http://localhost:9100
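
If the head page loads but cannot connect to the cluster, the CORS settings from step 5 are the first thing to check. A rough way to do that, assuming ES echoes an `Access-Control-Allow-Origin` header when a request carries an `Origin` header, is sketched below; `ES_BASE`, `HEAD_ORIGIN` and `cors_header` are names made up for this example.

```shell
ES_BASE="${ES_BASE:-http://localhost:9200}"
HEAD_ORIGIN="${HEAD_ORIGIN:-http://localhost:9100}"

# Read raw HTTP response headers on stdin and print the value of
# Access-Control-Allow-Origin, if present (illustrative helper).
cors_header() {
  tr -d '\r' | awk -F': ' 'tolower($1) == "access-control-allow-origin" { print $2 }'
}

# Send a request that claims to come from the head page and keep only the headers.
allowed="$(curl -s --max-time 5 -D - -o /dev/null -H "Origin: $HEAD_ORIGIN" "$ES_BASE" 2>/dev/null | cors_header)" || allowed=""

if [ -n "$allowed" ]; then
  echo "CORS allowed for: $allowed"
else
  echo "no Access-Control-Allow-Origin header; check the yml settings and restart elasticsearch"
fi
```

Remember that elasticsearch only picks up changes to its yml file after a restart.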
OK, that completes the integration of hive and ES.