Hive數(shù)據(jù)倉庫--HiveQL視圖和索引

上篇文章中講完了查詢的相關(guān)操作,接下來就是視圖與索引了,Hive中的視圖的作用總的來說就是為了簡化查詢語句,是一個(gè)邏輯上的視圖,而不是物化的視圖。索引則是加快查詢速度的比較重要的手段,之前的Mysql優(yōu)化的文章中也講到了索引的使用,感覺概念上和Mysql數(shù)據(jù)庫中的操作基本上是相似的。

轉(zhuǎn)載請(qǐng)注明出處:Hive數(shù)據(jù)倉庫--HiveQL視圖和索引

視圖

創(chuàng)建這樣的一個(gè)視圖,高收入人群。

我試了下,這里的視圖并不會(huì)幫我們進(jìn)行存儲(chǔ)視圖所代表的查詢語句所包含的數(shù)據(jù)的,這里可以認(rèn)為他就是一個(gè)復(fù)雜的語句的簡化,是一個(gè)邏輯的視圖,而不是物化視圖,這里好像并沒有對(duì)效率進(jìn)行提升。視圖在這里是將Hive的行和列進(jìn)行的固化,但是并沒有對(duì)數(shù)據(jù)進(jìn)行固化,那么當(dāng)你刪除掉表中的列的時(shí)候,會(huì)造成視圖的錯(cuò)誤。

創(chuàng)建視圖語句

CREATE VIEW [IF NOT EXISTS] view_name [(column_name [COMMENT column_comment], ...) ][COMMENT view_comment][TBLPROPERTIES (property_name = property_value, ...)]AS SELECT ...

創(chuàng)建視圖

hive>    > create view salaries_high as    > select * from salaries_external where salary > 500000;OKTime taken: 1.227 secondshive> select * from salaries_high limit 10;Total jobs = 1Launching Job 1 out of 1Number of reduce tasks is set to 0 since there's no reduce operatorStarting Job = job_1475147088438_0007, Tracking URL = http://hadoopwy1:8088/proxy/application_1475147088438_0007/Kill Command = /usr/local/hadoop2/bin/hadoop job  -kill job_1475147088438_0007Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 02016-09-29 05:37:02,617 Stage-1 map = 0%,  reduce = 0%2016-09-29 05:37:10,092 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.46 secMapReduce Total cumulative CPU time: 1 seconds 460 msecEnded Job = job_1475147088438_0007MapReduce Jobs Launched:Job 0: Map: 1   Cumulative CPU: 1.46 sec   HDFS Read: 4422 HDFS Write: 310 SUCCESSTotal MapReduce CPU Time Spent: 1 seconds 460 msecOK1985    BAL     AL      murraed02       1472819.01985    BAL     AL      lynnfr01        1090000.01985    BAL     AL      ripkeca01       800000.01985    BAL     AL      lacyle01        725000.01985    BAL     AL      flanami01       641667.01985    BAL     AL      boddimi01       625000.01985    BAL     AL      stewasa01       581250.01985    BAL     AL      martide01       560000.01985    BAL     AL      roeniga01       558333.01985    BAL     AL      mcgresc01       547143.0Time taken: 26.702 seconds, Fetched: 10 row(s)

刪除視圖

hive> drop view if exists salaries_high;OKTime taken: 1.043 seconds

索引

創(chuàng)建索引語句

CREATE INDEX index_name ON TABLE base_table_name (col_name, ...)AS 'index.handler.class.name'[WITH DEFERRED REBUILD][IDXPROPERTIES (property_name=property_value, ...)][IN TABLE index_table_name][PARTITIONED BY (col_name, ...)][   [ ROW FORMAT ...] STORED AS ...   | STORED BY ...][LOCATION hdfs_path][TBLPROPERTIES (...)][COMMENT "index comment"]

創(chuàng)建一個(gè)索引

索引表的

hive> create index yearindex on table salaries_external(yearid) as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' with deferred rebuild in table salaries_external_index;OKTime taken: 0.475 seconds

僅僅索引的

hive> create index index_test on table salaries_external(yearid) AS  'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED REBUILD ;OKTime taken: 0.278 seconds

查看索引

hive>    > show index on salaries_external;OKyearindex               salaries_external       yearid                  salaries_external_index compactindex_test              salaries_external       yearid                  default__salaries_external_index_test__ compactTime taken: 0.077 seconds, Fetched: 2 row(s)

改變索引

hive> alter index index_test on salaries_external rebuild;Total jobs = 1Launching Job 1 out of 1Number of reduce tasks not specified. Estimated from input data size: 1In order to change the average load for a reducer (in bytes):  set hive.exec.reducers.bytes.per.reducer=<number>In order to limit the maximum number of reducers:  set hive.exec.reducers.max=<number>In order to set a constant number of reducers:  set mapreduce.job.reduces=<number>Starting Job = job_1475147088438_0009, Tracking URL = http://hadoopwy1:8088/proxy/application_1475147088438_0009/Kill Command = /usr/local/hadoop2/bin/hadoop job  -kill job_1475147088438_0009Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 12016-09-29 06:44:34,287 Stage-1 map = 0%,  reduce = 0%2016-09-29 06:45:02,611 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.2 sec2016-09-29 06:45:18,538 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 3.85 secMapReduce Total cumulative CPU time: 3 seconds 850 msecEnded Job = job_1475147088438_0009Loading data to table default.default__salaries_external_index_test__rmr: DEPRECATED: Please use 'rm -r' instead.Deleted hdfs://hadoopnodeservice1/user/hive/warehouse/default__salaries_external_index_test__Table default.default__salaries_external_index_test__ stats: [numFiles=1, numRows=58, totalSize=321107, rawDataSize=321049]MapReduce Jobs Launched:Job 0: Map: 1  Reduce: 1   Cumulative CPU: 3.85 sec   HDFS Read: 1354022 HDFS Write: 321214 SUCCESSTotal MapReduce CPU Time Spent: 3 seconds 850 msecOKTime taken: 58.187 seconds

刪除索引

hive>    > drop index index_test on salaries_external;OKTime taken: 0.188 secondshive> show index on salaries_external;OKyearindex               salaries_external       yearid                  salaries_external_index compactTime taken: 0.065 seconds, Fetched: 1 row(s)

參考:https://www.yiibai.com/hive/hive_views_and_indexes.html

轉(zhuǎn)載請(qǐng)注明出處:Hive數(shù)據(jù)倉庫--HiveQL視圖和索引

最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請(qǐng)聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時(shí)請(qǐng)結(jié)合常識(shí)與多方信息審慎甄別。
平臺(tái)聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點(diǎn),簡書系信息發(fā)布平臺(tái),僅提供信息存儲(chǔ)服務(wù)。

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容