DDL

1. Databases

Create a database:

> create database if not exists db_name;

Alternatively:

> create database if not exists db_name location 'hdfs_path';

This sets the database's path on HDFS.

List databases:

> show databases;

Show database information:

> desc database db_name;

For more detail:

> desc database extended db_name;

Alter a database (the database name and its directory location cannot be changed):

> alter database db_name set ...

Drop a database (planning to delete it and run?):
Empty database:

> drop database if exists db_name;

Non-empty database:

> drop database db_name cascade;
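
To make the database statements above concrete, here is a minimal sketch of one full cycle. The database name `demo_db`, its HDFS path, and the `createtime` property are hypothetical values chosen for illustration; `set dbproperties` is shown as one thing `alter database ... set` can change, not as the only option.

```sql
-- `demo_db` and its HDFS path are hypothetical names for illustration.
create database if not exists demo_db
comment 'scratch database for the examples'
location '/user/hive/demo_db.db';

-- Attach a key/value property to the database.
alter database demo_db set dbproperties ('createtime'='20191028');

-- The extended form also prints the properties set above.
desc database extended demo_db;

-- Clean up; cascade also drops any tables still inside.
drop database if exists demo_db cascade;
```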

2. Tables

Create a table:

> create [external] table [if not exists] table_name [(col_name data_type [comment col_comment],...)]
[comment table_comment]
[partitioned by (col_name data_type [comment col_comment],...)]
[clustered by (col_name,col_name...)
[sorted by (col_name [asc|desc],...)] into num_buckets buckets]
[row format row_format]
[stored as file_format]
[location hdfs_path]

The statement is very long. Everything in square brackets is optional, but in most cases the tables we create are partitioned tables.
1. create table: the basis of creating a table.
2. external: this keyword creates an external table; when creating it you also specify a path (location) that points to the actual data.
For an internal table, Hive moves the data into the path the data warehouse points to;
for an external table, Hive only records where the data is and does not change the data's location in any way.
The difference shows when the table is dropped. For an internal table both the metadata and the data are deleted; for an external table only the metadata is deleted and the data is kept.
3. comment: adds a comment (on the table or on a column)
4. partitioned by: creates a partitioned table
5. clustered by: creates a bucketed table
6. sorted by: sorting, almost never used
7. row format: how the data is split into fields
8. stored as: the file storage format
9. location: the table's location on HDFS
10. like: lets the user copy the structure of an existing table without copying its data

When we use a plain create table, Hive basically fills in default values for the other parameters.
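
As an illustration of how the optional clauses combine, the sketch below creates an external, partitioned, tab-delimited table and then copies its structure with like. The table names `log_demo` and `log_demo_like`, the columns, and the location path are all hypothetical.

```sql
-- Hypothetical external table over tab-separated log files.
create external table if not exists log_demo(
  id int comment 'record id',
  msg string comment 'log message'
)
comment 'external log table'
partitioned by (dt string)
row format delimited fields terminated by '\t'
stored as textfile
location '/user/hive/external/log_demo';

-- like copies only the structure of an existing table, not its data.
create table if not exists log_demo_like like log_demo;

-- Dropping the external table removes metadata only; any files under
-- the location above stay on HDFS.
drop table if exists log_demo;
```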

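Managed tables

Tables are created as managed tables by default, sometimes also called internal tables. For this kind of table Hive controls the lifecycle of the data: when we drop the table, Hive also deletes the data in it.
External tables are somewhat safer, since the data is kept separately. The data remains after the table is dropped.

Check a table's type

> desc formatted table_name;

In the output you will find Table Type, which is the table's type. It can be changed.

Change an internal table to an external table

> alter table table_name set tblproperties ('EXTERNAL'='TRUE');

Change an external table to an internal table

> alter table table_name set tblproperties ('EXTERNAL'='FALSE');

MANAGED_TABLE is an internal table, EXTERNAL_TABLE is an external table.
Note!!! ('EXTERNAL'='TRUE') and ('EXTERNAL'='FALSE') are case-sensitive!!!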

3. Partitions

Why partition: each partition of a partitioned table is really just a separate directory on HDFS, and that directory holds all the data files of the partition. Partitioning in Hive simply means splitting into directories: a large dataset is divided into smaller ones according to business needs. At query time, the expression in the where clause selects only the partitions the query needs, which improves query efficiency (predicate pushdown).

3.1 Introducing partitioned tables (managing logs by date)

/user/hive/warehouse/log_partition/20191028/20191028.log
/user/hive/warehouse/log_partition/20191029/20191029.log
/user/hive/warehouse/log_partition/20191030/20191030.log

3.2 Create a partitioned table

> create table dept_partition(deptno int,dname string,loc string)
partitioned by (month string)
row format delimited fields terminated by '\t';

3.3 Load data into a partitioned table

> load data local inpath '/mnt/hgfs/shareOS/dept.txt' into table dept_partition partition(month='2019-06');

> load data local inpath '/mnt/hgfs/shareOS/dept.txt' into table dept_partition partition(month='2019-07');

You can see that two files were generated, one under each partition directory. A select returns the combined data of both files. Hive also added a month column for us automatically, and it can be used as a query condition. This condition distinguishes the directories and is checked first when querying.
Look at the metadata in MySQL: the PARTITIONS table has two new rows. They record our partitions.

3.4 Query data in a partitioned table

Single-partition query:

> select * from dept_partition where month = '2019-06';

Multi-partition union query:

> select * from dept_partition where month = '2019-06'
union
select * from dept_partition where month = '2019-07'
union
select * from dept_partition where month = '2019-08';
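
Several partitions can also be selected with a single filter on the partition column instead of chaining union. The sketch below assumes the `dept_partition` table from above. One difference to keep in mind: union removes duplicate rows, while the in filter returns every row as stored.

```sql
-- One statement over three partitions; the filter is on the partition
-- column month, so only the matching partition directories are read.
select * from dept_partition
where month in ('2019-06', '2019-07', '2019-08');

-- Row counts per partition, handy for checking which partitions hold data.
select month, count(*) as row_cnt
from dept_partition
where month in ('2019-06', '2019-07', '2019-08')
group by month;
```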

3.5 Add partitions

Add a single partition:

> alter table dept_partition add partition(month='2019-09');

Add multiple partitions:

> alter table dept_partition add partition(month='2019-10') partition(month='2019-11');

3.6 Drop partitions

> alter table dept_partition drop partition (month='2019-10');

Drop multiple partitions:

> alter table dept_partition drop partition (month='2019-10'),partition (month='2019-11');

3.7 See how many partitions a partitioned table has

> show partitions dept_partition;

3.8 View the structure of a partitioned table

> desc formatted dept_partition;

The output has an extra section with the partition column. Just use the partition column like an ordinary column.

4. Notes on partitioned tables

4.1 Two-level partitions

> create table dept_partition2(deptno int,dname string,loc string)
partitioned by (month string,day string)
row format delimited fields terminated by '\t';

4.2 When loading, both partition values must be given.

> load data local inpath '/mnt/hgfs/shareOS/dept.txt' into table dept_partition2 partition(month='2019-06',day='10');

4.3 Three ways to associate the table with data that was uploaded directly into a partition directory.

(1) Repair after uploading the data

> msck repair table dept_partition;

(2) Add the partition after uploading the data

> alter table dept_partition add partition(month='2019-11');

(3) Create the directory, then load the data into the partition.
load both registers the partition and uploads the data, so the problem is solved in one step.
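
The first two approaches can be walked through end to end as sketched below. This assumes the commands run in the Hive CLI (which accepts dfs commands), that `dept_partition` lives in the default database under the default warehouse directory /user/hive/warehouse, and that month=2019-12 is a partition that does not exist yet; all three are assumptions for illustration.

```sql
-- Upload a file straight into a new partition directory, bypassing Hive.
dfs -mkdir -p /user/hive/warehouse/dept_partition/month=2019-12;
dfs -put /mnt/hgfs/shareOS/dept.txt /user/hive/warehouse/dept_partition/month=2019-12;

-- The file is on HDFS, but the metastore does not know the partition yet,
-- so this query returns no rows.
select * from dept_partition where month = '2019-12';

-- Option (1): let Hive scan the table directory and register missing partitions.
msck repair table dept_partition;

-- Option (2), as an alternative: register just this one partition explicitly.
-- alter table dept_partition add partition(month='2019-12');

-- Now the partition is listed and the data is visible.
show partitions dept_partition;
select * from dept_partition where month = '2019-12';
```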

5. Alter a table

5.1 Rename a table

> alter table table_name rename to new_table_name;

5.3 Add, change, and replace columns

Change a column:

> alter table table_name change [column] col_old_name col_new_name
column_type [comment col_comment] [first|after column_name]

Add and replace columns:

> alter table table_name add|replace columns
(col_name data_type [comment col_comment],...)

add appends a column; replace replaces all columns of the table.
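
A short sketch of the three column operations in sequence follows. The table `dept_demo` and the column names `deptdesc` and `dept_note` are hypothetical and exist only for this illustration. These statements change the table's metadata; they do not rewrite the data files already stored.

```sql
-- Hypothetical scratch table used only for this illustration.
create table if not exists dept_demo(deptno int, dname string, loc string);

-- add appends a new column after the existing ones.
alter table dept_demo add columns (deptdesc string comment 'department description');

-- change renames a column and can also change its type or comment.
alter table dept_demo change column deptdesc dept_note string comment 'free-form note';

-- replace swaps the whole column list; here it removes dept_note again.
alter table dept_demo replace columns (deptno int, dname string, loc string);

-- Check the result.
desc dept_demo;
```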

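Next, some DML

1. Data import

1.1.1 Load data into a table (Load)

Syntax:

> load data [local] inpath '/opt/soft/file.txt' [overwrite] into
table table_name [partition(partcol1=val1,...)];

load data: load data
local: load from the local file system into the Hive table; otherwise load from HDFS into Hive
inpath: the path of the data to load
overwrite: overwrite the data already in the table; otherwise append
into table: which table to load into
partition: load into the specified partition

1.1.2 Insert data via a query statement (insert)

> insert into table table_name partition(month='2019-06') values (1,'zhangsan');

into can also be replaced with overwrite. This is the basic insert mode.
There is also a multi-insert mode and so on... but they are not used often.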

1.1.3 Create a table and load data in a query statement (As Select)

1. Create a partitioned table

For example:

> create table if not exists student(id int,name string) partitioned by
(month string) row format delimited fields terminated by '\t';

2. Basic insert

For example:

> insert into table student partition(month='2019-07') values(1,'wangwu');

3. Basic insert (overwrite)
For example:

> insert overwrite table student partition(month='2019-07') values(1,'wangwu');
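
The examples above insert literal values. As a complement, the sketch below shows inserting from a query, the multi-insert form, and create-table-as-select, all against the `student` table. The target partitions 2019-08 through 2019-10 and the table name `student_copy` are hypothetical.

```sql
-- Insert from a query: copy the 2019-07 rows into another partition.
insert overwrite table student partition(month='2019-08')
select id, name from student where month = '2019-07';

-- Multi-insert: scan the source once and write several partitions.
from student
insert overwrite table student partition(month='2019-09')
  select id, name where month = '2019-07'
insert overwrite table student partition(month='2019-10')
  select id, name where month = '2019-07';

-- Create-table-as-select: the new table's columns come from the query.
-- Here month becomes an ordinary column of student_copy.
create table if not exists student_copy
as select id, name, month from student;
```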

2. Data export

2.1 Insert export

1. Export query results to the local file system

> insert overwrite local directory '/opt/module/datas/export/student'
select * from student;

Sigh, my virtual machine has so little memory that the system is painfully slow! For a while it had me thinking the cluster was set up wrong!!!

2. Export formatted results to the local file system

> insert overwrite local directory '/opt/module/datas/export/student'
row format delimited fields terminated by '\t'
select * from student;

3. Export to HDFS

> insert overwrite directory '/dept'
row format delimited fields terminated by '\t'
select * from student;

2.2 Hadoop export
2.3 Export to HDFS
2.4 Sqoop export
Moving data back and forth between MySQL and Hive (HDFS)...
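
Sections 2.2 and 2.3 are only named above. As a rough sketch of what they usually refer to, the first command below copies a table file out of HDFS with a Hadoop dfs command issued from the Hive CLI, and the second uses Hive's export statement. The data file name 000000_0, the local target file, and the export directory are assumptions; check the actual file names under the table's directory before running anything like this.

```sql
-- 2.2 Hadoop export: copy a data file from the table's HDFS directory
-- to the local file system (file name and local target are assumed).
dfs -get /user/hive/warehouse/student/month=2019-07/000000_0 /opt/module/datas/export/student_2019-07.txt;

-- 2.3 Export: write the table's data plus its metadata to an HDFS directory
-- (the target directory is hypothetical and should not already hold data).
export table default.student to '/user/hive/warehouse/export/student';
```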
