Big Data Development with Hive, Part 6: Hive Functions in Detail

Note:
Hive version 2.1.1

1. Overview of Hive Functions

1.1 Classification of Hive Functions

  1. Functions / UDF
    One row in, one row out
    Examples: upper/lower/length

  2. Aggregate functions / UDAF
    Many rows in, one row out
    Examples: sum/count/avg

  3. Table-generating functions / UDTF
    One row in, many rows out
    Example: explode

1.2 Listing Hive Functions

1.2.1 The show functions command

show functions lists every function the current Hive installation supports; this build ships 248 functions in total. On the whole they resemble MySQL's, but Hive also includes Oracle-only names such as to_date and decode (with usage that differs from Oracle's). This is a real convenience for developers moving from relational databases to big data platforms.

hive> SHOW FUNCTIONS;
OK
!
!=
$sum0
%
&
*
+
-
/
<
<=
<=>
<>
=
==
>
>=
^
abs
acos
add_months
and
array
array_contains
ascii
asin
assert_true
atan
avg
base64
between
bin
bround
case
cbrt
ceil
ceiling
chr
coalesce
collect_list
collect_set
compute_stats
concat
concat_ws
context_ngrams
conv
corr
cos
count
covar_pop
covar_samp
crc32
create_union
cume_dist
current_database
current_date
current_timestamp
current_user
date_add
date_format
date_sub
datediff
day
dayofmonth
dayofweek
decode
degrees
dense_rank
div
e
elt
encode
ewah_bitmap
ewah_bitmap_and
ewah_bitmap_empty
ewah_bitmap_or
exp
explode
factorial
field
find_in_set
first_value
floor
floor_day
floor_hour
floor_minute
floor_month
floor_quarter
floor_second
floor_week
floor_year
format_number
from_unixtime
from_utc_timestamp
get_json_object
get_splits
greatest
grouping
hash
hex
histogram_numeric
hour
if
in
in_file
index
initcap
inline
instr
internal_interval
isnotnull
isnull
java_method
json_tuple
lag
last_day
last_value
lcase
lead
least
length
levenshtein
like
ln
locate
log
log10
log2
logged_in_user
lower
lpad
ltrim
map
map_keys
map_values
mask
mask_first_n
mask_hash
mask_last_n
mask_show_first_n
mask_show_last_n
matchpath
max
md5
min
minute
month
months_between
named_struct
negative
next_day
ngrams
noop
noopstreaming
noopwithmap
noopwithmapstreaming
not
ntile
nvl
or
parse_url
parse_url_tuple
percent_rank
percentile
percentile_approx
pi
pmod
posexplode
positive
pow
power
printf
quarter
radians
rand
rank
reflect
reflect2
regexp
regexp_extract
regexp_replace
repeat
replace
reverse
rlike
round
row_number
rpad
rtrim
second
sentences
sha
sha1
sha2
shiftleft
shiftright
shiftrightunsigned
sign
sin
size
sort_array
soundex
space
split
sqrt
stack
std
stddev
stddev_pop
stddev_samp
str_to_map
struct
substr
substring
substring_index
sum
tan
to_date
to_unix_timestamp
to_utc_timestamp
translate
trim
trunc
ucase
unbase64
unhex
unix_timestamp
upper
uuid
var_pop
var_samp
variance
version
weekofyear
when
windowingtablefunction
xpath
xpath_boolean
xpath_double
xpath_float
xpath_int
xpath_long
xpath_number
xpath_short
xpath_string
year
|
~
Time taken: 1.246 seconds, Fetched: 248 row(s)
hive> 

1.2.2 The DESCRIBE command

Syntax:

DESCRIBE FUNCTION <function_name>;
DESCRIBE FUNCTION EXTENDED <function_name>;

Test run:

hive> 
    > 
    > desc function decode;
OK
decode(bin, str) - Decode the first argument using the second argument character set
Time taken: 0.023 seconds, Fetched: 1 row(s)
hive> desc function extended decode;
OK
decode(bin, str) - Decode the first argument using the second argument character set
Possible options for the character set are 'US-ASCII', 'ISO-8859-1',
'UTF-8', 'UTF-16BE', 'UTF-16LE', and 'UTF-16'. If either argument
is null, the result will also be null
Time taken: 0.027 seconds, Fetched: 4 row(s)
hive> 

2. Hive Numeric Functions

Common Hive numeric functions:

Function | Description | Return type
round(DOUBLE a), round(DOUBLE a, INT d) | rounds half up, optionally to d decimal places | DOUBLE
floor(DOUBLE a) | largest integer less than or equal to a | BIGINT
ceil(DOUBLE a), ceiling(DOUBLE a) | smallest integer greater than or equal to a | BIGINT
rand(), rand(INT seed) | random number between 0 and 1 | DOUBLE
pow(DOUBLE a, DOUBLE p), power(DOUBLE a, DOUBLE p) | a raised to the power p | DOUBLE
sqrt(DOUBLE a), sqrt(DECIMAL a) | square root of a | DOUBLE
bin(BIGINT a) | a in binary notation | STRING
abs(DOUBLE a) | absolute value of a | DOUBLE
greatest(T v1, T v2, ...) | largest of the arguments | T
least(T v1, T v2, ...) | smallest of the arguments | T

2.1 The round function

round rounds half up; the optional second argument gives the number of decimal places.

Test run:

hive> 
    > select round(5.1);
OK
5.0
Time taken: 0.09 seconds, Fetched: 1 row(s)
hive> select round(5.5);
OK
6.0
Time taken: 0.07 seconds, Fetched: 1 row(s)
hive> 
    > select round(5.5,1);
OK
5.5
Time taken: 0.086 seconds, Fetched: 1 row(s)
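Hive's round rounds half up, which differs from Python 3's built-in round (banker's rounding). As a cross-check, here is a minimal Python sketch of the half-up behavior using the decimal module (hive_round is an invented helper name):

```python
from decimal import Decimal, ROUND_HALF_UP

def hive_round(x, d=0):
    """Round half up to d decimal places, mirroring Hive's round()."""
    q = Decimal(1).scaleb(-d)  # quantum 10**-d, e.g. d=1 -> Decimal('0.1')
    return float(Decimal(str(x)).quantize(q, rounding=ROUND_HALF_UP))

print(hive_round(5.1))     # 5.0
print(hive_round(5.5))     # 6.0
print(hive_round(5.5, 1))  # 5.5
```

Going through Decimal(str(x)) avoids binary-float surprises such as round(2.675, 2) drifting downward.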

2.2 The floor function

floor rounds down to the nearest integer.

Test run:

hive> 
    > 
    > select floor(5.4);
OK
5
Time taken: 0.092 seconds, Fetched: 1 row(s)
hive> select floor(5.5);
OK
5
Time taken: 0.074 seconds, Fetched: 1 row(s)
hive> select floor(5.9);
OK
5
Time taken: 0.08 seconds, Fetched: 1 row(s)
hive> 

2.3 The ceil function

ceil rounds up to the nearest integer.

Test run:

hive> 
    > select ceil(5.4);
OK
6
Time taken: 0.094 seconds, Fetched: 1 row(s)
hive> select ceil(5.1);
OK
6
Time taken: 0.063 seconds, Fetched: 1 row(s)
hive> select ceil(5.9);
OK
6
Time taken: 0.08 seconds, Fetched: 1 row(s)

2.4 rand()

rand returns a random DOUBLE between 0 and 1.

Test run:

-- round(100*rand()) yields a random integer between 0 and 100
hive> 
    > select round(100*rand());
Query ID = root_20201208110850_c85b86e9-012e-449d-a1f5-0b158741315b
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1606698967173_0141, Tracking URL = http://hp1:8088/proxy/application_1606698967173_0141/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1606698967173_0141
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2020-12-08 11:08:57,146 Stage-1 map = 0%,  reduce = 0%
2020-12-08 11:09:03,404 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.04 sec
MapReduce Total cumulative CPU time: 3 seconds 40 msec
Ended Job = job_1606698967173_0141
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1   Cumulative CPU: 3.04 sec   HDFS Read: 5300 HDFS Write: 104 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 3 seconds 40 msec
OK
62.0
Time taken: 14.362 seconds, Fetched: 1 row(s)
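Note that round(100*rand()) produces integers from 0 through 100, with 0 and 100 half as likely as the other values. The usual pattern for a uniform integer in a range is floor(rand()*n)+1; a Python sketch of that pattern (rand_int is an invented helper, and the injectable rng parameter exists only to make the edge cases checkable):

```python
import math
import random

def rand_int(lo, hi, rng=random.random):
    """Uniform integer in [lo, hi] via the floor(rand()*n)+lo pattern."""
    return lo + math.floor(rng() * (hi - lo + 1))

samples = [rand_int(1, 100) for _ in range(10_000)]
print(min(samples), max(samples))
```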

2.5 The power function

power raises a number to the given exponent.

Test run:

-- 2 to the power of 3
hive> 
    > select power(2,3);
OK
8.0
Time taken: 0.087 seconds, Fetched: 1 row(s)

2.6 The sqrt function

sqrt returns the square root.

Test run:

hive> 
    > select sqrt(9);
OK
3.0
Time taken: 0.074 seconds, Fetched: 1 row(s)

2.7 bin

bin returns the binary representation of an integer, as a string.

Test run:

hive> 
    > 
    > select bin(10);
OK
1010
Time taken: 0.085 seconds, Fetched: 1 row(s)
hive> select bin(9);
OK
1001
Time taken: 0.065 seconds, Fetched: 1 row(s)
hive> 

2.8 The abs function

abs returns the absolute value.

Test run:

hive> 
    > select abs(10);
OK
10
Time taken: 0.09 seconds, Fetched: 1 row(s)
hive> select abs(-10);
OK
10
Time taken: 0.084 seconds, Fetched: 1 row(s)

2.9 The greatest function

greatest returns the largest of its arguments; if any argument is NULL, it returns NULL.

Test run:

hive> 
    > select greatest(1,2,3);
OK
3
Time taken: 0.094 seconds, Fetched: 1 row(s)
hive> select greatest(1,2,3,1.5);
OK
3.0
Time taken: 0.075 seconds, Fetched: 1 row(s)
hive> 
    > select greatest(1,2,3,null);
OK
NULL
Time taken: 0.084 seconds, Fetched: 1 row(s)

2.10 The least function

least returns the smallest of its arguments; if any argument is NULL, it returns NULL.

Test run:

hive> 
    > 
    > select least(1,2,3);
OK
1
Time taken: 0.094 seconds, Fetched: 1 row(s)
hive> select least(1,2,3,-1);
OK
-1
Time taken: 0.079 seconds, Fetched: 1 row(s)
hive> select least(1,2,3,null);
OK
NULL
Time taken: 0.059 seconds, Fetched: 1 row(s)
hive> 
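The NULL propagation shown above can be modeled in a few lines of Python, with None standing in for NULL (the function names simply mirror the Hive ones):

```python
def greatest(*args):
    """Largest argument; None if any argument is None (Hive 2.x semantics)."""
    if any(a is None for a in args):
        return None
    return max(args)

def least(*args):
    """Smallest argument; None if any argument is None."""
    if any(a is None for a in args):
        return None
    return min(args)

print(greatest(1, 2, 3))        # 3
print(greatest(1, 2, 3, None))  # None
print(least(1, 2, 3, -1))       # -1
```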

3. Hive Date and Time Functions

Common Hive date and time functions:

Function | Description | Return type
from_unixtime(bigint unixtime[, string format]) | converts a Unix timestamp to a date/time string; counterpart of unix_timestamp() | string
unix_timestamp(), unix_timestamp(string date), unix_timestamp(string date, string pattern) | converts a date/time to a Unix timestamp; counterpart of from_unixtime | bigint
to_date(string timestamp) | date part of a timestamp string | date
year(string date) | year | int
quarter(date/timestamp/string) | quarter | int
month(string date) | month | int
day(string date), dayofmonth(date) | day of month | int
hour(string date) | hour | int
minute(string date) | minute | int
second(string date) | second | int
weekofyear(string date) | week number within the year | int
extract(field FROM source) | extracts a part of a date/time | int
datediff(string enddate, string startdate) | number of days between two dates | int
date_add(date/timestamp/string startdate, tinyint/smallint/int days) | adds days to a date | date
date_sub(date/timestamp/string startdate, tinyint/smallint/int days) | subtracts days from a date | date
from_utc_timestamp({any primitive type} ts, string timezone) | converts a UTC timestamp to the given time zone | timestamp
to_utc_timestamp({any primitive type} ts, string timezone) | converts a timestamp in the given time zone to UTC | timestamp
current_date | current date | date
current_timestamp | current timestamp | timestamp
add_months(string start_date, int num_months, output_date_format) | adds months to a date | string
last_day(string date) | last day of the month | string
next_day(string start_date, string day_of_week) | first date later than start_date that falls on day_of_week | string
trunc(string date, string format) | date truncated to the unit given by format | string
months_between(date1, date2) | number of months between two dates | double
date_format(date/timestamp/string ts, string fmt) | formats a date with the given pattern | string

3.1 from_unixtime and unix_timestamp

from_unixtime converts a Unix timestamp (a number) to a date/time string; unix_timestamp converts a date/time to a Unix timestamp. The two are inverses of each other.

Test run:

hive> 
    > select UNIX_TIMESTAMP('2020-12-09 23:50:59');
OK
1607529059
Time taken: 0.101 seconds, Fetched: 1 row(s)
hive> 
    > select from_unixtime(1607529059);
OK
2020-12-09 23:50:59
Time taken: 0.084 seconds, Fetched: 1 row(s)
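The round trip above can be sketched in Python. One caveat: Hive interprets the string in the session time zone, while this sketch pins everything to UTC so it is machine-independent (the function names mirror the Hive ones):

```python
from datetime import datetime, timezone

def from_unixtime(epoch):
    """Format epoch seconds as 'yyyy-MM-dd HH:mm:ss' (pinned to UTC here)."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime('%Y-%m-%d %H:%M:%S')

def unix_timestamp(s):
    """Parse 'yyyy-MM-dd HH:mm:ss' (treated as UTC here) to epoch seconds."""
    dt = datetime.strptime(s, '%Y-%m-%d %H:%M:%S').replace(tzinfo=timezone.utc)
    return int(dt.timestamp())

ts = unix_timestamp('2020-12-09 23:50:59')
print(ts, from_unixtime(ts))  # round-trips back to the original string
```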

3.2 The to_date function

to_date(string timestamp) returns the date part of a timestamp string.

hive> 
    > 
    > select to_date('2020-12-09 23:59:59 123456789');
OK
2020-12-09
Time taken: 0.109 seconds, Fetched: 1 row(s)
hive> select to_date('2020-12-09 23:59:59');
OK
2020-12-09
Time taken: 0.086 seconds, Fetched: 1 row(s)

3.3 year/quarter/month/day/hour/minute/second

  1. year(string date) returns the year
  2. quarter(date/timestamp/string) returns the quarter
  3. month(string date) returns the month
  4. day(string date), dayofmonth(date) return the day of the month
  5. hour(string date) returns the hour
  6. minute(string date) returns the minute
  7. second(string date) returns the second

Test run:

hive> 
    > select year(dt) dt_year,
    >        quarter(dt) dt_quarter,
    >        month(dt) dt_month,
    >        day(dt) dt_day,
    >        hour(dt) dt_hour,
    >        minute(dt) dt_minute,
    >        second(dt) dt_second
    > from
    > (
    > select '2020-12-09 23:50:59' as dt
    > ) tmp;
OK
dt_year dt_quarter      dt_month        dt_day  dt_hour dt_minute       dt_second
2020    4       12      9       23      50      59
Time taken: 0.098 seconds, Fetched: 1 row(s)
hive> 

3.4 The weekofyear function

weekofyear(string date) returns the week number within the year.

Test run:
'2020-12-09' falls in week 50 of the year.

hive> 
    > select weekofyear('2020-12-09');
OK
_c0
50
Time taken: 0.073 seconds, Fetched: 1 row(s)

3.5 The extract function

extract(field FROM source) extracts a part of a date/time.
Code:

-- extract the year
SELECT EXTRACT(YEAR FROM '2020-12-09');
-- extract the month
SELECT EXTRACT(MONTH FROM '2020-12-09');
-- extract the day
SELECT EXTRACT(DAY FROM '2020-12-09');
-- extract the hour
SELECT EXTRACT(HOUR FROM '2020-12-09 13:02:01');
-- extract the minute
SELECT EXTRACT(MINUTE FROM '2020-12-09 13:02:01');
-- extract the second
SELECT EXTRACT(SECOND FROM '2020-12-09 13:02:01');

Test run:

hive> 
    > SELECT EXTRACT(YEAR FROM '2020-12-09');
OK
_c0
2020
Time taken: 1.85 seconds, Fetched: 1 row(s)
hive> SELECT EXTRACT(MONTH FROM '2020-12-09');
OK
_c0
12
Time taken: 0.227 seconds, Fetched: 1 row(s)
hive> SELECT EXTRACT(DAY FROM '2020-12-09');
OK
_c0
9
Time taken: 0.089 seconds, Fetched: 1 row(s)
hive> SELECT EXTRACT(HOUR FROM '2020-12-09 13:02:01');
OK
_c0
13
Time taken: 0.097 seconds, Fetched: 1 row(s)
hive> SELECT EXTRACT(MINUTE FROM '2020-12-09 13:02:01');
OK
_c0
2
Time taken: 0.081 seconds, Fetched: 1 row(s)
hive> SELECT EXTRACT(SECOND FROM '2020-12-09 13:02:01');
OK
_c0
1

3.6 The datediff function

datediff(string enddate, string startdate) returns the number of days between two dates.

Test run:

hive> 
    > 
    > select datediff('2020-12-09','2020-11-09');
OK
_c0
30
Time taken: 0.101 seconds, Fetched: 1 row(s)
hive> select datediff('2020-12-09 23:50:59','2020-11-09');
OK
_c0
30
Time taken: 0.113 seconds, Fetched: 1 row(s)

3.7 The date_add/date_sub functions

date_add(date/timestamp/string startdate, tinyint/smallint/int days) adds the given number of days.
date_sub(date/timestamp/string startdate, tinyint/smallint/int days) subtracts the given number of days.

Test run:

hive> 
    > 
    > SELECT DATE_ADD('2020-01-01', 31);
OK
_c0
2020-02-01
Time taken: 0.098 seconds, Fetched: 1 row(s)
hive> SELECT DATE_ADD('2020-01-01', -31);
OK
_c0
2019-12-01
Time taken: 0.075 seconds, Fetched: 1 row(s)
hive> 
    > SELECT DATE_SUB('2020-01-01', 31);
OK
_c0
2019-12-01
Time taken: 0.075 seconds, Fetched: 1 row(s)
hive> SELECT DATE_SUB('2020-01-01', -31);
OK
_c0
2020-02-01
Time taken: 0.071 seconds, Fetched: 1 row(s)
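As the transcript shows, date_add with a negative count behaves like date_sub and vice versa. The arithmetic is plain calendar-day addition, sketched in Python (function names mirror the Hive ones):

```python
from datetime import date, timedelta

def date_add(start, days):
    """Add days to a 'YYYY-MM-DD' string; a negative count subtracts."""
    d = date.fromisoformat(start) + timedelta(days=days)
    return d.isoformat()

def date_sub(start, days):
    return date_add(start, -days)

print(date_add('2020-01-01', 31))   # 2020-02-01
print(date_sub('2020-01-01', 31))   # 2019-12-01
```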

3.8 The from_utc_timestamp/to_utc_timestamp functions

from_utc_timestamp({any primitive type} ts, string timezone) converts a UTC timestamp to the given time zone.
to_utc_timestamp({any primitive type} ts, string timezone) converts a timestamp in the given time zone to UTC.

Test run:
CST here is US Central Standard Time (UTC-6).

hive> 
    > select from_utc_timestamp('2020-12-09 23:59:59','CST');
OK
_c0
2020-12-09 17:59:59
Time taken: 0.078 seconds, Fetched: 1 row(s)
hive> 
    > select to_utc_timestamp('2020-12-09 23:59:59','CST');
OK
_c0
2020-12-10 05:59:59
Time taken: 0.085 seconds, Fetched: 1 row(s)

3.9 current_date/current_timestamp

current_date returns the current date.
current_timestamp returns the current timestamp.

Test run:

hive> 
    > select current_date() cur_date,current_timestamp() cur_timestamp;
OK
cur_date        cur_timestamp
2020-12-09      2020-12-09 14:58:44.535
Time taken: 0.079 seconds, Fetched: 1 row(s)

3.10 The add_months function

add_months(string start_date, int num_months, output_date_format) adds the given number of months.

Test run:

hive> 
    > 
    > select add_months('2020-12-31 23:59:59',2,'YYYY-MM-dd HH:mm:ss');
OK
_c0
2021-02-28 23:59:59
Time taken: 0.097 seconds, Fetched: 1 row(s)

3.11 The last_day function

last_day(string date) returns the last day of the month containing the date.

Test run:

hive> 
    > select last_day('2020-12-09');
OK
_c0
2020-12-31
Time taken: 0.104 seconds, Fetched: 1 row(s)
hive> select last_day('2020-12-09 23:59:59');
OK
_c0
2020-12-31
Time taken: 0.077 seconds, Fetched: 1 row(s)

3.12 The next_day function

next_day(string start_date, string day_of_week) returns the first date later than start_date that falls on day_of_week.

Test run:
Find the next Tuesday.

hive> 
    > 
    > select next_day('2020-12-09','TU');
OK
_c0
2020-12-15
Time taken: 0.086 seconds, Fetched: 1 row(s)

3.13 The trunc function

trunc(string date, string format) returns the date truncated to the unit given by format.
Supported formats: MONTH/MON/MM, YEAR/YYYY/YY.

Test run:

hive> 
    > select trunc('2020-12-09 23:59:59','YYYY');
OK
_c0
2020-01-01
hive> 
    > select trunc('2020-12-09 23:59:59','MM');
OK
_c0
2020-12-01
Time taken: 0.063 seconds, Fetched: 1 row(s)

3.14 The months_between function

months_between(date1, date2) returns the number of months between two dates.

The result can have a fractional part; if date1 is earlier than date2, the result is negative.

Test run:

hive> 
    > 
    > select months_between('2020-12-09','2020-01-01');
OK
_c0
11.25806452
Time taken: 0.09 seconds, Fetched: 1 row(s)
hive> 
    > select months_between('2020-12-09','2021-01-01');
OK
_c0
-0.74193548
Time taken: 0.083 seconds, Fetched: 1 row(s)
hive> 
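The fractional part in these results comes from Hive computing the remainder as leftover days divided by 31, rounded to 8 decimal places. A simplified Python sketch of that rule (it ignores Hive's special case where both dates are the last day of their month):

```python
from datetime import date

def months_between(d1, d2):
    """Hive-style months_between: whole-month difference plus remainder days / 31."""
    a, b = date.fromisoformat(d1), date.fromisoformat(d2)
    whole_months = (a.year - b.year) * 12 + (a.month - b.month)
    return round(whole_months + (a.day - b.day) / 31.0, 8)

print(months_between('2020-12-09', '2020-01-01'))  # 11.25806452
print(months_between('2020-12-09', '2021-01-01'))  # -0.74193548
```

For '2020-12-09' vs '2020-01-01' this is 11 whole months plus 8/31 of a month, matching the transcript.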

3.15 The date_format function

date_format(date/timestamp/string ts, string fmt) formats a date with the given pattern.

Test run:

hive> SELECT date_format('2020-12-09 23:59:59', 'yyyy.MM.dd HH:mm:ss');
OK
_c0
2020.12.09 23:59:59
Time taken: 0.071 seconds, Fetched: 1 row(s)

4. Conditional Functions

Common Hive conditional functions:

Function | Description | Return type
nvl(a, b) | returns b when a is NULL, otherwise a | T
isnull(a) | true when a is NULL, otherwise false | boolean
isnotnull(a) | false when a is NULL, otherwise true | boolean
if(boolean testCondition, T valueTrue, T valueFalseOrNull) | returns valueTrue when testCondition is true, otherwise valueFalseOrNull | T
COALESCE(T v1, T v2, ...) | first non-NULL argument; NULL when all arguments are NULL | T
CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END | searched (conditional) CASE expression | T
CASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END | simple (value) CASE expression | T

4.1 The nvl function

nvl(a, b) returns b when a is NULL, otherwise a.

Test run:

hive> 
    > select nvl(null,123);
OK
_c0
123
hive> 
    > select nvl('345',123);
OK
_c0
345
Time taken: 0.067 seconds, Fetched: 1 row(s)

4.2 The isnull/isnotnull functions

isnull(a) returns true when a is NULL, otherwise false.
isnotnull(a) returns false when a is NULL, otherwise true.

Test run:

hive> 
    > select isnull(1);
OK
_c0
false
Time taken: 0.083 seconds, Fetched: 1 row(s)
hive> select isnull(null);
OK
_c0
true
Time taken: 0.072 seconds, Fetched: 1 row(s)
hive> select isnull('');
OK
_c0
false
Time taken: 0.076 seconds, Fetched: 1 row(s)
hive> 
    > select isnull(' ');
OK
_c0
false
Time taken: 0.08 seconds, Fetched: 1 row(s)
hive> 
    > 
    > select isnotnull(1);
OK
_c0
true
Time taken: 0.081 seconds, Fetched: 1 row(s)
hive> 
    > select isnotnull(null);
OK
_c0
false
Time taken: 0.065 seconds, Fetched: 1 row(s)
hive> 
    > select isnotnull('');
OK
_c0
true
Time taken: 0.071 seconds, Fetched: 1 row(s)
hive> 
    > select isnotnull(' ');
OK
_c0
true
Time taken: 0.086 seconds, Fetched: 1 row(s)
hive> 

4.3 The if function

if(boolean testCondition, T valueTrue, T valueFalseOrNull) returns valueTrue when testCondition is true, otherwise valueFalseOrNull.

Test run:

hive> 
    > select if(1>2,'abc','def');
OK
_c0
def
Time taken: 0.134 seconds, Fetched: 1 row(s)
hive> 
    > select if(1<2,'abc','def');
OK
_c0
abc
Time taken: 0.1 seconds, Fetched: 1 row(s)

4.4 The COALESCE function

COALESCE(T v1, T v2, ...) returns the first non-NULL argument, or NULL when all arguments are NULL.

Test run:

hive> 
    > select coalesce('a','b',null);
OK
_c0
a
Time taken: 0.081 seconds, Fetched: 1 row(s)
hive> 
    > select coalesce('null','b','c');
OK
_c0
null
Time taken: 0.073 seconds, Fetched: 1 row(s)
hive> 
    > select coalesce('null',null,null);
OK
_c0
null
Time taken: 0.073 seconds, Fetched: 1 row(s)
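Note that 'null' in the last two queries is a four-character string literal, not a real NULL, which is why it is returned as the "first non-NULL" value. The core logic fits in one line of Python, with None standing in for NULL:

```python
def coalesce(*args):
    """First non-None argument, or None if every argument is None."""
    return next((a for a in args if a is not None), None)

print(coalesce(None, 'b', 'c'))  # b
print(coalesce('null', None))    # the string 'null' is not a real NULL
print(coalesce(None, None))      # None
```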

4.5 CASE expressions

CASE WHEN a THEN b [WHEN c THEN d]* [ELSE e] END is the searched (conditional) form.
CASE a WHEN b THEN c [WHEN d THEN e]* [ELSE f] END is the simple (value) form.

Code:

-- searched CASE: map the English job title to a Chinese label
select e.empno,
     e.job,
     case when e.job = 'SALESMAN' then '銷售員'
          when e.job = 'MANAGER' then '管理人員'
          else '未知' end  as job_china
from emp e;

-- simple (value) CASE: the same mapping written against e.job directly
select e.empno,
     e.hiredate,
     case e.job when  'SALESMAN' then '銷售員'
                when  'MANAGER' then '管理人員'
                 else '未知' 
     end  as job_china
from emp e;

-- without an ELSE branch, an unmatched CASE returns NULL
SELECT CASE 'c'
    WHEN 'a' THEN 1 WHEN 'b' THEN 2 END is_c;

Test run:

hive> 
    > -- searched CASE: map the English job title to a Chinese label
    > select e.empno,
    >      e.job,
    >      case when e.job = 'SALESMAN' then '銷售員'
    >           when e.job = 'MANAGER' then '管理人員'
    >           else '未知' end  as job_china
    > from emp e;
Query ID = root_20201209163913_c6290550-dfa0-4f82-891e-24673ce9c704
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1606698967173_0142, Tracking URL = http://hp1:8088/proxy/application_1606698967173_0142/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1606698967173_0142
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 0
2020-12-09 16:39:21,669 Stage-1 map = 0%,  reduce = 0%
2020-12-09 16:39:27,895 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 6.68 sec
MapReduce Total cumulative CPU time: 6 seconds 680 msec
Ended Job = job_1606698967173_0142
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 2   Cumulative CPU: 6.68 sec   HDFS Read: 12384 HDFS Write: 764 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 6 seconds 680 msec
OK
e.empno e.job   job_china
7499    SALESMAN        銷售員
7876    CLERK   未知
7934    CLERK   未知
7521    SALESMAN        銷售員
7654    SALESMAN        銷售員
7698    MANAGER 管理人員
7788    ANALYST 未知
7369    CLERK   未知
7900    CLERK   未知
7902    ANALYST 未知
7566    MANAGER 管理人員
7782    MANAGER 管理人員
7839    PRESIDENT       未知
7844    SALESMAN        銷售員
Time taken: 15.553 seconds, Fetched: 14 row(s)
hive> 
    > -- simple (value) CASE: the same mapping written against e.job directly
    > select e.empno,
    >      e.hiredate,
    >      case e.job when  'SALESMAN' then '銷售員'
    >                 when  'MANAGER' then '管理人員'
    >                  else '未知' 
    >      end  as job_china
    > from emp e;
Query ID = root_20201209164150_7030d7f3-686c-4bb8-85e7-4f4e1db2dfef
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1606698967173_0143, Tracking URL = http://hp1:8088/proxy/application_1606698967173_0143/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1606698967173_0143
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 0
2020-12-09 16:41:57,812 Stage-1 map = 0%,  reduce = 0%
2020-12-09 16:42:05,052 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 6.23 sec
MapReduce Total cumulative CPU time: 6 seconds 230 msec
Ended Job = job_1606698967173_0143
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 2   Cumulative CPU: 6.23 sec   HDFS Read: 12252 HDFS Write: 808 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 6 seconds 230 msec
OK
e.empno e.hiredate      job_china
7499    1981-02-20      銷售員
7876    1987-06-13      未知
7934    1982-01-23      未知
7521    1981-02-22      銷售員
7654    1981-09-28      銷售員
7698    1981-05-01      管理人員
7788    1987-06-13      未知
7369    1980-12-17      未知
7900    1981-12-03      未知
7902    1981-12-03      未知
7566    1981-04-02      管理人員
7782    1981-06-09      管理人員
7839    1981-11-17      未知
7844    1981-09-08      銷售員
Time taken: 15.383 seconds, Fetched: 14 row(s)
hive> -- without an ELSE branch, an unmatched CASE returns NULL
hive> SELECT CASE 'c'
    >     WHEN 'a' THEN 1 WHEN 'b' THEN 2 END is_c;
OK
is_c
NULL
Time taken: 0.082 seconds, Fetched: 1 row(s)
hive> 

5. String Functions

Common Hive string functions:

Function | Description | Return type
ascii(string str) | ASCII value of the first character of str | int
chr(bigint or double A) | character for the given ASCII value | string
concat(string or binary A, string or binary B...) | concatenates strings | string
concat_ws(string SEP, string A, string B...) | concatenates strings with a separator | string
decode(binary bin, string charset) | decodes binary data using the given character set | string
elt(N int, str1 string, str2 string, str3 string, ...) | returns the N-th argument string | string
field(val T, val1 T, val2 T, val3 T, ...) | position of val among the following arguments | int
find_in_set(string str, string strList) | position of str in a comma-separated list | int
instr(string str, string substr) | position of the first occurrence of substr in str | int
length(string A) | length of the string | int
locate(string substr, string str[, int pos]) | position of the first occurrence of substr in str at or after position pos | int
lower(string A), lcase(string A) | converts to lowercase | string
lpad(string str, int len, string pad) | pads str on the left with pad to length len | string
ltrim(string A) | trims spaces from the left | string
repeat(string str, int n) | repeats str n times | string
replace(string A, string OLD, string NEW) | replaces occurrences of OLD with NEW | string
reverse(string A) | reverses the string | string
rpad(string str, int len, string pad) | pads str on the right with pad to length len | string
rtrim(string A) | trims spaces from the right | string
space(int n) | returns a string of n spaces | string
substr(string or binary A, int start, int len), substring(string or binary A, int start, int len) | extracts a substring | string
translate(string or char or varchar input, string or char or varchar from, string or char or varchar to) | character-by-character replacement, like PostgreSQL's translate | string
trim(string A) | trims spaces from both ends | string
upper(string A), ucase(string A) | converts to uppercase | string

5.1 ASCII functions

ascii(string str) returns the ASCII value of the first character of str.
chr(bigint or double A) returns the character for the given ASCII value.

The two functions are inverses of each other.

Test run:

hive> 
    > select ascii('a') ascii_a,ascii('A') ascii_a;
OK
ascii_a ascii_a
97      65
Time taken: 0.104 seconds, Fetched: 1 row(s)
hive> 
    > select chr(97) chr_97,chr(65) chr_65;
OK
chr_97  chr_65
a       A
Time taken: 0.075 seconds, Fetched: 1 row(s)

5.2 String concatenation functions

concat(string or binary A, string or binary B...) concatenates strings.
concat_ws(string SEP, string A, string B...) concatenates strings with a separator.

Test run:

hive> 
    > 
    > select concat('A','b','DEF');
OK
_c0
AbDEF
Time taken: 0.214 seconds, Fetched: 1 row(s)
hive> 
    > select concat_ws('-','A','b','DEF');
OK
_c0
A-b-DEF
Time taken: 0.063 seconds, Fetched: 1 row(s)
hive> 

5.3 Case-conversion functions

lower(string A), lcase(string A) convert a string to lowercase.
upper(string A), ucase(string A) convert a string to uppercase.

Test run:

hive> 
    > select lower('AbCdE'),upper('AbCdE');
OK
_c0     _c1
abcde   ABCDE
Time taken: 0.085 seconds, Fetched: 1 row(s)

5.4 String padding functions

lpad(string str, int len, string pad) pads str on the left.
rpad(string str, int len, string pad) pads str on the right.

Test run:

hive> 
    > SELECT LPAD('hi',4,'??');
OK
_c0
??hi
Time taken: 0.065 seconds, Fetched: 1 row(s)
hive> SELECT LPAD('hi',1,'??');
OK
_c0
h
Time taken: 0.072 seconds, Fetched: 1 row(s)
hive> SELECT RPAD('hi',4,'??');
OK
_c0
hi??
Time taken: 0.064 seconds, Fetched: 1 row(s)
hive> SELECT RPAD('hi',1,'??');
OK
_c0
h
Time taken: 0.08 seconds, Fetched: 1 row(s)
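The LPAD('hi',1,...) case shows a detail worth remembering: when len is smaller than the input length, Hive truncates rather than pads. Python's str.rjust/str.ljust never truncate, so a faithful sketch needs an extra branch (helper names mirror the Hive functions):

```python
def lpad(s, n, pad):
    """Left-pad s with pad (cycled) to length n; truncate to n if s is longer."""
    if n <= len(s):
        return s[:n]
    return (pad * n)[: n - len(s)] + s

def rpad(s, n, pad):
    """Right-pad s with pad (cycled) to length n; truncate to n if s is longer."""
    if n <= len(s):
        return s[:n]
    return s + (pad * n)[: n - len(s)]

print(lpad('hi', 4, '?'))  # ??hi
print(lpad('hi', 1, '?'))  # h
print(rpad('hi', 4, '?'))  # hi??
```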

5.5 Whitespace functions

LTRIM() removes spaces on the left.
TRIM() removes spaces on both ends.
RTRIM() removes spaces on the right.
SPACE() returns a string of the given number of spaces.

To remove spaces in the middle of a string, use replace.

Code:

-- trim spaces on the left, both ends, and the right
select ltrim(' abc def '),trim(' abc def '),rtrim(' abc def ');
-- return 10 spaces
select SPACE(10);

Test run:

hive> select ltrim(' abc def '),trim(' abc def '),rtrim(' abc def ');
OK
_c0     _c1     _c2
abc def         abc def  abc def
Time taken: 0.077 seconds, Fetched: 1 row(s)
hive> select SPACE(10);
OK
_c0
          
Time taken: 0.062 seconds, Fetched: 1 row(s)

5.6 String length function

length(string A) returns the length of the string.

Test run:

hive> 
    > select length('aaaaaabbbbcccc');
OK
_c0
14
Time taken: 0.081 seconds, Fetched: 1 row(s)

5.7 Substring functions

SUBSTR() extracts a substring.
SUBSTRING_INDEX() extracts a substring around a delimiter.
INSTR() returns the position of a substring.

Syntax:
substr(string or binary A, int start, int len)
substring(string or binary A, int start, int len)
INSTR(str, substr)
locate(string substr, string str[, int pos])

Code:

-- substring from position 5; with no len, extract through the end
select substr('abcdefghi',5);

-- substring from position 5, 2 characters long
select substr('abcdefghi',5,2);

-- substring from position -5 (counted from the end), 2 characters long
select substr('abcdefghi',-5,2);

-- everything to the left of the first '.'
select substring_index('www.mysql.com','.',1);

-- everything to the right of the last '.'
select substring_index('www.mysql.com','.',-1);

-- position of the first '.'
select instr('www.mysql.com','.');

-- position of the first '.' at or after position 6
select locate('.','hive.apache.org',6);

Test run:

hive> 
    > 
    > select substr('abcdefghi',5);
OK
_c0
efghi
Time taken: 0.091 seconds, Fetched: 1 row(s)
hive> select substr('abcdefghi',5,2);
OK
_c0
ef
Time taken: 0.066 seconds, Fetched: 1 row(s)
hive> select substr('abcdefghi',-5,2);
OK
_c0
ef
Time taken: 0.064 seconds, Fetched: 1 row(s)
hive> select substring_index('www.mysql.com','.',1);
OK
_c0
www
Time taken: 0.062 seconds, Fetched: 1 row(s)
hive> select substring_index('www.mysql.com','.',-1);
OK
_c0
com
Time taken: 0.062 seconds, Fetched: 1 row(s)
hive> select instr('www.mysql.com','.');
OK
_c0
4
Time taken: 0.062 seconds, Fetched: 1 row(s)
hive> select locate('.','hive.apache.org',6);
OK
_c0
12
Time taken: 0.067 seconds, Fetched: 1 row(s)
hive> 
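substr is 1-based, and a negative start counts from the end of the string. A Python sketch of those index rules (substr here is an invented helper; it skips Hive's handling of start = 0):

```python
def substr(s, start, length=None):
    """1-based substring; negative start counts from the end (Hive-style)."""
    if start < 0:
        start = len(s) + start + 1  # e.g. -5 on a 9-char string -> position 5
    i = start - 1
    return s[i:] if length is None else s[i:i + length]

print(substr('abcdefghi', 5))      # efghi
print(substr('abcdefghi', 5, 2))   # ef
print(substr('abcdefghi', -5, 2))  # ef
```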

5.8 String replacement functions

replace(string A, string OLD, string NEW) replaces occurrences of a substring.
translate(string or char or varchar input, string or char or varchar from, string or char or varchar to) works like PostgreSQL's translate.

replace treats OLD as a single whole substring to replace.
translate splits from into individual characters and replaces each one with the character at the same position in to.

translate is more flexible than replace:
translate('123abc','2dc','4e') maps 2->4 and d->e, and deletes c (it has no counterpart in to).
translate('123abc','2','4e') maps 2->4; the extra 'e' is ignored.

Test run:

hive> 
    > SELECT REPLACE('acdd','cd','ef');
OK
_c0
aefd
Time taken: 0.093 seconds, Fetched: 1 row(s)
hive> SELECT TRANSLATE('acdd','cd','ef');
OK
_c0
aeff
Time taken: 0.072 seconds, Fetched: 1 row(s)
hive> select translate('123abc','2dc','4e');
OK
_c0
143ab
Time taken: 0.073 seconds, Fetched: 1 row(s)
hive> select translate('123abc','2','4e');
OK
_c0
143abc
Time taken: 0.075 seconds, Fetched: 1 row(s)
hive> 
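Python's str.translate mirrors these semantics closely: characters of from map position-by-position to characters of to, and from characters with no counterpart are deleted. A sketch (hive_translate is an invented name):

```python
def hive_translate(s, frm, to):
    """Map each char of frm to the same-position char of to;
    frm chars beyond len(to) are deleted (Hive/Oracle translate semantics)."""
    table = {ord(f): (to[i] if i < len(to) else None) for i, f in enumerate(frm)}
    return s.translate(table)

print(hive_translate('acdd', 'cd', 'ef'))     # aeff
print(hive_translate('123abc', '2dc', '4e'))  # 143ab  (c is deleted)
print(hive_translate('123abc', '2', '4e'))    # 143abc (extra 'e' ignored)
```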

5.9 Other functions

decode(binary bin, string charset) decodes binary data using the given character set.
elt(N int, str1 string, str2 string, str3 string, ...) returns the N-th argument string.
field(val T, val1 T, val2 T, val3 T, ...) returns the position of val among the following arguments.
repeat(string str, int n) repeats str n times.
reverse(string A) reverses the string.

Developers coming from Oracle will find the name decode familiar, but check the Hive syntax first: the two are completely different. Hive's decode converts character sets.

Test run:

hive> 
    > select reflect('java.net.URLDecoder', 'decode','1' , "GBK");
Query ID = root_20201209172130_cafe84ad-3e66-4e4a-92a0-3868efd52957
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1606698967173_0145, Tracking URL = http://hp1:8088/proxy/application_1606698967173_0145/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1606698967173_0145
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2020-12-09 17:21:39,319 Stage-1 map = 0%,  reduce = 0%
2020-12-09 17:21:45,570 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.88 sec
MapReduce Total cumulative CPU time: 2 seconds 880 msec
Ended Job = job_1606698967173_0145
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1   Cumulative CPU: 2.88 sec   HDFS Read: 4726 HDFS Write: 101 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 880 msec
OK
_c0
1
Time taken: 15.797 seconds, Fetched: 1 row(s)
hive> 
    > select elt(3,'abc','aaa','def','test');
OK
_c0
def
Time taken: 0.084 seconds, Fetched: 1 row(s)
hive> 
    > select field('ab','abc','ba','ab','dab');
OK
_c0
3
hive> 
    > 
    > select repeat('abc',10);
OK
_c0
abcabcabcabcabcabcabcabcabcabc
Time taken: 0.085 seconds, Fetched: 1 row(s)
hive> 
    > select reverse('abc');
OK
_c0
cba
Time taken: 0.078 seconds, Fetched: 1 row(s)

6. UDTF

UDTFs are table-generating functions.

They differ considerably from anything in relational databases; the examples below show how they work.

Code:

-- explode (array)
select explode(array('A','B','C'));
select explode(array('A','B','C')) as col;
select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf;
select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf as col;
-- explode (map)
select explode(map('A',10,'B',20,'C',30));
select explode(map('A',10,'B',20,'C',30)) as (key,value);
select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf;
select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf as key,value;
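Conceptually, explode flattens an array into rows, and LATERAL VIEW joins each source row with the rows generated from its own array. The same shape in Python (table and column names are invented for the illustration):

```python
# Model of explode + LATERAL VIEW: each source row is paired with every
# element produced from its array column.
rows = [{'id': 1, 'tags': ['A', 'B', 'C']},
        {'id': 2, 'tags': ['X']}]

exploded = [{'id': r['id'], 'tag': t} for r in rows for t in r['tags']]
for row in exploded:
    print(row)
```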

Test run:

hive> 
    > 
    > 
    > select explode(array('A','B','C'));
Query ID = root_20201210161223_3527486d-76e5-4306-b3aa-8521f89774d9
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1606698967173_0150, Tracking URL = http://hp1:8088/proxy/application_1606698967173_0150/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1606698967173_0150
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2020-12-10 16:12:30,620 Stage-1 map = 0%,  reduce = 0%
2020-12-10 16:12:37,839 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.92 sec
MapReduce Total cumulative CPU time: 2 seconds 920 msec
Ended Job = job_1606698967173_0150
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1   Cumulative CPU: 2.92 sec   HDFS Read: 5007 HDFS Write: 129 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 920 msec
OK
col
A
B
C
Time taken: 15.401 seconds, Fetched: 3 row(s)
hive> select explode(array('A','B','C')) as col;
Query ID = root_20201210161249_acc85662-e686-4176-a5e8-032ed0992a45
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1606698967173_0151, Tracking URL = http://hp1:8088/proxy/application_1606698967173_0151/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1606698967173_0151
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2020-12-10 16:12:56,243 Stage-1 map = 0%,  reduce = 0%
2020-12-10 16:13:02,434 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.8 sec
MapReduce Total cumulative CPU time: 2 seconds 800 msec
Ended Job = job_1606698967173_0151
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1   Cumulative CPU: 2.8 sec   HDFS Read: 5007 HDFS Write: 129 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 800 msec
OK
col
A
B
C
Time taken: 14.35 seconds, Fetched: 3 row(s)
hive> select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf;
Query ID = root_20201210161305_5012fe10-4599-4ffe-838f-b867d212a10a
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1606698967173_0152, Tracking URL = http://hp1:8088/proxy/application_1606698967173_0152/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1606698967173_0152
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2020-12-10 16:13:14,529 Stage-1 map = 0%,  reduce = 0%
2020-12-10 16:13:20,716 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.86 sec
MapReduce Total cumulative CPU time: 2 seconds 860 msec
Ended Job = job_1606698967173_0152
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1   Cumulative CPU: 2.86 sec   HDFS Read: 5764 HDFS Write: 129 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 860 msec
OK
tf.col
A
B
C
Time taken: 16.663 seconds, Fetched: 3 row(s)
hive> select tf.* from (select 0) t lateral view explode(array('A','B','C')) tf as col;
Query ID = root_20201210161324_8f1bcaa5-931d-49c1-8dbd-5da936bfa67f
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1606698967173_0153, Tracking URL = http://hp1:8088/proxy/application_1606698967173_0153/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1606698967173_0153
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2020-12-10 16:13:32,968 Stage-1 map = 0%,  reduce = 0%
2020-12-10 16:13:40,179 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.99 sec
MapReduce Total cumulative CPU time: 2 seconds 990 msec
Ended Job = job_1606698967173_0153
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1   Cumulative CPU: 2.99 sec   HDFS Read: 5778 HDFS Write: 129 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 990 msec
OK
tf.col
A
B
C
Time taken: 17.102 seconds, Fetched: 3 row(s)
hive> select explode(map('A',10,'B',20,'C',30));
Query ID = root_20201210161354_1c522046-cac8-4182-9e28-9789a17c1a98
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1606698967173_0154, Tracking URL = http://hp1:8088/proxy/application_1606698967173_0154/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1606698967173_0154
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2020-12-10 16:14:00,978 Stage-1 map = 0%,  reduce = 0%
2020-12-10 16:14:08,199 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.96 sec
MapReduce Total cumulative CPU time: 2 seconds 960 msec
Ended Job = job_1606698967173_0154
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1   Cumulative CPU: 2.96 sec   HDFS Read: 5144 HDFS Write: 138 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 960 msec
OK
key     value
A       10
B       20
C       30
Time taken: 15.114 seconds, Fetched: 3 row(s)
hive> select explode(map('A',10,'B',20,'C',30)) as (key,value);
Query ID = root_20201210161411_59b615ac-2733-40c2-bec4-1a17f593140f
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1606698967173_0155, Tracking URL = http://hp1:8088/proxy/application_1606698967173_0155/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1606698967173_0155
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2020-12-10 16:14:19,606 Stage-1 map = 0%,  reduce = 0%
2020-12-10 16:14:25,805 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.95 sec
MapReduce Total cumulative CPU time: 2 seconds 950 msec
Ended Job = job_1606698967173_0155
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1   Cumulative CPU: 2.95 sec   HDFS Read: 5144 HDFS Write: 138 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 950 msec
OK
key     value
A       10
B       20
C       30
Time taken: 15.767 seconds, Fetched: 3 row(s)
hive> select tf.* from (select 0) t lateral view explode(map('A',10,'B',20,'C',30)) tf;
Query ID = root_20201210161428_5b45482b-39c0-4a6f-9e67-ec640e03d816
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1606698967173_0156, Tracking URL = http://hp1:8088/proxy/application_1606698967173_0156/
Kill Command = /opt/cloudera/parcels/CDH-6.3.1-1.cdh6.3.1.p0.1470567/lib/hadoop/bin/hadoop job  -kill job_1606698967173_0156
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2020-12-10 16:14:38,792 Stage-1 map = 0%,  reduce = 0%
2020-12-10 16:14:46,098 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.92 sec
MapReduce Total cumulative CPU time: 2 seconds 920 msec
Ended Job = job_1606698967173_0156
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1   Cumulative CPU: 2.92 sec   HDFS Read: 5987 HDFS Write: 138 HDFS EC Read: 0 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 920 msec
OK
tf.key  tf.value
A       10
B       20
C       30
Time taken: 18.456 seconds, Fetched: 3 row(s)
hive> 

References

1.https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-MathematicalFunctions

