HIVE: lateral view explode & json_turpe 實現(xiàn) json數(shù)組行轉(zhuǎn)列&字段拆分

問題描述

有時候因為業(yè)務(wù)的需要,有些字段不但是json格式,并且還是個json數(shù)組,比如下表 pay_infos:

pay_id infos
1111 [{"uid":123,"terminalFrom":0,"couponBatchId":1410115799,"cost":5},{"uid":123,"terminalFrom":0,"couponBatchId":1410116199,,"cost":7,}]
1112 [{"uid":124,"terminalFrom":1,"couponBatchId":1410115799,"cost":20}]

為了統(tǒng)計的需要,需要解析出每個json的具體key和value,比如上表,我需要求出,所有couponBatchId分類的cost的值。

解決辦法

  • explode

對于infos字段,首先應(yīng)該解決數(shù)組的行轉(zhuǎn)列問題,這個交給explode函數(shù)

select 
  explode(split(substring(infos,3,length(coupon_enties)-4),'\\},\\{'))  as entity
from pay_infos

通過explode一次處理infos按照單個json的字符串轉(zhuǎn)換成多行,如下

entity
"uid":123,"terminalFrom":0,"couponBatchId":1410115799,"cost":5
"uid":123,"terminalFrom":0,"couponBatchId":1410116199,,"cost":7
"uid":124,"terminalFrom":1,"couponBatchId":1410115799,"cost":20

如果需要保留其他字段信息,則使用lateral view

select pay_id,concat('{',entity,'}') as coupons
from
(       select * 
       from pay_infos
) a 
lateral view explode(split(substring(infos,3,length(infos)-4),'\\},\\{')) b as entity

結(jié)果如下:

pay_id coupons
1111 "uid":123,"terminalFrom":0,"couponBatchId":1410115799,"cost":5
1111 "uid":123,"terminalFrom":0,"couponBatchId":1410116199,,"cost":7
1112 "uid":124,"terminalFrom":1,"couponBatchId":1410115799,"cost":20
  • json_turpe

對于key:value格式的信息,可以通過拼成完成的json之后使用,json_turpe,也就是行轉(zhuǎn)列,比如上面輸出的表稱為source,栗子SQL如下:

select info.*
from 
(
  select concat('{',entity,'}') as coupons
  from source
  lateral view json_tuple(coupons,'couponBatchId','terminalFrom','cost','uid') info 
  as couponBatchId,terminalFrom,cost,uid
couponBatchId terminalFrom cost uid
1410115799 0 5 123
1410116199 0 7 123
1410115799 1 20 124

結(jié)果如下:

couponBatchId terminalFrom cost uid
1410115799 0 5 123
1410116199 0 7 123
1410115799 1 20 124

完整SQL

select info.*
from 
(
  (select 
  pay_id,concat('{',entity,'}') as coupons
  from
  (       
    select *  from  pay_infos
  ) a 
  lateral view explode(split(substring(infos,3,length(infos)-4),'\\},\\{')) b as entity) source
  lateral view json_tuple(coupons,'couponBatchId','terminalFrom','cost','uid') info 
  as couponBatchId,terminalFrom,cost,uid

說明

  • concat用于補出完整json
  • split時,分隔符需要轉(zhuǎn)義,如果sql寫在" "內(nèi)(如:shell腳本里調(diào)用hive -e "$sql"的情形),則需要4個' \ '轉(zhuǎn)義特殊字符,即:split(infos,'\\\\},\\\\{')
最后編輯于
?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者
【社區(qū)內(nèi)容提示】社區(qū)部分內(nèi)容疑似由AI輔助生成,瀏覽時請結(jié)合常識與多方信息審慎甄別。
平臺聲明:文章內(nèi)容(如有圖片或視頻亦包括在內(nèi))由作者上傳并發(fā)布,文章內(nèi)容僅代表作者本人觀點,簡書系信息發(fā)布平臺,僅提供信息存儲服務(wù)。

相關(guān)閱讀更多精彩內(nèi)容

友情鏈接更多精彩內(nèi)容