clickhouse distributed query 分布式查詢

記一次使用clickhouse遇到的問題

聯查兩張表,graphite_d?為分布式表, user?為普通表

? ?select count(*) from default.graphite_d where member in( select member from user where date='2019-04-26')

報錯:

message: Received from XXX.XXX.XXX.XXX:9000. DB::Exception: Table default.user doesn't exist..?并且會有調用棧打印出來

查詢文檔發(fā)現,當?graphite_d?為分布式表時,將會把如下查詢發(fā)送到所有遠程服務器

?select count(*) from default.graphite where member in?( select member from user where date='2019-04-26')

但是?因為user?表是普通表,只存在于一臺機器上,所以其他的遠程服務器在本地是無法查詢到此表的

因此 user?需要成為分布式表

在每臺服務器上創(chuàng)建 user的分布式表 user_d

CREATE TABLEuser_d (member Int64)

ENGINE = Distributed(clusterName,defaultDB, user, rand())

運行sql?如下:

?select count(*) from default.graphite_d where member in( select member from user_d where date='2019-04-26')

報錯:

Received exception from server:

Code: 288. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception: Double-distributed IN/JOIN subqueries is denied (distributed_product_mode = 'deny'). You may rewrite query to use local tables in subqueries, or use GLOBAL keyword, or set distributed_product_mode to suitable value..

根據提示以及查詢clickhouse的官方文檔,解決方案如下

sql 語句改為:

select count(*) from default.graphite_d where member?global?in( select member from?user_d?where date='2019-04-26')

或者select count(*) from default.graphite_d where member? in( select member from?default.user?where date='2019-04-26')

參考文檔:https://clickhouse.yandex/docs/zh/query_language/select/#select-distributed-subqueries(中文)

https://clickhouse.yandex/docs/en/query_language/select/#select-distributed-subqueries(英文)

最后編輯于
?著作權歸作者所有,轉載或內容合作請聯系作者
【社區(qū)內容提示】社區(qū)部分內容疑似由AI輔助生成,瀏覽時請結合常識與多方信息審慎甄別。
平臺聲明:文章內容(如有圖片或視頻亦包括在內)由作者上傳并發(fā)布,文章內容僅代表作者本人觀點,簡書系信息發(fā)布平臺,僅提供信息存儲服務。

友情鏈接更多精彩內容