記一次使用clickhouse遇到的問題
聯查兩張表,graphite_d?為分布式表, user?為普通表
? ?select count(*) from default.graphite_d where member in( select member from user where date='2019-04-26')
報錯:
message: Received from XXX.XXX.XXX.XXX:9000. DB::Exception: Table default.user doesn't exist..?并且會有調用棧打印出來
查詢文檔發(fā)現,當?graphite_d?為分布式表時,將會把如下查詢發(fā)送到所有遠程服務器
?select count(*) from default.graphite where member in?( select member from user where date='2019-04-26')
但是?因為user?表是普通表,只存在于一臺機器上,所以其他的遠程服務器在本地是無法查詢到此表的
因此 user?需要成為分布式表
在每臺服務器上創(chuàng)建 user的分布式表 user_d
CREATE TABLEuser_d (member Int64)
ENGINE = Distributed(clusterName,defaultDB, user, rand())
運行sql?如下:
?select count(*) from default.graphite_d where member in( select member from user_d where date='2019-04-26')
報錯:
Received exception from server:
Code: 288. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception: Double-distributed IN/JOIN subqueries is denied (distributed_product_mode = 'deny'). You may rewrite query to use local tables in subqueries, or use GLOBAL keyword, or set distributed_product_mode to suitable value..
根據提示以及查詢clickhouse的官方文檔,解決方案如下
sql 語句改為:
select count(*) from default.graphite_d where member?global?in( select member from?user_d?where date='2019-04-26')
或者select count(*) from default.graphite_d where member? in( select member from?default.user?where date='2019-04-26')
參考文檔:https://clickhouse.yandex/docs/zh/query_language/select/#select-distributed-subqueries(中文)
https://clickhouse.yandex/docs/en/query_language/select/#select-distributed-subqueries(英文)