BOSS:那個,白茶啊,這個報表刷新有點慢啊,你看,每次我點篩選或者刷新都會讓我等很久。
白茶:(認(rèn)真臉)BOSS,那您再等會就好!
......
BOSS:白茶!!兩小時了!!還沒出來?。?br> 白茶:(思考)老板,這個有點難啊,這個問題技術(shù)要求比較高。
BOSS:加錢?。。?!
白茶:好嘞!
一張好的報表是如何界定的?DAX計算無誤、前端展現(xiàn)明了、業(yè)務(wù)思路清晰、報表響應(yīng)速度,白茶覺得這些因素就可以界定一張好的報表。
本期我們來聊一聊PowerBI中DAX函數(shù)性能優(yōu)化的問題。
畢竟一張可視化報表需要15分鐘刷新才能呈現(xiàn)出來,這對用戶來說太不友好了。
先來看看本期的示例文件:
一張產(chǎn)品維度表,一張銷售明細(xì)表。
需求是什么?
這張是銷售明細(xì)表中的分店維度信息,為了便于小伙伴理解,白茶單獨整理出來。
這張表是需求的計算邏輯圖。什么意思呢?就是當(dāng)Key小于15時,計算每個Key對應(yīng)的分店,當(dāng)Key大于14時,根據(jù)計算邏輯對不同的分店進(jìn)行匯總計算。
編寫基礎(chǔ)的DAX計算代碼:
SalesAmount =
SUMX ( 'Fact_SalesDetail', [Quantity] * RELATED ( Dim_Product[SalesAmount] ) )
在不考慮性能的情況下,DAX計算邏輯如下:
SalesAmountByDisplay =
SUMX (
'Dim_DisplayDepartment',
SWITCH (
TRUE (),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 1,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 1 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 2,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 2 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 3,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 3 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 4,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 4 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 5,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 5 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 6,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 6 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 7,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 7 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 8,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 8 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 9,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 9 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 10,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 10 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 11,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 11 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 12,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 12 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 13,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 13 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 14,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 14 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 15,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 13 )
)
+ CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 14 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 16,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 1 )
)
+ CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 7 )
)
+ CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 9 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 17,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 1 )
)
+ CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 2 )
)
+ CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 3 )
)
+ CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 11 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 18,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 9 )
)
+ CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 11 )
)
+ CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 12 )
)
- CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 2 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 19,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 4 )
)
+ CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 7 )
),
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] ) = 20, CALCULATE ( [SalesAmount], ALLSELECTED ( Fact_SalesDetail ) )
)
)
相信不用白茶多說,小伙伴也能看出來代碼的問題,太長了。
這段代碼功能基本上是實現(xiàn)了,問題點有哪些呢?
1.SELECTEDVALUE復(fù)用度較高,可以使用變量代替
2.多個條件匯總迭代次數(shù)較多,可以使用提供List
這也是小伙伴常見的問題,如果DAX的構(gòu)建可以繞開這兩個問題,那么性能會有很大的提升。
簡化版寫法:
SalesAmountByDisplay2 =
VAR CurrentDepartmentKey =
SELECTEDVALUE ( Dim_DisplayDepartment[DepartmentKey] )
VAR CurrentDepartment =
TREATAS (
VALUES ( Dim_DisplayDepartment[DepartmentKey] ),
Fact_SalesDetail[DepartmentKey]
)
VAR Results =
SWITCH (
TRUE (),
CurrentDepartmentKey = 15,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] IN { 13, 14 } )
),
CurrentDepartmentKey = 16,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] IN { 1, 7, 9 } )
),
CurrentDepartmentKey = 17,
CALCULATE (
[SalesAmount],
FILTER (
'Fact_SalesDetail',
'Fact_SalesDetail'[DepartmentKey] IN { 1, 2, 3, 11 }
)
),
CurrentDepartmentKey = 18,
CALCULATE (
[SalesAmount],
FILTER (
'Fact_SalesDetail',
'Fact_SalesDetail'[DepartmentKey] IN { 9, 11, 12 }
)
)
- CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 2 )
),
CurrentDepartmentKey = 19,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] IN { 4, 7 } )
),
CurrentDepartmentKey = 20, CALCULATE ( [SalesAmount], ALLSELECTED ( Fact_SalesDetail ) ),
CALCULATE ( [SalesAmount], CurrentDepartment )
)
RETURN
Results
相較于之前的寫法,這個寫法通過VAR定義變量,和使用TREATAS來減少代碼計算邏輯的書寫。
通過提供List來減少迭代遍歷的次數(shù)。
那么有沒有繼續(xù)可以優(yōu)化的空間?有的。
優(yōu)化寫法:
SalesAmountByDisplay3 =
VAR CurrentDetail =
ADDCOLUMNS (
DISTINCT ( 'Dim_DisplayDepartment' ),
"@CurrentValue",
VAR CurrentDepartmentKey = 'Dim_DisplayDepartment'[DepartmentKey]
RETURN
SWITCH (
TRUE (),
CurrentDepartmentKey = 15,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] IN { 13, 14 } )
),
CurrentDepartmentKey = 16,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] IN { 1, 7, 9 } )
),
CurrentDepartmentKey = 17,
CALCULATE (
[SalesAmount],
FILTER (
'Fact_SalesDetail',
'Fact_SalesDetail'[DepartmentKey] IN { 1, 2, 3, 11 }
)
),
CurrentDepartmentKey = 18,
CALCULATE (
[SalesAmount],
FILTER (
'Fact_SalesDetail',
'Fact_SalesDetail'[DepartmentKey] IN { 9, 11, 12 }
)
)
- CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] = 2 )
),
CurrentDepartmentKey = 19,
CALCULATE (
[SalesAmount],
FILTER ( 'Fact_SalesDetail', 'Fact_SalesDetail'[DepartmentKey] IN { 4, 7 } )
),
CurrentDepartmentKey = 20, CALCULATE ( [SalesAmount], ALLSELECTED ( Fact_SalesDetail ) ),
CALCULATE (
[SalesAmount],
'Fact_SalesDetail'[DepartmentKey] = CurrentDepartmentKey
)
)
)
RETURN
SUMX ( CurrentDetail, [@CurrentValue] )
這種寫法,相較于上一種書寫量多一些,通過定義虛擬表來減少迭代遍歷的次數(shù)。從理論上來說,因為定義了虛擬表,無論源代碼需要對事實表迭代多少次,這個思路迭代的永遠(yuǎn)都是虛擬表,優(yōu)化度很高。
我們來對比一下:
DAX Studio測試:
三種寫法的差距很明顯。其實這里有一些爭議的,簡化寫法是通過減少資源占用來實現(xiàn)優(yōu)化,優(yōu)化寫法是通過減少迭代遍歷實現(xiàn)優(yōu)化。
從測試結(jié)果上來看,是簡化寫法優(yōu)化度較高,但是在實際應(yīng)用中,測試發(fā)現(xiàn)優(yōu)化寫法的方式響應(yīng)更迅速。
你以為到這里就結(jié)束了么?
其實還有第四種優(yōu)化的思路,只不過這個思路比較難。
SalesAmountByDisplay4 =
SUMX (
VALUES ( 'Dim_DisplayDepartment'[DepartmentKey] ),
VAR CurDpmKey = 'Dim_DisplayDepartment'[DepartmentKey]
VAR TempTable =
FILTER (
ALL ( 'Dim_DisplayDepartment'[DepartmentKey] ),
'Dim_DisplayDepartment'[DepartmentKey] <= 14
)
VAR AllDetail =
ADDCOLUMNS (
TempTable,
"SalesAmount",
VAR CurrentDepartment = 'Dim_DisplayDepartment'[DepartmentKey]
RETURN
CALCULATE (
[SalesAmount],
ALL ( 'Dim_DisplayDepartment' ),
'Fact_SalesDetail'[DepartmentKey] = CurrentDepartment
)
)
VAR FilterContent =
CALCULATE (
MAX ( 'Dim_ComputationalLogic'[FilterContent] ),
ALL ( Dim_DisplayDepartment ),
'Dim_ComputationalLogic'[DepartmentKey] = CurDpmKey
)
VAR Length =
LEN ( FilterContent )
VAR FilterTable =
SELECTCOLUMNS (
GENERATESERIES ( 1, Length, 4 ),
"DepmKey", MID ( FilterContent, [Value], 3 ) * 1
)
VAR Result =
SUMX (
FilterTable,
VAR DpmKey = [DepmKey]
VAR SalesValue =
SUMMARIZE (
FILTER ( AllDetail, 'Dim_DisplayDepartment'[DepartmentKey] = ABS ( DpmKey ) ),
[SalesAmount]
)
RETURN
IF ( DpmKey >= 0, SalesValue, - SalesValue )
)
RETURN
Result
)
--作者:夕楓
這個優(yōu)化的思路,是@夕楓大佬提出來的。通過定義計算表,減少代碼書寫量,使用ALL減少上下文轉(zhuǎn)換的消耗,減少查詢次數(shù),命中緩存。
DAX Studio測試:
總結(jié)一下:
1.可以通過變量和定義表來減少代碼書寫量
2.可以通過減少資源調(diào)用優(yōu)化
3.可以通過虛擬表減少迭代遍歷
4.可以通過命中緩存進(jìn)行優(yōu)化
比較常用的是前三種,第四種難度系數(shù)較高。
往期推薦:
小伙伴們?GET了么?
(白茶:別問我第四種,我不會TAT)
這里是白茶,一個PowerBI的初學(xué)者。
