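A Bug on Huawei Phones
Today I tracked down a bloom flickering issue: one of our PBR motorcycles flickered badly in its specular highlights once bloom was enabled, and the flicker only showed up on Huawei phones (Mali GPU).
I analyzed a frame with RenderDoc and found that the highlight values at the flickering pixels were absurdly large:
[Figure: RenderDoc capture, with the offending color value marked by a red box]
As the capture shows, the color value marked by the red box reaches 65504. Since we have FP16 HDR enabled, 65504 is exactly the largest finite value FP16 can represent.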
0 11110 1111111111=(-1)^0 * 2^15 * (1+1-2^-10)=65504
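The bit pattern above can be checked with a few lines of Python. This is only a sanity-check sketch: `decode_half` is a hypothetical helper written for this post (it handles normal numbers only), and the second value comes from the standard library's `struct` module, whose `e` format is IEEE 754 half precision.

```python
import struct

def decode_half(bits):
    """Decode a 16-bit FP16 pattern by hand (normal numbers only)."""
    sign = (bits >> 15) & 0x1
    exponent = (bits >> 10) & 0x1F   # 5 exponent bits, bias 15
    mantissa = bits & 0x3FF          # 10 mantissa bits
    return (-1) ** sign * 2.0 ** (exponent - 15) * (1 + mantissa / 1024)

bits = 0b0_11110_1111111111          # sign | exponent | mantissa
by_hand = decode_half(bits)
by_struct = struct.unpack("<e", bits.to_bytes(2, "little"))[0]
print(by_hand, by_struct)            # 65504.0 65504.0
```

Both routes give 65504, matching the value RenderDoc reported.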
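My gut feeling was that this is a floating-point precision problem, because Mali GPUs have bitten us this way more than once :)
The Fix
Plugging the hole is easy: just clamp the final specular value.
Being a bit obsessive, though, I still wanted to know exactly where things went wrong. After some debugging, the problematic code turned out to be this: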
half perceptualRoughness = SmoothnessToPerceptualRoughness(smoothness);
half roughness = PerceptualRoughnessToRoughness(perceptualRoughness);
half V = SmithJointGGXVisibilityTerm(NoL, NoV, roughness);
half D = GGXTerm(NoH, roughness);
half specularTerm = V * D * UNITY_PI;
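The PBR specular term here is lifted straight from Unity's BRDF1, with the Fresnel term removed. In the code above, the precision of roughness affects the final specular result.
Let's look at the code of the normal distribution function, GGXTerm: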
inline float GGXTerm (float NdotH, float roughness)
{
    float a2 = roughness * roughness;
    float d = (NdotH * a2 - NdotH) * NdotH + 1.0f; // 2 mad
    return UNITY_INV_PI * a2 / (d * d + 1e-7f);
    // This function is not intended to be running on Mobile,
    // therefore epsilon is smaller than what can be represented by half
}
All parameters are float, and the function ends with a clear comment saying it was never meant to run on mobile, because the 1e-7f epsilon does not take half precision into account:
This function is not intended to be running on Mobile, therefore epsilon is smaller than what can be represented by half
The smallest positive normalized value a half-precision float can represent is about 6.10×10^(-5):
0 00001 0000000000=2^-14 = 6.10*10^-5
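To see how the 1e-7 epsilon relates to that limit, here is a small Python sketch using the standard `struct` module. `round_to_half` is a hypothetical helper, and the flush-to-zero step is an assumption about how a GPU might treat subnormal half values; I have not verified how a given Mali driver behaves.

```python
import struct

def round_to_half(x):
    """Round a Python float to the nearest IEEE 754 half value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# 0 00001 0000000000: the smallest positive normal half value
min_normal = struct.unpack("<e", (0b0_00001_0000000000).to_bytes(2, "little"))[0]

# 1e-7 is below that limit, so at best it lands on a subnormal value
eps_as_half = round_to_half(1e-7)

# Assumption: hardware that flushes subnormals turns the epsilon into 0
flushed = 0.0 if eps_as_half < min_normal else eps_as_half

print(min_normal, eps_as_half, flushed)
```

Under that assumption the epsilon disappears entirely in half precision, so it can no longer protect the division in GGXTerm.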
So changing the precision of roughness from half to float fixes the problem.
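The effect can be reproduced roughly on the CPU. The sketch below is an emulation under loud assumptions, not a description of what the Mali compiler actually does: it rounds every intermediate result to half, flushes subnormals to zero, and saturates overflow (including division by zero) at 65504. `to_half`, `ggx_term_float` and `ggx_term_half` are hypothetical names, and Python's double stands in for float.

```python
import math
import struct

HALF_MAX = 65504.0             # largest finite FP16 value
HALF_MIN_NORMAL = 2.0 ** -14   # smallest positive normal FP16 value

def to_half(x):
    """Round to FP16. Assumed: saturate on overflow, flush subnormals to 0."""
    x = max(-HALF_MAX, min(HALF_MAX, x))
    h = struct.unpack("<e", struct.pack("<e", x))[0]
    return 0.0 if abs(h) < HALF_MIN_NORMAL else h

def ggx_term_float(n_dot_h, roughness):
    """GGXTerm evaluated at full precision."""
    a2 = roughness * roughness
    d = (n_dot_h * a2 - n_dot_h) * n_dot_h + 1.0
    return (1.0 / math.pi) * a2 / (d * d + 1e-7)

def ggx_term_half(n_dot_h, roughness):
    """Same formula with every intermediate rounded to FP16."""
    a2 = to_half(roughness * roughness)
    d = to_half(to_half(to_half(n_dot_h * a2) - n_dot_h) * n_dot_h)
    d = to_half(d + 1.0)
    denom = to_half(to_half(d * d) + to_half(1e-7))
    if denom == 0.0:
        return HALF_MAX  # a2 / 0 overflows; modeled here as saturation
    return to_half(to_half(to_half(1.0 / math.pi) * a2) / denom)

roughness = to_half(0.01)      # a fairly glossy surface
full = ggx_term_float(1.0, roughness)
half = ggx_term_half(1.0, roughness)
print(full, half)
```

With NoH = 1 and roughness around 0.01, the full-precision path returns a value near 289, while the emulated half path loses both `d` and the epsilon and ends up at 65504, the same value seen in the capture.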
How URP Simplifies the BRDF
Using the Standard pipeline's BRDF1 directly on mobile devices is somewhat expensive.
We can instead follow the way BRDF2 is written, or the way URP simplifies things in DirectBDRF. The code is as follows:
// Based on Minimalist CookTorrance BRDF
// Implementation is slightly different from original derivation: http://www.thetenthplanet.de/archives/255
//
// * NDF [Modified] GGX
// * Modified Kelemen and Szirmay-Kalos for Visibility term
// * Fresnel approximated with 1/LdotH
half3 DirectBDRF(BRDFData brdfData, half3 normalWS, half3 lightDirectionWS, half3 viewDirectionWS)
{
#ifndef _SPECULARHIGHLIGHTS_OFF
    float3 halfDir = SafeNormalize(float3(lightDirectionWS) + float3(viewDirectionWS));
    float NoH = saturate(dot(normalWS, halfDir));
    half LoH = saturate(dot(lightDirectionWS, halfDir));
    // GGX Distribution multiplied by combined approximation of Visibility and Fresnel
    // BRDFspec = (D * V * F) / 4.0
    // D = roughness^2 / ( NoH^2 * (roughness^2 - 1) + 1 )^2
    // V * F = 1.0 / ( LoH^2 * (roughness + 0.5) )
    // See "Optimizing PBR for Mobile" from Siggraph 2015 moving mobile graphics course
    // https://community.arm.com/events/1155
    // Final BRDFspec = roughness^2 / ( NoH^2 * (roughness^2 - 1) + 1 )^2 * (LoH^2 * (roughness + 0.5) * 4.0)
    // We further optimize a few light invariant terms
    // brdfData.normalizationTerm = (roughness + 0.5) * 4.0 rewritten as roughness * 4.0 + 2.0 to a fit a MAD.
    float d = NoH * NoH * brdfData.roughness2MinusOne + 1.00001f;
    half LoH2 = LoH * LoH;
    half specularTerm = brdfData.roughness2 / ((d * d) * max(0.1h, LoH2) * brdfData.normalizationTerm);
    // On platforms where half actually means something, the denominator has a risk of overflow
    // clamp below was added specifically to "fix" that, but dx compiler (we convert bytecode to metal/gles)
    // sees that specularTerm have only non-negative terms, so it skips max(0,..) in clamp (leaving only min(100,...))
#if defined (SHADER_API_MOBILE) || defined (SHADER_API_SWITCH)
    specularTerm = specularTerm - HALF_MIN;
    specularTerm = clamp(specularTerm, 0.0, 100.0); // Prevent FP16 overflow on mobiles
#endif
    half3 color = specularTerm * brdfData.specular + brdfData.diffuse;
    return color;
#else
    return brdfData.diffuse;
#endif
}
The code comments spell it out: the simplification follows "Optimizing PBR for Mobile" from SIGGRAPH 2015.
The classic microfacet specular BRDF is:
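For reference, this is the textbook Cook-Torrance form of that formula, written so that it lines up with the notation of the URP comment. It is the standard expression rather than a copy of the original figure; the symbols n, l, v (normal, light and view directions) and G (geometry term) are introduced here and do not appear elsewhere in the post.

```latex
f_{spec} = \frac{D \cdot F \cdot G}{4\,(n \cdot l)\,(n \cdot v)}
         = \frac{D \cdot V \cdot F}{4},
\qquad
V = \frac{G}{(n \cdot l)\,(n \cdot v)}
```

Here D is the normal distribution function, F the Fresnel term, and V the visibility term that absorbs the geometry term together with the two cosine factors.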
Following "Optimizing PBR for Mobile", V * F can be merged and approximated:
BRDFspec = (D * V * F) / 4.0
D = roughness^2 / ( NoH^2 * (roughness^2 - 1) + 1 )^2
V * F = 1.0 / ( LoH^2 * (roughness + 0.5) )
The final result is:
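Substituting the D and V * F expressions above into BRDFspec = (D * V * F) / 4.0 gives the following, written out as a sketch of the algebra in the same symbols the code comment uses:

```latex
\mathrm{BRDF}_{spec}
  = \frac{roughness^2}
         {\left( NoH^2 \,(roughness^2 - 1) + 1 \right)^2
          \cdot LoH^2 \cdot (roughness + 0.5) \cdot 4}
```

The factor (roughness + 0.5) * 4 depends only on the material, which is why the shader precomputes it as brdfData.normalizationTerm.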
Finally, the code above also takes half precision into account:
#define HALF_MIN 6.103515625e-5 // 2^-14, the same value for 10, 11 and 16-bit: https://www.khronos.org/opengl/wiki/Small_Float_Formats
// On platforms where half actually means something, the denominator has a risk of overflow
// clamp below was added specifically to "fix" that, but dx compiler (we convert bytecode to metal/gles)
// sees that specularTerm have only non-negative terms, so it skips max(0,..) in clamp (leaving only min(100,...))
#if defined (SHADER_API_MOBILE) || defined (SHADER_API_SWITCH)
specularTerm = specularTerm - HALF_MIN;
specularTerm = clamp(specularTerm, 0.0, 100.0); // Prevent FP16 overflow on mobiles
#endif
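To get a feel for what the clamp does, the specular term can be mirrored in plain Python. This is an illustrative sketch only: `urp_specular_term` is a hypothetical function, it assumes roughness2MinusOne and normalizationTerm are derived from roughness exactly as the comments describe, and it runs in double precision, so it shows the magnitude of the unclamped value rather than any actual FP16 overflow.

```python
HALF_MIN = 6.103515625e-5  # 2^-14, same constant as in the shader

def urp_specular_term(roughness, n_dot_h, l_dot_h):
    """Python mirror of the specularTerm computation in DirectBDRF."""
    roughness2 = roughness * roughness
    roughness2_minus_one = roughness2 - 1.0
    normalization_term = roughness * 4.0 + 2.0   # (roughness + 0.5) * 4.0
    d = n_dot_h * n_dot_h * roughness2_minus_one + 1.00001
    l_dot_h2 = l_dot_h * l_dot_h
    specular = roughness2 / ((d * d) * max(0.1, l_dot_h2) * normalization_term)
    # Mobile path: subtract HALF_MIN, then clamp to [0, 100]
    specular -= HALF_MIN
    return min(max(specular, 0.0), 100.0)

sharp = urp_specular_term(0.0078125, 1.0, 1.0)  # very glossy, light on the highlight peak
rough = urp_specular_term(1.0, 1.0, 1.0)        # fully rough surface
print(sharp, rough)
```

For the glossy case the unclamped value would be in the thousands, so the clamp pins it to 100, while the fully rough case stays near 1/6 and is not affected.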
Personal Site
This post on my personal site: https://baddogzz.github.io/2020/04/27/Mali-Float-Presion/
That's all, bye!