Android 9.0 Hardware Acceleration (5): The RenderThread Rendering Process

Original article. Please credit the source when reposting. Thank you.

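The previous article covered the drawing phase, after which the application layer has its DisplayList ready. Next comes the rendering phase. Unlike software drawing, where both drawing and rendering happen on the UI thread, Android hardware acceleration hands rendering to a dedicated native thread, RenderThread.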

Let's continue the analysis by going back to ThreadedRenderer's draw:

frameworks/base/core/java/android/view/ThreadedRenderer.java

void draw(View view, AttachInfo attachInfo, DrawCallbacks callbacks,
FrameDrawingCallback frameDrawingCallback) {
...
    updateRootDisplayList(view, callbacks); // build the DisplayList
...
    int syncResult = nSyncAndDrawFrame(mNativeProxy, frameInfo, frameInfo.length); // render the view
...
}

The previous article finished with updateRootDisplayList, so next we look at nSyncAndDrawFrame. It is a native method, which maps to the following in the native layer:

frameworks/base/core/jni/android_view_ThreadedRenderer.cpp

static int android_view_ThreadedRenderer_syncAndDrawFrame(JNIEnv* env, jobject clazz,
       jlong proxyPtr, jlongArray frameInfo, jint frameInfoSize) {
   LOG_ALWAYS_FATAL_IF(frameInfoSize != UI_THREAD_FRAME_INFO_SIZE,
           "Mismatched size expectations, given %d expected %d",
           frameInfoSize, UI_THREAD_FRAME_INFO_SIZE);
   RenderProxy* proxy = reinterpret_cast<RenderProxy*>(proxyPtr);
   env->GetLongArrayRegion(frameInfo, 0, frameInfoSize, proxy->frameInfo());
   return proxy->syncAndDrawFrame();
}

Here RenderProxy calls its own syncAndDrawFrame method:

frameworks/base/libs/hwui/renderthread/RenderProxy.cpp

int RenderProxy::syncAndDrawFrame() {
   return mDrawFrameTask.drawFrame();
}

This hands the drawFrame operation to a task, DrawFrameTask:

frameworks/base/libs/hwui/renderthread/DrawFrameTask.cpp

int DrawFrameTask::drawFrame() {
   LOG_ALWAYS_FATAL_IF(!mContext, "Cannot drawFrame with no CanvasContext!");
   mSyncResult = SyncResult::OK;
   mSyncQueued = systemTime(CLOCK_MONOTONIC);
   postAndWait();
   return mSyncResult;
}
void DrawFrameTask::postAndWait() {
   AutoMutex _lock(mLock);
   mRenderThread->queue().post([this]() { run(); });
   mSignal.wait(mLock);
}

RenderThread is one big loop, and drawing operations are posted to it as RenderTasks to be executed there. The corresponding run method therefore holds the core rendering logic:

void DrawFrameTask::run() {
   ATRACE_NAME("DrawFrame"); // the "DrawFrame" label in systrace
   bool canUnblockUiThread;
   bool canDrawThisFrame;
   {
       TreeInfo info(TreeInfo::MODE_FULL, *mContext);
       canUnblockUiThread = syncFrameState(info); // sync the view data
       canDrawThisFrame = info.out.canDrawThisFrame;
   ...
   // draw: issue the OpenGL commands to the GPU
   if (CC_LIKELY(canDrawThisFrame)) {
       context->draw(); // CanvasContext draws
   } else {
       // wait on fences so tasks don't overlap next frame
       context->waitOnFences();
   }
   ...
}

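Putting it together: DrawFrameTask is posted to RenderThread, and the UI thread blocks while RenderThread syncs the data that the application packaged during the drawing phase. If the sync succeeds, the UI thread is woken up right away; otherwise it stays blocked until the frame has been processed. Only after the sync finishes does RenderThread start the GPU rendering work.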

So a DrawFrameTask run consists of two main parts:
1) syncFrameState: syncs the main thread's UI data to the render thread
2) CanvasContext.draw: draws
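
The post-and-wait handoff described above can be sketched in portable C++. This is only a minimal illustration, not the hwui implementation: WorkerLoop, FrameTask, and all their members are hypothetical names, and std::thread, std::mutex, and std::condition_variable stand in for RenderThread, AutoMutex, and mSignal. The caller blocks until the worker has copied its state (the "sync" step) and is released before the worker's "draw" step runs.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical single-threaded task loop standing in for RenderThread.
class WorkerLoop {
public:
    WorkerLoop() : mThread([this] { loop(); }) {}
    ~WorkerLoop() {
        post(nullptr);  // an empty task tells the loop to quit
        mThread.join();
    }
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mLock);
            mTasks.push(std::move(task));
        }
        mCond.notify_one();
    }

private:
    void loop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mLock);
                mCond.wait(lock, [this] { return !mTasks.empty(); });
                task = std::move(mTasks.front());
                mTasks.pop();
            }
            if (!task) return;
            task();
        }
    }
    std::mutex mLock;
    std::condition_variable mCond;
    std::queue<std::function<void()>> mTasks;
    std::thread mThread;  // declared last so the queue exists before loop() starts
};

// Hypothetical stand-in for DrawFrameTask.
class FrameTask {
public:
    // Called on the "UI thread": blocks until the worker has copied uiState.
    int drawFrame(int uiState) {
        std::unique_lock<std::mutex> lock(mLock);
        mStaging = uiState;
        mSynced = false;
        mWorker.post([this] { run(); });
        mSignal.wait(lock, [this] { return mSynced; });
        return mRenderCopy;
    }
    int framesDrawn() const { return mFramesDrawn.load(); }

private:
    void run() {
        {
            std::lock_guard<std::mutex> lock(mLock);
            mRenderCopy = mStaging;  // "sync" step
            mSynced = true;
        }
        mSignal.notify_one();        // unblock the caller
        mFramesDrawn.fetch_add(1);   // "draw" step, runs after the caller is released
    }
    std::mutex mLock;
    std::condition_variable mSignal;
    int mStaging = 0;
    int mRenderCopy = 0;
    bool mSynced = false;
    std::atomic<int> mFramesDrawn{0};
    WorkerLoop mWorker;  // declared last so it is joined before the other members die
};
```

A caller would create one FrameTask and call drawFrame(state) once per frame; each call returns as soon as the worker has taken its copy of the state.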

First, syncFrameState:

bool DrawFrameTask::syncFrameState(TreeInfo& info) {
    ATRACE_CALL(); // the syncFrameState label in systrace
  ...
    // sync the DisplayListOp tree
    mContext->prepareTree(info, mFrameInfo, mSyncQueued, mTargetNode);
  ...
    return info.prepareTextures;
}

The bulk of the sync happens in CanvasContext's prepareTree:

frameworks/base/libs/hwui/renderthread/CanvasContext.cpp 

void CanvasContext::prepareTree(TreeInfo& info, int64_t* uiFrameInfo,
        int64_t syncQueued, RenderNode* target) {
  ...
  mCurrentFrameInfo->importUiThreadInfo(uiFrameInfo); // copies the data with memcpy
  mCurrentFrameInfo->set(FrameInfoIndex::SyncQueued) = syncQueued;
  mCurrentFrameInfo->markSyncStart();
  ...
  mRenderPipeline->onPrepareTree();
  for (const sp<RenderNode>& node : mRenderNodes) {
     info.mode = (node.get() == target ? TreeInfo::MODE_FULL : TreeInfo::MODE_RT_ONLY);
     // mRootRenderNode recursively walks every node and runs prepareTree
     node->prepareTree(info);
     GL_CHECKPOINT(MODERATE);
  }
  ...
}

In short, this step syncs the complete set of DisplayListOps that the application layer prepared earlier over to the native RenderThread.

frameworks/base/libs/hwui/RenderNode.cpp

void RenderNode::prepareTree(TreeInfo& info) {
   ATRACE_CALL();
   LOG_ALWAYS_FATAL_IF(!info.damageAccumulator, "DamageAccumulator missing");
   MarkAndSweepRemoved observer(&info);
   // The OpenGL renderer reserves the stencil buffer for overdraw debugging. Functors
   // will need to be drawn in a layer.
   bool functorsNeedLayer = Properties::debugOverdraw && !Properties::isSkiaEnabled();
   prepareTreeImpl(observer, info, functorsNeedLayer);
}
void RenderNode::prepareTreeImpl(TreeObserver& observer, TreeInfo& info, bool functorsNeedLayer) {
   info.damageAccumulator->pushTransform(this);
   if (info.mode == TreeInfo::MODE_FULL) {
       pushStagingPropertiesChanges(info); // sync properties
   }
   ...
   prepareLayer(info, animatorDirtyMask);
   if (info.mode == TreeInfo::MODE_FULL) {
       pushStagingDisplayListChanges(observer, info); // sync the DisplayListOps
   }
   if (mDisplayList) {
       info.out.hasFunctors |= mDisplayList->hasFunctor();
       bool isDirty = mDisplayList->prepareListAndChildren(
               observer, info, childFunctorsNeedLayer,
               [](RenderNode* child, TreeObserver& observer, TreeInfo& info,
                  bool functorsNeedLayer) {
                   child->prepareTreeImpl(observer, info, functorsNeedLayer); // recurse
               });
       if (isDirty) {
           damageSelf(info);
       }
   }
   pushLayerUpdate(info);
   info.damageAccumulator->popTransform();
}

Here pushStagingDisplayListChanges takes the DisplayList staged by setStagingDisplayList and assigns it to the RenderNode's mDisplayList.

RenderNode::prepareTree() walks the DisplayList tree, calling prepareTreeImpl() recursively on child nodes. If a node is a render layer, RenderNode::pushLayerUpdate() records that layer's update in the LayerUpdateQueue.
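
The staging-then-push idea can be illustrated with a toy tree. This is a simplified sketch under assumptions: ToyNode, SyncMode, and the string used as a "display list" are hypothetical stand-ins, not hwui types. The UI side only writes the staging copy, and the render side picks it up during a full-mode prepareTree pass, recursing into the children.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical counterpart of TreeInfo::MODE_FULL / MODE_RT_ONLY.
enum class SyncMode { Full, RtOnly };

struct ToyNode {
    std::string stagingList;  // written by the "UI thread"
    std::string currentList;  // read by the "render thread"
    bool stagingDirty = false;
    std::vector<std::shared_ptr<ToyNode>> children;

    // UI side: record a new list without touching what the renderer sees.
    void setStagingList(std::string list) {
        stagingList = std::move(list);
        stagingDirty = true;
    }

    // Render side: push staged changes (full mode only), then recurse.
    // Returns how many nodes had their list replaced.
    int prepareTree(SyncMode mode) {
        int pushed = 0;
        if (mode == SyncMode::Full && stagingDirty) {
            currentList = stagingList;
            stagingDirty = false;
            ++pushed;
        }
        for (const auto& child : children) {
            pushed += child->prepareTree(mode);
        }
        return pushed;
    }
};
```

Keeping two copies per node means the UI side can keep recording while the render side works from a stable snapshot; the only point where both touch the same data is the sync pass.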

Back in the task's run(), the next step is CanvasContext.draw:

void CanvasContext::draw() {
...
   Frame frame = mRenderPipeline->getFrame();
   SkRect windowDirty = computeDirtyRect(frame, &dirty);
   bool drew = mRenderPipeline->draw(frame, windowDirty, dirty, mLightGeometry, &mLayerUpdateQueue,
        mContentDrawBounds, mOpaque, mWideColorGamut, mLightInfo, mRenderNodes, &(profiler()));
   waitOnFences();
   bool requireSwap = false;
   bool didSwap = mRenderPipeline->swapBuffers(frame, drew, windowDirty, mCurrentFrameInfo,
        &requireSwap);
...
}

Here mRenderPipeline carries out three core steps: getFrame, draw, and swapBuffers. Let's take them one at a time.
(mRenderPipeline here corresponds to frameworks/base/libs/hwui/renderthread/OpenGLPipeline.cpp)

The getFrame step: mainly dequeueBuffer

Frame OpenGLPipeline::getFrame() {
   LOG_ALWAYS_FATAL_IF(mEglSurface == EGL_NO_SURFACE,
                       "drawRenderNode called on a context with no surface!");
   return mEglManager.beginFrame(mEglSurface);
}
Frame EglManager::beginFrame(EGLSurface surface) {
   LOG_ALWAYS_FATAL_IF(surface == EGL_NO_SURFACE, "Tried to beginFrame on EGL_NO_SURFACE!");
   makeCurrent(surface);
   Frame frame;
   frame.mSurface = surface;
   eglQuerySurface(mEglDisplay, surface, EGL_WIDTH, &frame.mWidth);
   eglQuerySurface(mEglDisplay, surface, EGL_HEIGHT, &frame.mHeight);
   frame.mBufferAge = queryBufferAge(surface);
   eglBeginFrame(mEglDisplay, surface);
   return frame;
}

Now the makeCurrent method it calls:

bool EglManager::makeCurrent(EGLSurface surface, EGLint* errOut) {
  ...
   if (!eglMakeCurrent(mEglDisplay, surface, surface, mEglContext)) {
       if (errOut) {
           *errOut = eglGetError();
           ALOGW("Failed to make current on surface %p, error=%s", (void*)surface,
                 egl_error_str(*errOut));
       } else {
           LOG_ALWAYS_FATAL("Failed to make current on surface %p, error=%s", (void*)surface,
                            eglErrorString());
       }
   }
   mCurrentSurface = surface;
   if (Properties::disableVsync) {
       eglSwapInterval(mEglDisplay, 0);
   }
   return true;
}

Next, eglMakeCurrent. Note that the EGLSurface is passed in here:

frameworks/native/opengl/libagl/egl.cpp

EGLBoolean eglMakeCurrent(  EGLDisplay dpy, EGLSurface draw,
                           EGLSurface read, EGLContext ctx)
{
   ogles_context_t* gl = (ogles_context_t*)ctx;
   if (makeCurrent(gl) == 0) {
       if (ctx) {
           egl_context_t* c = egl_context_t::context(ctx);
           egl_surface_t* d = (egl_surface_t*)draw;
           egl_surface_t* r = (egl_surface_t*)read;
           ...
           if (d) {
               // dequeueBuffer-related logic
               if (d->connect() == EGL_FALSE) {
                   return EGL_FALSE;
               }
               d->ctx = ctx;
               d->bindDrawSurface(gl); // bind the Surface
           }
          ...
   return setError(EGL_BAD_ACCESS, EGL_FALSE);
}

Following d->connect() further: d is an egl_surface_t, and looking for the implementation of connect shows that it is overridden by egl_window_surface_v2_t (struct egl_window_surface_v2_t : public egl_surface_t). So let's look at egl_window_surface_v2_t's connect:

EGLBoolean egl_window_surface_v2_t::connect()
{
    // dequeue a buffer
   int fenceFd = -1;
   if (nativeWindow->dequeueBuffer(nativeWindow, &buffer,
           &fenceFd) != NO_ERROR) {
       return setError(EGL_BAD_ALLOC, EGL_FALSE);
   }
   // wait for the buffer
   sp<Fence> fence(new Fence(fenceFd));
   ...
   return EGL_TRUE;
}

nativeWindow here is the Surface:

frameworks/native/libs/gui/Surface.cpp

Surface::Surface(
       const sp<IGraphicBufferProducer>& bufferProducer,
       bool controlledByApp)
       ... {
   ...
   ANativeWindow::dequeueBuffer    = hook_dequeueBuffer;
   ...
}
int Surface::hook_dequeueBuffer(ANativeWindow* window,
       ANativeWindowBuffer** buffer, int* fenceFd) {
   Surface* c = getSelf(window);
   return c->dequeueBuffer(buffer, fenceFd);
}
int Surface::dequeueBuffer(android_native_buffer_t** buffer, int* fenceFd) {
   ATRACE_CALL(); // this is the label that shows up in systrace
   ALOGV("Surface::dequeueBuffer");
   ...
   FrameEventHistoryDelta frameTimestamps;
   status_t result = mGraphicBufferProducer->dequeueBuffer(&buf, &fence, reqWidth, reqHeight,
                                                           reqFormat, reqUsage, &mBufferAge,
                                                           enableFrameTimestamps ? &frameTimestamps
                                                                                 : nullptr);
   ...
   // if reallocation is needed, call requestBuffer to request one
   if ((result & IGraphicBufferProducer::BUFFER_NEEDS_REALLOCATION) || gbuf == nullptr) {
       // request the buffer through the GraphicBufferProducer
       result = mGraphicBufferProducer->requestBuffer(buf, &gbuf);
   }
   ...
}
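
The hook pattern above, where a C-style window struct carries function pointers and the C++ class installs a static trampoline that casts back to itself, can be sketched on its own. ToyNativeWindow and ToySurface are hypothetical names for illustration, and the integer slot is an assumed stand-in for a real buffer handle; this is not the ANativeWindow or Surface API.

```cpp
#include <cassert>

// Hypothetical C-style interface: callers only see a struct of function pointers.
struct ToyNativeWindow {
    int (*dequeueBuffer)(ToyNativeWindow* window, int* outSlot) = nullptr;
};

// Hypothetical implementation class that plugs itself into the struct.
class ToySurface : public ToyNativeWindow {
public:
    ToySurface() {
        // Install the static hook, much like the constructor shown above.
        ToyNativeWindow::dequeueBuffer = hookDequeueBuffer;
    }

    // The real work lives in an ordinary member function.
    int dequeueBuffer(int* outSlot) {
        *outSlot = mNextSlot++;
        return 0;  // 0 plays the role of NO_ERROR here
    }

private:
    // Static trampoline: recover the object from the base pointer and forward.
    static int hookDequeueBuffer(ToyNativeWindow* window, int* outSlot) {
        return static_cast<ToySurface*>(window)->dequeueBuffer(outSlot);
    }

    int mNextSlot = 0;
};
```

Code that only knows the struct calls window->dequeueBuffer(window, &slot) and still ends up in the C++ member function, which is how a call made through nativeWindow can land in Surface::dequeueBuffer.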

The draw step: recursively issue OpenGL commands and submit them to the GPU for drawing

bool OpenGLPipeline::draw(const Frame& frame, const SkRect& screenDirty, const SkRect& dirty,
                         const FrameBuilder::LightGeometry& lightGeometry,
                         LayerUpdateQueue* layerUpdateQueue, const Rect& contentDrawBounds,
                         bool opaque, bool wideColorGamut,
                         const BakedOpRenderer::LightInfo& lightInfo,
                         const std::vector<sp<RenderNode>>& renderNodes,
                         FrameInfoVisualizer* profiler) {
   mEglManager.damageFrame(frame, dirty);
   bool drew = false;
   auto& caches = Caches::getInstance();
   FrameBuilder frameBuilder(dirty, frame.width(), frame.height(), lightGeometry, caches);
   frameBuilder.deferLayers(*layerUpdateQueue);
   layerUpdateQueue->clear();
   frameBuilder.deferRenderNodeScene(renderNodes, contentDrawBounds);
   BakedOpRenderer renderer(caches, mRenderThread.renderState(), opaque, wideColorGamut,
                            lightInfo);
   frameBuilder.replayBakedOps<BakedOpDispatcher>(renderer);
   ProfileRenderer profileRenderer(renderer);
   profiler->draw(profileRenderer);
   drew = renderer.didDraw();
   // post frame cleanup
   caches.clearGarbage();
   caches.pathCache.trim();
   caches.tessellationCache.trim();
#if DEBUG_MEMORY_USAGE
   caches.dumpMemoryUsage();
#else
   if (CC_UNLIKELY(Properties::debugLevel & kDebugMemory)) {
       caches.dumpMemoryUsage();
   }
#endif
   return drew;
}

This logic is very involved, so I won't trace the code here. A brief outline instead:

Figure: the OpenGLPipeline::draw process (from jinzhuojun)
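  • defer: reorganizes the data structures. Each RenderNode is mapped to a LayerBuilder, and the RecordedOps that the DisplayList groups by chunk are rewrapped as BakedOpState and stored in batches. Batches are indexed through two tables: mMergingBatchLookup for ops that can be merged and mBatchLookup for ops that cannot.
  • render: converts each op into the corresponding OpenGL commands, which are buffered in the local GL command buffer.
  • GL call: submits the OpenGL commands to the GPU for execution.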
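
To make the effect of merging concrete, here is a deliberately simplified sketch. ToyOp, mergeKey, and countDrawCalls are hypothetical names, and the rule used (every mergeable op with the same key collapses into one draw call, every other op costs one call) is an assumption for illustration; the real FrameBuilder weighs more than this, and its exact rules are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical recorded op: a key describing its render state, plus whether
// ops with that key may be merged into a single draw call.
struct ToyOp {
    std::string mergeKey;
    bool mergeable;
};

// Counts draw calls under the simplified rule described above.
std::size_t countDrawCalls(const std::vector<ToyOp>& ops) {
    std::set<std::string> mergingBatches;  // one entry per merged batch
    std::size_t drawCalls = 0;
    for (const ToyOp& op : ops) {
        if (!op.mergeable) {
            ++drawCalls;  // unmergeable ops are always issued individually
            continue;
        }
        // The first op with a given key opens a batch; later ones join it.
        if (mergingBatches.insert(op.mergeKey).second) {
            ++drawCalls;
        }
    }
    return drawCalls;
}
```

With three mergeable "text" ops and two unmergeable "rect" ops, this counts three draw calls instead of five.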

The swapBuffers step: hands the drawn data to SurfaceFlinger for composition.

bool OpenGLPipeline::swapBuffers(const Frame& frame, bool drew, const SkRect& screenDirty,
                                FrameInfo* currentFrameInfo, bool* requireSwap) {
   GL_CHECKPOINT(LOW);
   // Even if we decided to cancel the frame, from the perspective of jank
   // metrics the frame was swapped at this point
   currentFrameInfo->markSwapBuffers();
   *requireSwap = drew || mEglManager.damageRequiresSwap();
   if (*requireSwap && (CC_UNLIKELY(!mEglManager.swapBuffers(frame, screenDirty)))) {
       return false;
   }
   return *requireSwap;
}

Next, swapBuffers:

frameworks/base/libs/hwui/renderthread/EglManager.cpp

bool EglManager::swapBuffers(const Frame& frame, const SkRect& screenDirty) {
   ...
   eglSwapBuffersWithDamageKHR(mEglDisplay, frame.mSurface, rects, screenDirty.isEmpty() ? 0 : 1);
   ...
   return false;
}

eglSwapBuffersWithDamageKHR corresponds to the eglSwapBuffersWithDamageKHR label in systrace. Its main job is queueBuffer. I won't go through it in detail, since the analysis is much like the one for dequeueBuffer; here is the final call site:

frameworks/native/opengl/libagl/egl.cpp

EGLBoolean egl_window_surface_v2_t::swapBuffers()
{
   ...
   nativeWindow->queueBuffer(nativeWindow, buffer, -1);
   buffer = 0;
   // dequeue a new buffer
   int fenceFd = -1;
   if (nativeWindow->dequeueBuffer(nativeWindow, &buffer, &fenceFd) == NO_ERROR) {
       sp<Fence> fence(new Fence(fenceFd));
       // fence->wait
       if (fence->wait(Fence::TIMEOUT_NEVER)) {
           nativeWindow->cancelBuffer(nativeWindow, buffer, fenceFd);
           return setError(EGL_BAD_ALLOC, EGL_FALSE);
       }
       ...
}

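Here the buffer holding the finished drawing data is first put into the BufferQueue with queueBuffer, which also notifies SurfaceFlinger to acquire the buffer from the BufferQueue for composition. Then dequeueBuffer requests another buffer for the next request. This is the classic producer-consumer pattern, which the earlier graphics system articles covered several times; if you are interested, see the earlier article "Android Graphics System: Summary".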
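
The buffer lifecycle in this producer-consumer exchange can be sketched as a tiny single-threaded state machine. ToyBufferQueue is a hypothetical class, and integer slot numbers are an assumed stand-in for graphic buffers; the real BufferQueue adds fences, cross-process calls, and listener callbacks that are left out here.

```cpp
#include <cassert>
#include <deque>

// Hypothetical queue with a fixed number of buffer slots.
// Producer side: dequeueBuffer / queueBuffer. Consumer side: acquireBuffer / releaseBuffer.
class ToyBufferQueue {
public:
    explicit ToyBufferQueue(int slotCount) {
        for (int slot = 0; slot < slotCount; ++slot) mFree.push_back(slot);
    }

    // Producer takes a free slot to draw into; -1 means none is available.
    int dequeueBuffer() { return popFront(mFree); }

    // Producer hands a finished slot over to the consumer.
    void queueBuffer(int slot) { mQueued.push_back(slot); }

    // Consumer takes the oldest finished slot; -1 means nothing is queued.
    int acquireBuffer() { return popFront(mQueued); }

    // Consumer returns the slot so the producer can reuse it.
    void releaseBuffer(int slot) { mFree.push_back(slot); }

private:
    static int popFront(std::deque<int>& slots) {
        if (slots.empty()) return -1;
        int slot = slots.front();
        slots.pop_front();
        return slot;
    }

    std::deque<int> mFree;
    std::deque<int> mQueued;
};
```

In this picture RenderThread plays the producer and SurfaceFlinger the consumer: a slot travels free, dequeued, queued, acquired, and back to free.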

What follows is SurfaceFlinger's composition work; see the earlier graphics system article:
Android Graphics System (10): SurfaceFlinger Startup and the Layer Composition and Display Process

Note:
The earlier graphics system articles are not based on 9.0. They are early study notes whose focus is on the overall ideas and flow.

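Finally, a flow chart summarizing the whole hardware-accelerated drawing and rendering process, since a picture is worth a thousand words:

Figure: the hardware-accelerated drawing and rendering flow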

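Drawing and rendering are the core of hardware acceleration. The flow chart covers only the main call flow; really mastering this material takes a lot of time reading the source and digging into the details. This article is only a starting point. Questions are welcome, and if you found the article useful, please give it a like.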

To close, a few advantages of the hardware acceleration design:

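  • A DisplayList of "intermediate commands" sits between drawing commands and GL commands and acts as a buffer: commands are converted to GL commands only when drawing is actually needed. A View only has to adjust the RenderNode and RenderProperties for the dirty region and then update the DisplayList, which avoids redundant drawing and data reorganization.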

  • Batching and merging draw operations reduces the number of GL draw calls and therefore render state switches, which improves performance.

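  • Moving rendering work to RenderThread further frees up the UI thread. Graphics computation that the CPU handles poorly is converted into GPU-specific instructions and executed by the GPU, which further improves rendering efficiency.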

Two good articles I referred to:
http://www.itdecent.cn/p/dd800800145b
https://blog.csdn.net/jinzhuojun/article/details/54234354
