0. Keywords

synthetic dataset, 3D human pose and shape estimation, SMPL-X

1. Links

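This paper comes from the Max Planck Institute for Intelligent Systems (MPI-IS) in Tübingen, Germany, where the well-known CV professor Michael Black is a director. The institute is famously productive in computer vision, and many of the baselines used in the paper are themselves earlier work from the same institute.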

Paper: https://arxiv.org/pdf/2104.14643.pdf

Project page: https://agora.is.tue.mpg.de/index.html

Code: https://github.com/pixelite1201/agora_evaluation

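The paper presents a new synthetic dataset, AGORA. It is composed from realistic 3D scenes and human body models and ships with rich annotations, including 3D/2D human keypoints, 3D body models, and 2D human segmentation masks. Its main target is the 3D human pose and shape (3DHPS) estimation task. See the project page for more details.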

[Figure: sample images from the AGORA dataset]

2. Overview of the main content

※ Introduction

First, the Introduction restates the shortcomings of the publicly available 3DHPS benchmarks: they have limited clothing, focus on single subjects, have limited occlusion, are captured in laboratory environments, or have a limited range of ages and ethnicities. With such flawed datasets, evaluation can only be based on 3D joints rather than on more precise body shapes, so the paper also proposes a new evaluation protocol to go with the AGORA dataset.

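Next, the authors describe how AGORA was built. Two points matter most. 1) They purchase high-quality textured human scans (from four commercial vendors: 3DPeople, AXYZ, Human Alloy, and Renderpeople), then rely on synthetic data and a graphics rendering pipeline: using a rich set of backgrounds (HDRI panoramas and 3D environments), they render a large number of realistic images with the Unreal game engine. 2) For every 3D human scan (a set of 3D points), they fit the SMPL-X parametric body model (a mesh of vertices connected into triangles, a common 3D representation in CG) so that it accurately matches the scan's body shape, covering three main parts: body, hands, and face. [Worth noting: both the SMPL-X model and the associated fitting method SMPLify-X come from the authors' own lab, so the line of research is tightly connected.]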

※ Related Work

Since this is a dataset paper, the authors mainly stress the shortcomings of existing human-related datasets.

Datasets with real images. These include HumanEva, Human3.6M, and TotalCapture, built with multiple synchronized cameras plus optical markers. Their shortcomings are the lack of background variation in lab scenarios, only one subject in each image, no scene occlusions, and little clothing variety due to the attachment of markers. Alternatively, some datasets are built with marker-less motion capture, including MuPoTS-3D, PanopticStudio, MPI-INF-3DHP-Test, and HUMBI; these are less accurate than marker-based capture (due to yaw drift). The AGORA authors argue that labels obtained this way are only reference data and should not be treated as ground truth (GT); by comparison, their SMPL-X meshes have high fidelity and can serve as pseudo ground truth. Finally, of the datasets above, only PanopticStudio and HUMBI provide face and hand labels in addition to body labels.

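Synthetic datasets. The authors first note that a synthetic dataset has to be sufficiently realistic, and list the specific dimensions that matter: body shape, ethnicity, motion, cloth deformation, texture, and interaction with environments. MHOF, LTSH, 3DPeople, and SURREAL place 3D human models (such as SMPL, MakeHuman, or Mixamo) directly onto background images, while MPI-INF-3DHP-Train and MuCo-3DHP paste segmented images of real people as foreground onto 2D background images. The authors consider the images produced by all of these methods insufficiently realistic: "Such composition does not faithfully reflect the local statistics of pixel intensity in real images and does not support methods that learn how humans interact with scenes". The only work close to AGORA is SimPose, which is built along similar lines, but the authors note that its scenes are simple, its human subjects lack variety, and the dataset has not been released. There are also GAN-based approaches such as Human synthesis and scene compositing (AAAI 2020) and Generating 3D people in scenes without people (CVPR 2020), but the output of generative models always contains image artifacts, which makes the synthesized images unsuitable as GT.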

Other human-related datasets. Some datasets have no 3D human annotations but do contain 2D joint or segmentation-mask annotations (e.g. OCHuman, LSP-Extended, COCO, and MPII), or use real images plus fitted body models as GT (e.g. EFT (arXiv 2020), STRAPS (BMVC 2020), and 3DOH50K (CVPR 2020)). Going further, some methods use motion or multi-view matching to recover body models in complex video scenes. The authors consider these approaches more robust, but "with unknown accuracy in body shape and pose".

Finally, the authors conclude that AGORA brings all of this together and addresses the many shortcomings above: AGORA provides realistic textures, complex body shapes and clothing, complex varied scenes and lighting, high-resolution (4K) imagery, varied occlusion, all with high-quality 3D ground truth.

[Table: comparison of AGORA with existing datasets usable for the 3DHPS task]

※ Method: Obtaining reference data

The approach draws on three pieces of prior work. 1) From the authors' own lab: the SMPL-X body model and the single-view SMPLify-X fitting method, used to fit the main body shape (body shape, face shape, and hand shape); SMPL-X and SMPLify-X deserve a separate write-up. 2) From another line of work: fitting body shape under clothing (Detailed, accurate, human shape estimation from clothed 3D scan sequences, CVPR 2017), a fitting method for clothed scans, used to handle the skin (skin and hair) and clothing regions of the scan. 3) Graphonomy (CVPR 2019), used to label which vertices belong to skin and which to clothing. The final objective function combines multiple terms.

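The authors also found that child scans cannot be fitted with the same body model and procedure as adult scans, so they propose a modification: use SMIL (the mean infant body template) to obtain a child template T_child, and interpolate between it and the adult template T_adult. See the paper for details.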
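
The template interpolation can be illustrated with a minimal sketch. This is not the authors' code: the function name `interpolate_template`, the blending weight `alpha`, and the toy 4-vertex templates are all assumptions made for illustration, and the sketch assumes both templates already share the same mesh topology (same number and order of vertices).

```python
import numpy as np

def interpolate_template(t_adult, t_child, alpha):
    """Blend two vertex templates: alpha=0 gives the adult, alpha=1 the child."""
    t_adult = np.asarray(t_adult, dtype=float)
    t_child = np.asarray(t_child, dtype=float)
    if t_adult.shape != t_child.shape:
        raise ValueError("templates must share the same topology (same shape)")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Per-vertex linear interpolation between the two templates.
    return alpha * t_child + (1.0 - alpha) * t_adult

# Toy templates with 4 vertices each (real templates have thousands of vertices).
t_adult = np.array([[0.0, 0.0, 0.0], [0.0, 1.8, 0.0], [0.4, 1.0, 0.0], [-0.4, 1.0, 0.0]])
t_child = 0.5 * t_adult  # hypothetical child template: half the adult size
t_mid = interpolate_template(t_adult, t_child, alpha=0.5)
print(t_mid)
```

In a real fitting pipeline `alpha` would presumably be optimized together with the other body parameters rather than fixed by hand; the paper has the actual formulation.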

※ AGORA Dataset

AGORA contains 14,529 training images (from 2,930 scans) and 1,225 validation images (259 scans) with public labels, plus 3,387 test images (1,051 scans) whose labels are withheld. In total it uses 4,240 high-quality textured scans, of which 257 are child scans.

After a further round of manual curation, the 4,240 scans split into 3,161 with well aligned body, face and hands (BFH) and 1,079 with only well aligned bodies (B). This means the whole dataset can be used for body estimation, while only the BFH subset is usable for face or hand estimation.
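
As a quick sanity check on the figures quoted above, the per-split scan counts and the BFH/B counts should both add up to the same total. The snippet below only re-adds the numbers given in this post; the variable names are mine.

```python
# Scan counts per split, as quoted above.
scans_per_split = {"train": 2930, "validation": 259, "test": 1051}
# Image counts per split, as quoted above.
images_per_split = {"train": 14529, "validation": 1225, "test": 3387}

total_scans = sum(scans_per_split.values())
total_images = sum(images_per_split.values())

# Curated subsets: body+face+hands (BFH) vs. body only (B).
bfh_scans, body_only_scans = 3161, 1079

print("total scans:", total_scans)
print("total images:", total_images)
print("BFH + B:", bfh_scans + body_only_scans)
print("share of scans usable for face/hand tasks:", round(bfh_scans / total_scans, 3))
```

Roughly three quarters of the scans fall in the BFH subset, so face and hand estimation still has most of the data available.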

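Fitting Accuracy. To assess how accurately the 3D bodies are fitted, the authors use the high-quality 3D scans as the reference and propose two key measures, skin error and penetrating clothing error. They report concrete numbers to argue that the pseudo ground truth is trustworthy ("Thus, we believe that the SMPL-X fits provide valid pseudo ground truth.").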

Evaluation metrics and Evaluation protocol. The usual metrics for 3DHPS methods apply Procrustes alignment before computing the error, which eliminates discrepancies in scale, translation and rotation, measuring only the error in poses (PA-MPJPE) and shapes (PA-MVE/V2V). The authors attribute this to how current HPS datasets are constructed: they annotate only pose and shape, whereas AGORA contains complete 3D pseudo ground truth: body parameters of each person and their spatial arrangement in the 3D scene, which is enough to support a more comprehensive error measure. The authors therefore drop Procrustes alignment and use a set of metrics (MPJPE and MVE, plus the new NMJE and NMVE) for measuring pose and shape error for multiple people in a single image. See the paper for details.
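
To make the effect of Procrustes alignment concrete, here is a minimal NumPy sketch of MPJPE with and without it. This is not the AGORA evaluation code: the function names are mine, the alignment is a standard SVD-based similarity fit (Kabsch/Umeyama style), and the joints are random toy data. The same functions applied to mesh vertices instead of joints would give MVE and PA-MVE.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error: mean Euclidean distance over joints."""
    return float(np.linalg.norm(pred - gt, axis=1).mean())

def procrustes_align(pred, gt):
    """Similarity-align pred (J, 3) to gt (J, 3): best scale, rotation, translation."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # SVD of the 3x3 cross-covariance between centered prediction and target.
    u, s, vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    fix = np.array([1.0, 1.0, d])
    rot = vt.T @ np.diag(fix) @ u.T
    scale = (s * fix).sum() / (p ** 2).sum()
    return scale * p @ rot.T + mu_g

def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment (scale, rotation, translation removed)."""
    return mpjpe(procrustes_align(pred, gt), gt)

# Toy example: the prediction is the ground truth, but rescaled, rotated and shifted.
rng = np.random.default_rng(0)
gt = rng.normal(size=(24, 3))
angle = np.deg2rad(30.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
pred = 1.2 * gt @ rot_z.T + np.array([0.5, -0.2, 3.0])

print("MPJPE:", mpjpe(pred, gt))        # large: wrong scale, orientation, position
print("PA-MPJPE:", pa_mpjpe(pred, gt))  # essentially zero: the alignment hides all of it
```

The toy prediction is wrong in scale, global orientation and position, yet its PA-MPJPE is essentially zero, which is exactly the kind of error the paper argues should no longer be ignored once full 3D ground truth is available.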

※ Experiments

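Because this is a dataset paper, what the authors need to demonstrate is the value of the dataset. They design two kinds of experiment: one evaluates SOTA 3DHPS methods on AGORA to check whether the dataset exposes the field's current problems (i.e. is the dataset really more challenging?); the other tests whether AGORA can be used as additional training data to improve the performance of a SOTA method. [The angle and the reasoning are worth borrowing.]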

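Baseline Evaluation. The comparison shows that SOTA methods look good under MPJPE and MVE but do poorly under the new NMJE and NMVE metrics; the authors point out that "MPJPE alone is not enough to evaluate performance on multi-person images", which shows that AGORA exposes the false-detection and missed-detection problems in multi-person pose estimation. Next, with the SMPL-X model, the SOTA methods do not rank consistently and none is best on every metric, so the authors run ablation studies on the effect of individual factors: occlusion, child shape, distance to the center of the image and orientation. [Learn how to do this kind of ablation in your own papers.]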
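
A small sketch of why a detection-aware metric changes the ranking. It assumes, as I read the paper, that NMJE and NMVE are MPJPE and MVE divided by the detection F1 score; check the paper and the evaluation code for the exact definition and the rule used to match predictions to people. The function names and all numbers below are hypothetical.

```python
def f1_score(true_pos, false_pos, false_neg):
    """Detection F1 score from counts of matched, spurious and missed people."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)

def normalized_error(mean_error, true_pos, false_pos, false_neg):
    """Error on matched people divided by F1 (assumed form of NMJE / NMVE)."""
    f1 = f1_score(true_pos, false_pos, false_neg)
    if f1 == 0.0:
        return float("inf")  # nothing detected correctly: error is unbounded
    return mean_error / f1

# Hypothetical methods evaluated on images containing 10 people in total.
# Method A: 100 mm MPJPE on matches, 8 matched, 2 false positives, 2 misses.
nmje_a = normalized_error(100.0, true_pos=8, false_pos=2, false_neg=2)
# Method B: 90 mm MPJPE on matches, but only 5 matched and 5 missed.
nmje_b = normalized_error(90.0, true_pos=5, false_pos=0, false_neg=5)

print("method A NMJE:", nmje_a)
print("method B NMJE:", nmje_b)
```

Method B has the lower MPJPE because it is only scored on the people it found, but it misses half of them, so its normalized error ends up higher than method A's.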

Baseline Improvement. The authors compare three models: pretrained SPIN, SPIN fine-tuned with AGORA and EFT ([MPII+LSPet+COCO]), and the paper's own SPIN-ft-based model. They test on two datasets (3DPW and AGORA), and the results show that "Training with AGORA leads to significant improvement in performance on both datasets".

※ Conclusions and Future Work

We have presented AGORA, a new dataset that goes beyond current datasets to include challenging cases of environmental occlusion, person-person occlusion, scale variation, children, crowds, etc. AGORA is challenging and reveals limitations of existing methods. Despite being synthetic, fine-tuning on AGORA improves performance of a SOTA method on the natural 3DPW dataset. We introduce a new metric to include misses and false positives and facilitate analysis of the SOTA methods on images with multiple people. We also introduce a simple child body model and provide better 3D ground truth for images with children. Future work should include adding images of varied camera height, indoor scenes, multi-view images, larger crowds, animals, and movement. [Quoting the original text here as a model of how to write a conclusion.]

3. Novelty

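Although this is a dataset paper, with few formulas and no flashy new algorithm, it addresses a long-standing pain point in 3DHPS: the lack of a realistic multi-person pose estimation dataset with reliable GT. I expect it to spur a wave of better 3DHPS methods. Two things about the paper are also interesting: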

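1) Standing on the shoulders of giants. The hardest part of AGORA is generating and fitting realistic 3D body models, for which SMPL-X is the key. SMPL-X is existing work, and it comes from the authors' own lab, so the giant here is their own institute, the Max Planck Institute for Intelligent Systems. Building AGORA on top of it got twice the result for half the effort.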

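2) Filling a gap. Although the authors repeatedly stress how AGORA fixes the shortcomings of earlier datasets, I think its main contribution is combining two properties, multi-person and in the wild, which is its biggest advantage in the dataset comparison. The comparison also shows that AGORA is more complete on every criterion listed, so it can serve as a reliable and more challenging benchmark.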

4. Summary

Although AGORA is aimed at the 3DHPS problem, its rich GT annotations mean it can also be used for many other human-related downstream CV tasks, including 2D multi-person pose estimation, instance segmentation, hand keypoint detection, face landmark detection, and head pose estimation.

CVPR2021 AGORA: Avatars in Geography Optimized for Regression Analysis
