We introduce RMAvatar, a novel human avatar representation with Gaussian splatting embedded on mesh, which learns clothed avatars from monocular video. We use explicit mesh geometry to represent the motion and shape of a virtual human, and implicit appearance rendering with Gaussian splatting. Our method consists of two main modules: a Gaussian initialization module and a Gaussian rectification module. We embed Gaussians into triangular faces and control their motion through the mesh, which ensures the low-frequency motion and surface deformation of the avatar. Due to the limitations of the linear blend skinning (LBS) formulation, the human skeleton can only drive piecewise-rigid transformations, so we design a pose-related Gaussian rectification module to learn the non-rigid deformations of cloth and hair, further improving the realism and expressiveness of the avatar. Extensive experiments on public datasets show that RMAvatar achieves state-of-the-art performance in both rendering quality and quantitative evaluation.
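The mesh-embedded control described above can be illustrated with a minimal sketch: Gaussian centers are parameterized by fixed barycentric coordinates on triangle faces, so deforming the mesh automatically carries the embedded Gaussians along. This is an illustration of the general technique, not the authors' implementation; the sampling scheme, function names, and NumPy formulation below are our own assumptions.

import numpy as np

def init_gaussians_on_faces(faces, n_per_face=1, rng=None):
    # Sample fixed barycentric coordinates, one set per Gaussian per face.
    # (Illustrative; the paper's initialization may differ.)
    rng = np.random.default_rng() if rng is None else rng
    u = rng.random((len(faces), n_per_face, 2))
    flip = u.sum(-1) > 1.0
    u[flip] = 1.0 - u[flip]  # fold samples back into the triangle
    bary = np.stack([1.0 - u.sum(-1), u[..., 0], u[..., 1]], axis=-1)
    return bary              # (F, n_per_face, 3), held fixed after init

def gaussian_centers(vertices, faces, bary):
    # Recompute Gaussian centers from the *deformed* mesh vertices, so the
    # Gaussians follow the low-frequency, mesh-driven motion of the avatar.
    tri = vertices[faces]    # (F, 3, 3) triangle vertex positions
    return np.einsum('fnk,fkd->fnd', bary, tri)

Because the barycentric coordinates stay fixed, posing the mesh (e.g., via skeletal skinning) moves every embedded Gaussian consistently with its supporting triangle, which is the mesh-controlled motion the abstract describes.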
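For context, the standard linear blend skinning formulation referenced above blends per-bone rigid transforms; the notation here (skinning weights $w_{i,k}$, bone transforms $G_k$) is ours, in the style commonly used for SMPL-like body models:

$$
\mathbf{v}_i' \;=\; \sum_{k=1}^{K} w_{i,k}\, G_k(\theta)\, \mathbf{v}_i ,
$$

where $\mathbf{v}_i$ is a template vertex, $w_{i,k}$ its weight for bone $k$, and $G_k(\theta)$ the rigid transform of bone $k$ under pose $\theta$. Since every $G_k(\theta)$ is rigid, the blended motion cannot express pose-dependent non-rigid effects such as the deformation of cloth and hair, which is precisely what the pose-related Gaussian rectification module is designed to learn.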