Abstract
In this paper, a flexible deep learning-based framework is proposed that extracts expression and identity information from monocular images and combines the identity and expression extracted from different images to generate new face models. In this framework, two encoders extract expression and identity information, and three decoders visualize that information by generating face models containing only expression, only identity, and fused expression and identity. By aligning the corresponding vertices of facial parts with the same semantics, an error evaluation method between face models with different topologies is proposed, which reflects the error distribution more intuitively. The experimental results show that the proposed framework achieves higher accuracy than blendshape-based face component extraction. The framework can be used for facial expression generation of virtual humans, which is helpful for emotion transmission and language supplementation.
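The recombination idea described above (identity from one image, expression from another, fused by a third decoder) can be sketched with placeholder linear encoders and decoders. All dimensions, weights, and function names here are illustrative assumptions, not the paper's actual networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): image features -> latent codes.
FEAT, ID_DIM, EXP_DIM, VERTS = 128, 16, 16, 300

# Two encoders: linear maps standing in for the paper's learned encoders.
W_id = rng.standard_normal((ID_DIM, FEAT)) * 0.1
W_exp = rng.standard_normal((EXP_DIM, FEAT)) * 0.1

def encode(features):
    """Split one image's features into identity and expression codes."""
    return W_id @ features, W_exp @ features

# Third decoder: generates a face model from fused identity + expression codes
# (the identity-only and expression-only decoders would follow the same pattern).
D_fuse = rng.standard_normal((VERTS * 3, ID_DIM + EXP_DIM)) * 0.1

def decode_fused(z_id, z_exp):
    """Generate a face mesh (VERTS x 3 vertex array) from the fused code."""
    return (D_fuse @ np.concatenate([z_id, z_exp])).reshape(VERTS, 3)

# Recombination: identity from image A, expression from image B.
feat_a, feat_b = rng.standard_normal(FEAT), rng.standard_normal(FEAT)
z_id_a, _ = encode(feat_a)
_, z_exp_b = encode(feat_b)
mesh = decode_fused(z_id_a, z_exp_b)
print(mesh.shape)  # (300, 3)
```

In this sketch, independently swapping the two latent codes is what lets a new face model combine the identity of one subject with the expression of another.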
Original language | English |
---|---|
Pages (from-to) | 609-620 |
Number of pages | 12 |
Journal | Journal of the Society for Information Display |
Volume | 30 |
Issue | 8 |
DOI | |
Publication status | Published - Aug 2022 |