Joint low rank embedded multiple features learning for audio–visual emotion recognition

Zhan Wang, Lizhi Wang*, Hua Huang

*Corresponding author of this work

Research output: Contribution to journal › Article › peer-review

12 Citations (Scopus)

Abstract

Audio–visual emotion recognition is a challenging problem in the research fields of human–computer interaction and pattern recognition. Seeking a common subspace among the heterogeneous multi-modal data is essential for audio–visual emotion recognition. In this paper, we study subspace learning for audio–visual emotion recognition by combining the intra-modality similarity and the inter-modality correlation. First, we enforce a low-rank constraint on the self-representation of the features in the subspace to exploit the structural similarity within each modality, based on the key observation that each modality and its corresponding features usually lie in a low-dimensional manifold. Second, we propose a joint low-rank model on the inter-modality representation to keep consistency across different modalities. Finally, the intra-modality similarity and inter-modality correlation are integrated into a unified framework, for which we develop an efficient computational algorithm to pursue the common subspace. Experimental results on three typical audio–visual emotion datasets demonstrate the superior performance of our method on audio–visual emotion recognition.
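As an illustrative sketch only (the abstract does not give the paper's exact objective, so the symbols below, including the per-modality feature matrix X_m, the projection P_m, the self-representation matrix Z_m, and the weights \lambda and \gamma, are assumptions), a low-rank self-representation of each modality in the common subspace, coupled by a joint consistency term across modalities, could take a form such as:

\min_{\{P_m\},\{Z_m\}} \sum_{m=1}^{M} \left( \|Z_m\|_{*} + \lambda \, \|P_m X_m - P_m X_m Z_m\|_{F}^{2} \right) + \gamma \sum_{m \neq n} \|Z_m - Z_n\|_{F}^{2}

Here X_m collects the features of modality m, P_m maps them into the common subspace, the nuclear norm \|\cdot\|_{*} promotes a low-rank self-representation capturing the intra-modality manifold structure, and the last term encourages the modality-wise representations to agree, reflecting the inter-modality correlation. The actual model and optimization algorithm may differ; this sketch only concretizes the two ingredients named in the abstract.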

Original language: English
Pages (from-to): 324-333
Number of pages: 10
Journal: Neurocomputing
Volume: 388
DOI
Publication status: Published - 7 May 2020
