An Automatic Depression Detection Method with Cross-Modal Fusion Network and Multi-head Attention Mechanism

Yutong Li, Juan Wang, Zhenyu Liu*, Li Zhou, Haibo Zhang, Cheng Tang, Xiping Hu, Bin Hu

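*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract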

Audio-visual multimodal depression detection has gained significant attention as an efficient and convenient computer-aided detection tool, and it has shown promising performance. In this paper, we propose a cross-modal fusion network based on multi-head attention and residual structures (CMAFN) for depression recognition. CMAFN consists of three core modules: the Local Temporal Feature Extract Block (LTF), the Cross-Model Fusion Block (CFB), and the Multi-Head Temporal Attention Block (MTB). The LTF module extracts features and encodes temporal information for the audio and video modalities separately, while the CFB module facilitates complementary learning between the modalities. The MTB module accounts for the temporal influence of all modalities on each unimodal branch. Together, the three modules let CMAFN refine inter-modality complementarity and intra-modality temporal dependencies, enabling interaction between the unimodal branches and an adaptive balance between modalities. Evaluation results on two widely used depression datasets, AVEC2013 and AVEC2014, show that the proposed CMAFN method outperforms state-of-the-art approaches on depression recognition tasks. The results highlight the potential of CMAFN as an effective tool for the early detection and diagnosis of depression.
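
The abstract does not give the internals of the fusion block, so the following is only a rough illustration of the general idea it names: multi-head attention across two modalities, combined with a residual connection. It is a minimal NumPy sketch, not the authors' CMAFN implementation. The function names (`multi_head_cross_attention`, `cross_modal_fusion`), the feature dimension, the number of heads, and the random projection matrices standing in for learned weights are all assumptions made for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_cross_attention(query_seq, context_seq, num_heads, rng):
    """Attend from one modality (queries) to another (keys and values).

    query_seq:   (T_q, d) features of the branch being updated.
    context_seq: (T_c, d) features of the other modality.
    Returns the attended features (T_q, d) and the weights (H, T_q, T_c).
    """
    t_q, d = query_seq.shape
    assert d % num_heads == 0, "feature size must be divisible by num_heads"
    d_head = d // num_heads

    # Hypothetical stand-ins for learned projection matrices.
    w_q, w_k, w_v, w_o = (
        rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4)
    )

    def split_heads(x):
        # (T, d) -> (H, T, d_head)
        return x.reshape(-1, num_heads, d_head).transpose(1, 0, 2)

    q = split_heads(query_seq @ w_q)
    k = split_heads(context_seq @ w_k)
    v = split_heads(context_seq @ w_v)

    # Scaled dot-product attention, computed per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (H, T_q, T_c)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                   # (H, T_q, d_head)

    # Merge the heads back to (T_q, d) and apply the output projection.
    merged = heads.transpose(1, 0, 2).reshape(t_q, d)
    return merged @ w_o, weights


def cross_modal_fusion(audio, video, num_heads=4, seed=0):
    """Each branch attends to the other; a residual keeps its own features."""
    rng = np.random.default_rng(seed)
    audio_attn, audio_w = multi_head_cross_attention(audio, video, num_heads, rng)
    video_attn, video_w = multi_head_cross_attention(video, audio, num_heads, rng)
    return audio + audio_attn, video + video_attn, audio_w, video_w
```

With an audio sequence of shape `(10, 16)` and a video sequence of shape `(12, 16)`, `cross_modal_fusion` returns fused sequences of the same shapes, so the two branches may have different lengths as long as they share a feature size. The residual sum preserves each branch's own features even when the cross-modal attention contributes little.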

Original language: English
Title of host publication: Pattern Recognition and Computer Vision - 6th Chinese Conference, PRCV 2023, Proceedings
Editors: Qingshan Liu, Hanzi Wang, Rongrong Ji, Zhanyu Ma, Weishi Zheng, Hongbin Zha, Xilin Chen, Liang Wang
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 252-264
Number of pages: 13
ISBN (Print): 9789819984688
DOI
Publication status: Published - 2024
Externally published
Event: 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023 - Xiamen, China
Duration: 13 Oct 2023 to 15 Oct 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14429 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023
Country/Territory: China
City: Xiamen
Period: 13/10/23 to 15/10/23
