Multimodal depression detection using a deep feature fusion network

Guangyao Sun, Shenghui Zhao, Bochao Zou*, Yubo An

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

With the increase of social pressure, more and more people are suffering from depression, which has become one of the most severe health issues worldwide. Timely diagnosis of depression is therefore very important. In this paper, a deep feature fusion network is proposed for multimodal depression detection. First, an unsupervised transformer-based autoencoder is applied to derive sentence-level embeddings from the frame-level audiovisual features; then a deep feature fusion network based on a cross-modal transformer is proposed to fuse the text, audio, and video features. Experimental results show that the proposed method achieves superior performance compared with state-of-the-art methods on the English DAIC-WOZ database.
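The paper itself provides no code; the PyTorch sketch below only illustrates the two components the abstract describes: an unsupervised transformer autoencoder that pools frame-level features into a sentence-level embedding, and a cross-modal attention step that lets one modality attend to another before fusion. All class names, layer sizes, and feature dimensions are illustrative assumptions (e.g. 74-dim audio and 136-dim video frames, roughly matching the COVAREP and facial-landmark features distributed with DAIC-WOZ), not the authors' implementation.

    import torch
    import torch.nn as nn

    class SentenceAutoencoder(nn.Module):
        """Unsupervised transformer autoencoder: frame-level features in, one
        sentence-level embedding out. The reconstruction loss needs no labels."""
        def __init__(self, feat_dim, d_model=128, n_layers=2):
            super().__init__()
            self.proj = nn.Linear(feat_dim, d_model)
            enc = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            self.encoder = nn.TransformerEncoder(enc, n_layers)
            dec = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            self.decoder = nn.TransformerEncoder(dec, n_layers)
            self.out = nn.Linear(d_model, feat_dim)

        def forward(self, frames):               # frames: (B, T, feat_dim)
            h = self.encoder(self.proj(frames))  # (B, T, d_model)
            sentence = h.mean(dim=1)             # (B, d_model) sentence-level embedding
            recon = self.out(self.decoder(h))    # reconstruct frames (autoencoder target)
            return sentence, recon

    class CrossModalFusion(nn.Module):
        """One cross-modal transformer step: the query modality attends to the
        key/value modality, followed by a residual connection and layer norm."""
        def __init__(self, d_model=128, n_heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.norm = nn.LayerNorm(d_model)

        def forward(self, query_mod, key_mod):   # each: (B, 1, d_model)
            fused, _ = self.attn(query_mod, key_mod, key_mod)
            return self.norm(query_mod + fused)

    # Toy usage with assumed dimensions (74-dim audio, 136-dim video frames):
    audio_ae = SentenceAutoencoder(feat_dim=74)
    video_ae = SentenceAutoencoder(feat_dim=136)
    fusion = CrossModalFusion()

    audio_sent, _ = audio_ae(torch.randn(8, 50, 74))   # 8 utterances, 50 frames each
    video_sent, _ = video_ae(torch.randn(8, 50, 136))
    text_sent = torch.randn(8, 128)                    # from any text encoder

    fused = fusion(text_sent.unsqueeze(1), audio_sent.unsqueeze(1)).squeeze(1)
    logits = nn.Linear(128, 2)(fused)                  # depressed vs. non-depressed

In a full pipeline the cross-modal step would be applied pairwise across all three modalities before classification; the single text-to-audio step above is just the smallest runnable instance of the idea.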

Original language: English
Title of host publication: Third International Conference on Computer Science and Communication Technology, ICCSCT 2022
Editors: Yingfa Lu, Changbo Cheng
Publisher: SPIE
ISBN (Electronic): 9781510661240
Publication status: Published - 2022
Event: 3rd International Conference on Computer Science and Communication Technology, ICCSCT 2022 - Beijing, China
Duration: 30 Jul 2022 - 31 Jul 2022

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 12506
ISSN (Print): 0277-786X
ISSN (Electronic): 1996-756X

Conference

Conference: 3rd International Conference on Computer Science and Communication Technology, ICCSCT 2022
Country/Territory: China
City: Beijing
Period: 30/07/22 - 31/07/22

Keywords

  • Depression detection
  • multimodal feature fusion
  • transformer
  • unsupervised learning
