Speech-based Depression Detection Using Unsupervised Autoencoder

Guangyao Sun, Shenghui Zhao, Bochao Zou*, Yubo An

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Citations (Scopus)

Abstract

More than three hundred million people worldwide suffer from depression, which has become one of the most serious health problems in the world. Detecting depression is essential for its timely treatment. In this paper, a speech-based depression detection method using an unsupervised autoencoder is proposed. Most previous methods encode frame-level speech features into sentence-level features with statistical functions, which leads to the loss of temporal information between frames. To solve this, we propose an unsupervised network based on the transformer. The unsupervised network is used to obtain the audio embedding vector of an audio segment from depressed or non-depressed people. The audio embedding vector is then used for depression detection. The experimental results show that the proposed method achieves superior performance on both the English database DAIC and our self-built Chinese database CMDC.
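The limitation of statistical functions noted in the abstract can be illustrated with a minimal sketch. It is not the authors' method or code: the function name `statistical_pooling` and the toy feature values are assumptions made for illustration only. It shows that pooling frame-level features with a mean and standard deviation gives the same sentence-level vector for two sequences whose frames appear in opposite order, so the temporal order is not recoverable from the pooled features.

```python
from statistics import mean, pstdev


def statistical_pooling(frames):
    """Collapse frame-level features (a list of per-frame vectors) into one
    sentence-level vector: per-dimension means followed by per-dimension
    population standard deviations. Hypothetical helper, for illustration."""
    dims = list(zip(*frames))  # transpose: one tuple per feature dimension
    return [mean(d) for d in dims] + [pstdev(d) for d in dims]


# Hypothetical toy data: 4 frames with 2 features each (illustrative values).
frames = [[0.1, 1.0], [0.4, 0.8], [0.9, 0.2], [0.3, 0.5]]
reversed_frames = frames[::-1]  # same frames, opposite temporal order

pooled = statistical_pooling(frames)
pooled_reversed = statistical_pooling(reversed_frames)

# Compare the two pooled vectors within a small floating-point tolerance.
same_summary = all(abs(a - b) < 1e-12 for a, b in zip(pooled, pooled_reversed))
print(same_summary)
```

A sequence model such as the transformer-based network described in the abstract processes frames in order, so it can in principle retain the ordering that this kind of pooling discards.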

Original language: English
Title of host publication: 2022 7th International Conference on Signal and Image Processing, ICSIP 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 35-38
Number of pages: 4
ISBN (Electronic): 9781665495639
Publication status: Published - 2022
Event: 7th International Conference on Signal and Image Processing, ICSIP 2022 - Suzhou, China
Duration: 20 Jul 2022 to 22 Jul 2022

Publication series

Name: 2022 7th International Conference on Signal and Image Processing, ICSIP 2022

Conference

Conference: 7th International Conference on Signal and Image Processing, ICSIP 2022
Country/Territory: China
City: Suzhou
Period: 20/07/22 to 22/07/22

Keywords

  • Chinese depression database
  • depression detection
  • transformer
  • unsupervised learning
