An adaptive embedding personalized speech information model based on contrastive learning for depression level assessment

Zhenyu Liu, Jiaqian Yuan, Yang Wu, Bailin Chen, Luyue Yang, Hanshu Cai, Jiahui Deng, Lin Liu, Yimiao Zhao, Huan Mei, Yanping Bao*, Bin Hu*
*Corresponding author for this work

  • Lanzhou University
  • Peking University

Research output: Contribution to journal › Article › peer-review

Abstract

Speech-based depression detection has become a hot research topic. Personalized information such as personality and speaking style may cause overlap of speech features among individuals with varying depression levels, raising the risk of model misclassification. A potential strategy is to specifically account for the effect of personalized information during modeling to leverage its intrinsic depression cues. Accordingly, we proposed the Adaptive Embedding Personalized Information Model (AEPIM), which comprises three modules: the Personalized Information Extraction Module (PIEM), the Depression Information Extraction Module (DIEM), and the Self-Adaptive Fusion Module (SAFM). PIEM employs contrastive learning to extract personalized information from longitudinal data. DIEM and SAFM are then trained jointly to learn more discriminative depression representations. To validate AEPIM's effectiveness, we constructed a longitudinal dataset containing two rounds of data, which is rare in this field. Such data are crucial for analyzing personalized information, supporting the establishment of accurate relationships between speech features and depression levels, thereby assisting in individualized depression diagnosis. Experimental results demonstrate that AEPIM outperforms existing methods, reducing RMSE and MAE by at least 13.8% and 13.0% in Round 1, and by 7.5% and 5.2% in Round 2, respectively. Out-of-domain generalization was assessed on two cross-sectional datasets, indicating its effectiveness on external data. These improvements suggest that AEPIM holds significant potential for practical applications, such as long-term monitoring of depression. The code is available at https://github.com/yuanjq2023-stack/AEPIM.
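The abstract reports improvements as percentage reductions in RMSE and MAE. As a minimal sketch of how such figures are typically computed, the snippet below implements the standard definitions of both metrics and a relative-reduction helper. It assumes the reported percentages are relative reductions against a baseline method, which the abstract does not state explicitly. The function names and all score values are hypothetical and are not taken from the paper or its repository.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline, proposed):
    """Percentage by which the proposed error is lower than the baseline error."""
    return 100.0 * (baseline - proposed) / baseline


# Made-up depression-scale scores for illustration only (not data from the paper).
y_true = [4.0, 10.0, 16.0, 7.0]
baseline_pred = [8.0, 6.0, 12.0, 11.0]   # hypothetical baseline model output
proposed_pred = [6.0, 8.0, 14.0, 9.0]    # hypothetical improved model output

rmse_gain = relative_reduction(rmse(y_true, baseline_pred), rmse(y_true, proposed_pred))
mae_gain = relative_reduction(mae(y_true, baseline_pred), mae(y_true, proposed_pred))
print(f"RMSE reduced by {rmse_gain:.1f}%, MAE reduced by {mae_gain:.1f}%")
```

RMSE penalizes large errors more heavily than MAE, so reporting both gives a fuller picture of how a regression model's errors are distributed.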

Original language: English
Article number: 115372
Journal: Applied Soft Computing
Volume: 199
Publication status: Published - Aug 2026
Externally published: Yes

Keywords

  • Attention mechanism
  • Contrastive learning
  • Depression information
  • Personalized information
  • Speech
