
An adaptive embedding personalized speech information model based on contrastive learning for depression level assessment

  • Zhenyu Liu
  • Jiaqian Yuan
  • Yang Wu
  • Bailin Chen
  • Luyue Yang
  • Hanshu Cai
  • Jiahui Deng
  • Lin Liu
  • Yimiao Zhao
  • Huan Mei
  • Yanping Bao*
  • Bin Hu*
  • *Corresponding author for this work
  • Lanzhou University
  • Peking University

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Speech-based depression detection has become a hot research topic. Personalized information such as personality and speaking style may cause overlap of speech features among individuals with varying depression levels, raising the risk of model misclassification. A potential strategy is to specifically account for the effect of personalized information during modeling to leverage its intrinsic depression cues. Accordingly, we proposed the Adaptive Embedding Personalized Information Model (AEPIM), which comprises three modules: the Personalized Information Extraction Module (PIEM), the Depression Information Extraction Module (DIEM), and the Self-Adaptive Fusion Module (SAFM). PIEM employs contrastive learning to extract personalized information from longitudinal data. DIEM and SAFM are then trained jointly to learn more discriminative depression representations. To validate AEPIM's effectiveness, we constructed a longitudinal dataset containing two rounds of data, which is rare in this field. Such data are crucial for analyzing personalized information, supporting the establishment of accurate relationships between speech features and depression levels, thereby assisting in individualized depression diagnosis. Experimental results demonstrate that AEPIM outperforms existing methods, reducing RMSE and MAE by at least 13.8% and 13.0% in Round 1, and by 7.5% and 5.2% in Round 2, respectively. Out-of-domain generalization was assessed on two cross-sectional datasets, indicating its effectiveness on external data. These improvements suggest that AEPIM holds significant potential for practical applications, such as long-term monitoring of depression. The code is available at https://github.com/yuanjq2023-stack/AEPIM.
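
The abstract reports gains as relative reductions in RMSE and MAE. As a minimal sketch of how such figures are typically computed (not the authors' evaluation code), the Python below defines both metrics and the percentage reduction between a baseline and a proposed model. The function names and all scores are hypothetical placeholders, not data from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


def relative_reduction(baseline_error, proposed_error):
    """Percentage by which the proposed error is lower than the baseline."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error


# Hypothetical depression-scale scores, for illustration only.
y_true = [4.0, 10.0, 15.0, 7.0]
baseline_pred = [6.0, 8.0, 12.0, 9.0]   # assumed output of a baseline model
proposed_pred = [5.0, 9.0, 14.0, 8.0]   # assumed output of a proposed model

rmse_gain = relative_reduction(rmse(y_true, baseline_pred),
                               rmse(y_true, proposed_pred))
mae_gain = relative_reduction(mae(y_true, baseline_pred),
                              mae(y_true, proposed_pred))
print(f"RMSE reduced by {rmse_gain:.1f}%, MAE reduced by {mae_gain:.1f}%")
```

RMSE penalizes large errors more heavily than MAE does, so the two reductions need not match; reporting both, as the abstract does, shows whether a gain comes from fewer large misses or from uniformly smaller errors.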

Original language: English
Article number: 115372
Journal: Applied Soft Computing
Volume: 199
DOI
Publication status: Published - Aug 2026
Externally published
