Abstract
Speech-based depression detection is an active research topic. Personalized information such as personality and speaking style may cause the speech features of individuals with different depression levels to overlap, raising the risk of model misclassification. One strategy is to account explicitly for the effect of personalized information during modeling so that the depression cues it carries can be exploited. Accordingly, we propose the Adaptive Embedding Personalized Information Model (AEPIM), which comprises three modules: the Personalized Information Extraction Module (PIEM), the Depression Information Extraction Module (DIEM), and the Self-Adaptive Fusion Module (SAFM). PIEM employs contrastive learning to extract personalized information from longitudinal data. DIEM and SAFM are then trained jointly to learn more discriminative depression representations. To validate AEPIM's effectiveness, we constructed a longitudinal dataset containing two rounds of data, a type of resource that is rare in this field. Such data are crucial for analyzing personalized information: they support the establishment of accurate relationships between speech features and depression levels and thereby assist individualized depression diagnosis. Experimental results demonstrate that AEPIM outperforms existing methods, reducing RMSE and MAE by at least 13.8% and 13.0% in Round 1, and by 7.5% and 5.2% in Round 2, respectively. Out-of-domain generalization was assessed on two cross-sectional datasets, and the results indicate that AEPIM remains effective on external data. These improvements suggest that AEPIM holds significant potential for practical applications, such as long-term monitoring of depression. The code is available at https://github.com/yuanjq2023-stack/AEPIM.
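To make the idea of contrastive learning on longitudinal data concrete, the following is a minimal sketch and not the authors' implementation. It assumes an InfoNCE-style objective in which embeddings of the same speaker from the two recording rounds form a positive pair and other speakers act as negatives. The abstract does not specify the loss used by PIEM, so the function name `info_nce_loss`, the `temperature` value, and the toy embeddings are all hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce_loss(round1, round2, temperature=0.1):
    """InfoNCE-style loss (assumed here; the paper's actual loss may differ).

    round1[i] and round2[i] are embeddings of the same speaker recorded in two
    rounds (a positive pair); every other round-2 embedding is a negative.
    """
    a = [l2_normalize(v) for v in round1]
    b = [l2_normalize(v) for v in round2]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity of the anchor to every round-2 embedding, scaled by temperature.
        logits = [sum(x * y for x, y in zip(anchor, other)) / temperature for other in b]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        # Cross-entropy with the matching speaker as the correct class.
        total += log_denom - logits[i]
    return total / len(a)


# Toy example: two speakers with 2-D embeddings (hypothetical values).
speakers_round1 = [[1.0, 0.0], [0.0, 1.0]]
speakers_round2 = [[0.9, 0.1], [0.1, 0.9]]
swapped_round2 = [speakers_round2[1], speakers_round2[0]]

loss_aligned = info_nce_loss(speakers_round1, speakers_round2)
loss_swapped = info_nce_loss(speakers_round1, swapped_round2)
```

In this toy setup the loss is lower when each speaker's two rounds are correctly paired than when the pairs are swapped, which is the property a speaker-specific (personalized) embedding would be trained to satisfy under this assumed objective.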
| Original language | English |
|---|---|
| Article number | 115372 |
| Journal | Applied Soft Computing |
| Volume | 199 |
| DOIs | |
| Publication status | Published - Aug 2026 |
| Externally published | Yes |
Keywords
- Attention mechanism
- Contrastive learning
- Depression information
- Personalized information
- Speech
Title: An adaptive embedding personalized speech information model based on contrastive learning for depression level assessment