TY - JOUR
T1 - LLM-based semantic integration of stimulus–response pairs for depression detection in interview scenarios
AU - Liu, Zhenyu
AU - Chen, Jiahang
AU - Chen, Bo
AU - Zhao, Bohua
AU - Zhang, Haibo
AU - Li, Gang
AU - Ding, Zhijie
AU - Hu, Bin
N1 - Publisher Copyright: © 2025 Elsevier Ltd. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
PY - 2026/3
Y1 - 2026/3
AB - Depression is becoming increasingly prevalent with the accelerating pace of life and rising psychological pressures. While text-based depression detection has gained attention, most existing approaches focus solely on participants’ responses, overlooking the semantic role of the prompting questions, an essential component in clinical assessments. Although some studies have attempted to incorporate stimulus information, they often lack targeted design and contextual depth, limiting their effectiveness. To address these limitations, we propose a prompt-based explicit fusion strategy that leverages large language models to accurately integrate the semantics of both stimuli and responses, thereby reducing context comprehension bias. Furthermore, we introduce the FMAN framework, which combines multi-scale semantic features from diverse pre-trained models with a dynamic focused attention module for implicit fusion. We evaluate our method on three datasets: MIDD (536 Chinese participants), DAIC* (189 English participants), and CMDC (78 Chinese participants). Our approach achieves accuracies of 87.31%, 80.43%, and 98.89% on these datasets, respectively, surpassing mainstream models. These results demonstrate the effectiveness of our framework and offer new insights into NLP-based depression detection in interview scenarios.
KW - Depression detection
KW - Large language models
KW - Natural language processing
KW - Semantic integration
KW - Stimulus-response
UR - https://www.scopus.com/pages/publications/105023820229
U2 - 10.1016/j.eswa.2025.130112
DO - 10.1016/j.eswa.2025.130112
M3 - Article
AN - SCOPUS:105023820229
SN - 0957-4174
VL - 300
JO - Expert Systems with Applications
JF - Expert Systems with Applications
ER -