TY - JOUR
T1 - CAIINET
T2 - Neural network based on contextual attention and information interaction mechanism for depression detection
AU - Zhou, Li
AU - Liu, Zhenyu
AU - Yuan, Xiaoyan
AU - Shangguan, Zixuan
AU - Li, Yutong
AU - Hu, Bin
N1 - Publisher Copyright: © 2023
PY - 2023/6/15
Y1 - 2023/6/15
AB - Depression is a globally widespread psychological disorder that seriously affects patients' physical and mental health. Depression detection methods based on physiological signals are widely used, but physiological signals are not easy to collect. With the rapid development of social media, vlogs posted by users not only reflect their current emotional state but also make early depression detection possible, and the data are easier to obtain. Early depression detection based on social media has therefore become an active research topic. However, because the social data that users publish are large and diverse, effectively extracting critical temporal information and fusing multimodal data remains an open problem. To enable early detection of depression from vlog data, we propose a neural network based on contextual attention and information interaction mechanism (CAIINET). CAIINET is composed of three core modules: a BiLSTM based on contextual attention module (CAM-BiLSTM), a local information fusion module (LIFM), and a global information interaction module (GIIM). The CAM-BiLSTM module captures important acoustic and visual features at critical time points. The LIFM and GIIM modules extract the relevance and interactivity between the extracted acoustic and visual features at local and global scales. Experiments on the D-Vlog dataset show that CAIINET achieves 66.56%, 66.98% and 66.55% for weighted average precision, recall and F1 score, respectively, outperforming the ten benchmark models. The results show that CAIINET has good depression detection capability, and an ablation experiment examines the effectiveness of its three submodules.
KW - BiLSTM based on contextual attention (CAM-BiLSTM)
KW - Depression detection
KW - Global information interaction module (GIIM)
KW - Local information fusion module (LIFM)
KW - Vlog
UR - http://www.scopus.com/inward/record.url?scp=85151564362&partnerID=8YFLogxK
U2 - 10.1016/j.dsp.2023.103986
DO - 10.1016/j.dsp.2023.103986
M3 - Article
AN - SCOPUS:85151564362
SN - 1051-2004
VL - 137
JO - Digital Signal Processing: A Review Journal
JF - Digital Signal Processing: A Review Journal
M1 - 103986
ER -