2-level hierarchical depression recognition method based on task-stimulated and integrated speech features

Yujuan Xing; Zhenyu Liu; Gang Li; Zhi Jie Ding; Bin Hu

doi:10.1016/j.bspc.2021.103287

Yujuan Xing, Zhenyu Liu*, Gang Li, Zhi Jie Ding, Bin Hu

* Corresponding author for this work

Research output: Contribution to journal › Article › Peer-reviewed

8 citations (Scopus)

Abstract

Depression has received growing attention from researchers because of its high prevalence, recurrence, disability and mortality. Speech-based depression recognition has become a research hotspot because it is non-invasive and the data are easy to obtain. However, speech variation under different emotional stimuli, gender effects, speaker and channel variation, and the variable length of frame-level features can strongly affect recognition performance. To address these problems, this paper proposes a novel 2-level hierarchical depression recognition method consisting of two stages. In the 1st-level classification stage, i-vectors are extracted separately from the spectral features, prosodic features, formants and voice quality of speech segments recorded under different task stimuli. Support vector machine (SVM) and random forest (RF) classifiers are then used to obtain primary results. In the 2nd-level classification stage, the results of tasks with significant accuracy differences are aggregated into new integrated features, and the final result is obtained from these features by an SVM. The experiments were based on the depression speech database of the Gansu Provincial Key Laboratory of Wearable Computing. The proposed method achieved good results in both gender-independent and gender-dependent experiments. Compared with the baseline method and bagging classification, its highest accuracy in gender-independent experiments was higher by 9.62% and 9.49%, respectively, and the F1 score also improved markedly. The results also showed that the method is more robust to gender effects.
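
The two-stage structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic stand-ins for per-task i-vectors, a nearest-centroid classifier is swapped in for the SVM and RF classifiers so the example needs only the Python standard library, and the step that selects tasks with significant accuracy differences is omitted. All names (`CentroidClassifier`, `two_level_predict`, `shifts`) are hypothetical.

```python
import random
from statistics import mean


class CentroidClassifier:
    """Nearest-centroid stand-in for the SVM / RF classifiers named in the abstract."""

    def fit(self, X, y):
        # One centroid per class label (labels 0 and 1 are assumed to be present).
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def score(self, x):
        # Signed score: positive when x is closer to the class-1 centroid.
        d0 = sum((a - b) ** 2 for a, b in zip(x, self.centroids[0]))
        d1 = sum((a - b) ** 2 for a, b in zip(x, self.centroids[1]))
        return d0 - d1

    def predict(self, X):
        return [1 if self.score(x) > 0 else 0 for x in X]


def two_level_predict(train_tasks, y_train, test_tasks):
    """train_tasks / test_tasks: one list of subject feature vectors per task."""
    # Level 1: one classifier per task stimulus.
    level1 = [CentroidClassifier().fit(X, y_train) for X in train_tasks]

    def integrate(tasks):
        # Integrated feature vector per subject: one level-1 score per task.
        return [
            [clf.score(x) for clf, x in zip(level1, subject)]
            for subject in zip(*tasks)
        ]

    # Level 2: a classifier trained on the integrated features.
    # Simplification: level-1 scores on the training data are reused here;
    # a careful setup would use held-out (cross-validated) level-1 outputs.
    level2 = CentroidClassifier().fit(integrate(train_tasks), y_train)
    return level2.predict(integrate(test_tasks))


def make_data(n, dim, shifts, rng):
    # Synthetic data: class-1 features are shifted by a per-task amount.
    y = [i % 2 for i in range(n)]
    tasks = [
        [[rng.gauss(label * s, 1.0) for _ in range(dim)] for label in y]
        for s in shifts
    ]
    return tasks, y


rng = random.Random(0)
shifts = [1.0, 0.5, 1.5]  # hypothetical per-task class separability
train_tasks, y_train = make_data(40, 5, shifts, rng)
test_tasks, y_test = make_data(20, 5, shifts, rng)
pred = two_level_predict(train_tasks, y_train, test_tasks)
accuracy = sum(p == t for p, t in zip(pred, y_test)) / len(y_test)
print(f"2-level accuracy on synthetic data: {accuracy:.2f}")
```

The point of the sketch is the data flow: each task gets its own first-level classifier, and the second-level classifier sees only one score per task for each subject, so its input has a fixed length regardless of how many frames each recording contains.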

Original language	English
Article number	103287
Journal	Biomedical Signal Processing and Control
Volume	72
DOI	https://doi.org/10.1016/j.bspc.2021.103287
Publication status	Published - Feb 2022
Externally published	Yes

Cite this

@article{7e50dc061b1743ce8d8d1f2afba9d7f8,

title = "2-level hierarchical depression recognition method based on task-stimulated and integrated speech features",

abstract = "Depression had been paid more and more attention by researchers because of its high prevalence, recurrence, disability and mortality. Speech depression recognition had become a research hotspot due to its advantages of non-invasiveness and easy access to data. However, the problems such as the speech variation in different emotional stimulus, gender impact, the speaker and channel variation and the variable length of frame feature, would have a great impact on recognition performance. In order to solve these problems, a novel 2-level hierarchical depression recognition method was proposed in this paper. It contained two stages. In 1st-level classification stage, i-vectors were extracted based on spectral features, prosodic features, formants and voice quality of speech segments in different task stimulus respectively. Then, support vector machine (SVM) and random forest (RF) were used to obtain primary results. In the stage of 2nd-level classification, the results of tasks with significant accuracy differences were aggregated into new integrated features. The final result was achieved on new features by SVM. Our experiments were based on the depression speech database of the Gansu Provincial Key Laboratory of Wearable Computing. The experimental results showed that the proposed method had achieved good results in both gender-independent and gender-dependent experiments. Compared with baseline method and bagging classification, the highest accuracy of our method was raised by 9.62% and 9.49% respectively in gender-independent experiments, and F1 score also got improvement obviously. The results also showed that our method had better robustness on gender effect.",

keywords = "Bagging classification, Depression recognition, Hierarchical classification, I-vector, Speech task stimulus",

author = "Yujuan Xing and Zhenyu Liu and Gang Li and Ding, {Zhi Jie} and Bin Hu",

note = "Publisher Copyright: {\textcopyright} 2021",

year = "2022",

month = feb,

doi = "10.1016/j.bspc.2021.103287",

language = "English",

volume = "72",

journal = "Biomedical Signal Processing and Control",

issn = "1746-8094",

publisher = "Elsevier Ltd.",

}

TY - JOUR

T1 - 2-level hierarchical depression recognition method based on task-stimulated and integrated speech features

AU - Xing, Yujuan

AU - Liu, Zhenyu

AU - Li, Gang

AU - Ding, Zhi Jie

AU - Hu, Bin

PY - 2022/2

Y1 - 2022/2

AB - Depression had been paid more and more attention by researchers because of its high prevalence, recurrence, disability and mortality. Speech depression recognition had become a research hotspot due to its advantages of non-invasiveness and easy access to data. However, the problems such as the speech variation in different emotional stimulus, gender impact, the speaker and channel variation and the variable length of frame feature, would have a great impact on recognition performance. In order to solve these problems, a novel 2-level hierarchical depression recognition method was proposed in this paper. It contained two stages. In 1st-level classification stage, i-vectors were extracted based on spectral features, prosodic features, formants and voice quality of speech segments in different task stimulus respectively. Then, support vector machine (SVM) and random forest (RF) were used to obtain primary results. In the stage of 2nd-level classification, the results of tasks with significant accuracy differences were aggregated into new integrated features. The final result was achieved on new features by SVM. Our experiments were based on the depression speech database of the Gansu Provincial Key Laboratory of Wearable Computing. The experimental results showed that the proposed method had achieved good results in both gender-independent and gender-dependent experiments. Compared with baseline method and bagging classification, the highest accuracy of our method was raised by 9.62% and 9.49% respectively in gender-independent experiments, and F1 score also got improvement obviously. The results also showed that our method had better robustness on gender effect.

KW - Bagging classification

KW - Depression recognition

KW - Hierarchical classification

KW - I-vector

KW - Speech task stimulus

UR - http://www.scopus.com/inward/record.url?scp=85118557269&partnerID=8YFLogxK

U2 - 10.1016/j.bspc.2021.103287

DO - 10.1016/j.bspc.2021.103287

M3 - Article

AN - SCOPUS:85118557269

SN - 1746-8094

VL - 72

JO - Biomedical Signal Processing and Control

JF - Biomedical Signal Processing and Control

M1 - 103287

ER -
