Somatisation Disorder Detection via Speech: Introducing a Self-Supervised Learning Model

Zhihao Bao; Kun Qian; Zhonghao Zhao; Mengkai Sun; Ruolan Huang; Dewen Xu; Bin Hu; Yoshiharu Yamamoto; Bjorn W. Schuller

doi:10.1109/EMBC40787.2023.10340705

Zhihao Bao^*, Kun Qian^*, Zhonghao Zhao, Mengkai Sun, Ruolan Huang^*, Dewen Xu, Bin Hu, Yoshiharu Yamamoto, Bjorn W. Schuller

^*Corresponding author for this work

School of Medical and Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

As depressive psychiatric disorders become more common, people are gradually starting to take them seriously. Somatisation disorder (SD), as a general mental disorder, is rarely identified accurately in clinical diagnosis because of its specific nature. In previous work, speech recognition technology was successfully applied to the task of identifying somatisation disorders on the Shenzhen Somatisation Speech Corpus. Nevertheless, labels remain scarce in somatisation disorder speech databases, and current mainstream approaches to speech recognition rely heavily on well-labelled data. Compared to supervised learning, self-supervised learning can achieve the same or even better recognition results while reducing the reliance on labelled samples. Moreover, self-supervised learning can generate general representations without the need for hand-crafted features tailored to each recognition task. To this end, we apply pre-trained self-supervised learning models to somatisation disorder speech recognition with few labels. In this study, we compare and analyse the results of three self-supervised learning models: contrastive predictive coding (CPC), wav2vec and wav2vec 2.0. The best wav2vec 2.0 model achieves an unweighted average recall of 77.0 %, which is significantly better than CPC (p < .005) and exceeds the benchmark of the supervised learning model.

Clinical relevance: This work proposes a self-supervised learning model to address the scarcity of labelled SD speech data, which can help psychiatrists by providing clinical assistance in diagnosis. With this model, psychiatrists no longer need to spend a lot of time labelling SD speech data.
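The abstract reports its headline result as unweighted average recall (UAR), the mean of the per-class recalls, in which every class counts equally regardless of how many samples it has. The following is a minimal sketch of that metric and not the authors' evaluation code; the function name and the label lists are hypothetical, chosen only to illustrate why UAR is informative when classes are imbalanced.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced labels: 1 = somatisation disorder, 0 = control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

uar = unweighted_average_recall(y_true, y_pred)

# A classifier that always predicts the majority class is right on 8 of 12
# samples, yet its UAR is only chance level for two classes.
majority_uar = unweighted_average_recall(y_true, [0] * len(y_true))

print(f"UAR: {uar:.4f}, majority-class UAR: {majority_uar:.4f}")
```

In this toy case the per-class recalls are 3/4 and 7/8, so the UAR is their mean, while the majority-class predictor scores 0.5 despite being correct on two thirds of the samples.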

Original language	English
Title of host publication	2023 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023 - Proceedings
Publisher	Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)	9798350324471
DOIs	https://doi.org/10.1109/EMBC40787.2023.10340705
Publication status	Published - 2023
Event	45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023 - Sydney, Australia Duration: 24 Jul 2023 → 27 Jul 2023

Publication series

Name	Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS
ISSN (Print)	1557-170X

Conference

Conference	45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023
Country/Territory	Australia
City	Sydney
Period	24/07/23 → 27/07/23

Cite this

Bao, Z., Qian, K., Zhao, Z., Sun, M., Huang, R., Xu, D., Hu, B., Yamamoto, Y., & Schuller, B. W. (2023). Somatisation Disorder Detection via Speech: Introducing a Self-Supervised Learning Model. In 2023 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023 - Proceedings (Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/EMBC40787.2023.10340705

@inproceedings{7615d45ea00f4de186d443160fa4703d,

title = "Somatisation Disorder Detection via Speech: Introducing a Self-Supervised Learning Model",

author = "Zhihao Bao and Kun Qian and Zhonghao Zhao and Mengkai Sun and Ruolan Huang and Dewen Xu and Bin Hu and Yoshiharu Yamamoto and Schuller, {Bjorn W.}",

note = "Publisher Copyright: {\textcopyright} 2023 IEEE.; 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023 ; Conference date: 24-07-2023 Through 27-07-2023",

year = "2023",

doi = "10.1109/EMBC40787.2023.10340705",

language = "English",

series = "Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

booktitle = "2023 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023 - Proceedings",

address = "United States",

}

Bao, Z, Qian, K, Zhao, Z, Sun, M, Huang, R, Xu, D, Hu, B, Yamamoto, Y & Schuller, BW 2023, Somatisation Disorder Detection via Speech: Introducing a Self-Supervised Learning Model. in 2023 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023 - Proceedings. Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, Institute of Electrical and Electronics Engineers Inc., 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023, Sydney, Australia, 24/07/23. https://doi.org/10.1109/EMBC40787.2023.10340705

TY - GEN

T1 - Somatisation Disorder Detection via Speech

T2 - 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023

AU - Bao, Zhihao

AU - Qian, Kun

AU - Zhao, Zhonghao

AU - Sun, Mengkai

AU - Huang, Ruolan

AU - Xu, Dewen

AU - Hu, Bin

AU - Yamamoto, Yoshiharu

AU - Schuller, Bjorn W.

PY - 2023

Y1 - 2023

UR - http://www.scopus.com/inward/record.url?scp=85179645198&partnerID=8YFLogxK

U2 - 10.1109/EMBC40787.2023.10340705

DO - 10.1109/EMBC40787.2023.10340705

M3 - Conference contribution

C2 - 38082647

AN - SCOPUS:85179645198

T3 - Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS

BT - 2023 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 24 July 2023 through 27 July 2023

ER -

Bao Z, Qian K, Zhao Z, Sun M, Huang R, Xu D et al. Somatisation Disorder Detection via Speech: Introducing a Self-Supervised Learning Model. In 2023 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2023. (Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS). doi: 10.1109/EMBC40787.2023.10340705
