TY - JOUR
T1 - Daily Mental Health Monitoring from Speech
T2 - 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
AU - Song, Meishu
AU - Triantafyllopoulos, Andreas
AU - Yang, Zijiang
AU - Takeuchi, Hiroki
AU - Nakamura, Toru
AU - Kishi, Akifumi
AU - Ishizawa, Tetsuro
AU - Yoshiuchi, Kazuhiro
AU - Jing, Xin
AU - Karas, Vincent
AU - Zhao, Zhonghao
AU - Qian, Kun
AU - Hu, Bin
AU - Schuller, Bjorn W.
AU - Yamamoto, Yoshiharu
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Translating mental health recognition from clinical research into real-world application requires extensive data, yet existing emotion datasets are impoverished in terms of daily mental health monitoring, especially when aiming for self-reported anxiety and depression recognition. We introduce the Japanese Daily Speech Dataset (JDSD), a large in-the-wild daily speech emotion dataset consisting of 20,827 speech samples from 342 speakers and 54 hours of total duration. The data is annotated on the Depression and Anxiety Mood Scale (DAMS), which uses 9 self-reported emotions to evaluate mood state: "vigorous", "gloomy", "concerned", "happy", "unpleasant", "anxious", "cheerful", "depressed", and "worried". Our dataset is diverse in emotional state, activity, and time, making it useful for training models to track daily emotional states for healthcare purposes. We partition our corpus and provide a multi-task benchmark across nine emotions, demonstrating that mental health states can be predicted reliably from self-reports with a Concordance Correlation Coefficient value of 0.547 on average. We hope that JDSD will become a valuable resource to further the development of daily emotional healthcare tracking.
AB - Translating mental health recognition from clinical research into real-world application requires extensive data, yet existing emotion datasets are impoverished in terms of daily mental health monitoring, especially when aiming for self-reported anxiety and depression recognition. We introduce the Japanese Daily Speech Dataset (JDSD), a large in-the-wild daily speech emotion dataset consisting of 20,827 speech samples from 342 speakers and 54 hours of total duration. The data is annotated on the Depression and Anxiety Mood Scale (DAMS), which uses 9 self-reported emotions to evaluate mood state: "vigorous", "gloomy", "concerned", "happy", "unpleasant", "anxious", "cheerful", "depressed", and "worried". Our dataset is diverse in emotional state, activity, and time, making it useful for training models to track daily emotional states for healthcare purposes. We partition our corpus and provide a multi-task benchmark across nine emotions, demonstrating that mental health states can be predicted reliably from self-reports with a Concordance Correlation Coefficient value of 0.547 on average. We hope that JDSD will become a valuable resource to further the development of daily emotional healthcare tracking.
KW - Daily Speech
KW - Mental Health
KW - Multitask Learning
KW - Speech Emotion Recognition
UR - http://www.scopus.com/inward/record.url?scp=85176230553&partnerID=8YFLogxK
U2 - 10.1109/ICASSP49357.2023.10096884
DO - 10.1109/ICASSP49357.2023.10096884
M3 - Conference article
AN - SCOPUS:85176230553
SN - 0736-7791
JO - Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
JF - Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
Y2 - 4 June 2023 through 10 June 2023
ER -