TY - GEN
T1 - Advancing Metadata-Convolutional Neural Networks with Multi-supervised Contrastive Learning and Metadata Insights for Respiratory Sound Analysis
AU - Liu, Miao
AU - Zhang, Haojie
AU - Qian, Kun
AU - Hu, Bin
AU - Nakamura, Toru
AU - Nomura, Taishin
AU - Zhang, Jian
AU - Tang, Zhangguo
AU - Schuller, Björn W.
AU - Yamamoto, Yoshiharu
AU - Li, Huanzhou
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
N2 - Metadata, as information describing data, encompasses an exhaustive description of various facets of the data. Despite the enhanced accuracy of existing respiratory sound detection methods through multifaceted research, these methods often under-utilise metadata. To explore the potential of metadata, we study its impact on detection performance. We adopt a multi-supervised contrastive learning approach and propose an improved Metadata-Convolutional Neural Network model for more effective extraction of metadata features. We use the International Conference in Biomedical Health Informatics (ICBHI) 2017 database for evaluation and achieve an average score of 59.48% on the official (6:4) split, surpassing current state-of-the-art methods. Moreover, utilising metadata increased the detection rate of respiratory sounds, with gender, a key predictive factor, outperforming other combinations when used with other metadata. Specifically, when combined with age, the average score reached 59.64%.
KW - Metadata
KW - Multi-supervised contrastive learning
KW - Respiratory sound analysis
UR - https://www.scopus.com/pages/publications/105012242352
U2 - 10.1007/978-981-96-4783-5_3
DO - 10.1007/978-981-96-4783-5_3
M3 - Conference contribution
AN - SCOPUS:105012242352
SN - 9789819647828
T3 - Lecture Notes in Electrical Engineering
SP - 25
EP - 36
BT - Proceedings of the 11th Conference on Sound and Music Technology - Revised Selected Papers from CSMT 2024
A2 - Qian, Kun
A2 - Zhou, Li
A2 - Meng, Qinglin
A2 - Gao, Yongwei
PB - Springer Science and Business Media Deutschland GmbH
T2 - 11th National Conference on Sound and Music Technology, CSMT 2024
Y2 - 11 October 2024 through 13 October 2024
ER -