Exploring interpretable representations for heart sound abnormality detection

Zhihua Wang; Kun Qian; Houguang Liu; Bin Hu; Björn W. Schuller; Yoshiharu Yamamoto

doi:10.1016/j.bspc.2023.104569

Corresponding author: Kun Qian

School of Medical and Technology

Research output: Contribution to journal › Article › peer-review

24 Citations (Scopus)

Abstract

Computer audition-based heart sound abnormality detection is non-invasive, real-time, and convenient, and has attracted increasing effort from the cardiovascular disease community. Time–frequency analyses are crucial for computer audition-based applications. However, a comprehensive investigation into an optimised way of extracting time–frequency representations from heart sounds has been lacking. To this end, we present a comprehensive investigation of time–frequency methods for analysing heart sounds, i.e., short-time Fourier transformation, Log-Mel transformation, Hilbert–Huang transformation, wavelet transformation, Mel transformation, and Stockwell transformation. The time–frequency representations are automatically learnt via pre-trained deep convolutional neural networks. Considering the urgent need of smart stethoscopes for highly robust detection algorithms in real environments, the training, verification, and testing sets employed in the extensive evaluation are subject-independent. Finally, to further understand the heart sound-based digital phenotype for cardiovascular diseases, explainable artificial intelligence approaches are used to reveal the reasons for the performance differences of four time–frequency representations in heart sound abnormality detection. Experimental results show that Stockwell transformation outperforms the other methods, reaching the highest overall score of 65.2%. The interpretable results demonstrate that Stockwell transformation not only presents more information about heart sounds but also provides a degree of noise robustness. In addition, the considered fine-tuned deep model improves the mean accuracy over the previous state-of-the-art results by 9.0% in subject-independent testing.
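As a rough illustration of what a time–frequency representation is, the sketch below computes a short-time Fourier magnitude spectrogram in plain Python. It is not the authors' pipeline: the function name `stft_magnitude`, the frame length, the hop size, the 1000 Hz sampling rate, and the synthetic 125 Hz test tone are all assumptions chosen for illustration only.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return one magnitude spectrum per Hann-windowed frame (naive DFT)."""
    # Periodic Hann window to reduce spectral leakage at frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(n_bins):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Hypothetical test signal: a pure 125 Hz tone sampled at 1000 Hz.
sample_rate = 1000
tone_hz = 125
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

spec = stft_magnitude(signal)
# Locate the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

Each row of `spec` is one time frame and each column one frequency bin, which is the kind of two-dimensional input a convolutional network can consume as an image. A practical implementation would use an FFT routine rather than this naive DFT.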

Original language	English
Article number	104569
Journal	Biomedical Signal Processing and Control
Volume	82
DOIs	https://doi.org/10.1016/j.bspc.2023.104569
Publication status	Published - Apr 2023

Keywords

Computer audition
Digital phenotype
Explainable AI
Heart sound
mHealth

Cite this

Wang, Z., Qian, K., Liu, H., Hu, B., Schuller, B. W., & Yamamoto, Y. (2023). Exploring interpretable representations for heart sound abnormality detection. Biomedical Signal Processing and Control, 82, Article 104569. https://doi.org/10.1016/j.bspc.2023.104569

@article{05a3f7ea8d4f4d78a666ee0827c96f78,

title = "Exploring interpretable representations for heart sound abnormality detection",

abstract = "The advantages of non-invasive, real-time and convenient, computer audition-based heart sound abnormality detection methods have increasingly attracted efforts among the community of cardiovascular diseases. Time–frequency analyses are crucial for computer audition-based applications. However, a comprehensive investigation on discovering an optimised way for extracting time–frequency representations from heart sounds is lacking until now. To this end, we propose a comprehensive investigation on time–frequency methods for analysing the heart sound, i.e., short-time Fourier transformation, Log-Mel transformation, Hilbert–Huang transformation, wavelet transformation, Mel transformation, and Stockwell transformation. The time–frequency representations are automatically learnt via pre-trained deep convolutional neural networks. Considering the urgent need of smart stethoscopes for high robust detection algorithms in real environment, the training, verification, and testing sets employed in the extensive evaluation are subject-independent. Finally, to further understand the heart sound-based digital phenotype for cardiovascular diseases, explainable artificial intelligence approaches are used to reveal the reasons for the performance differences of four time–frequency representations in heart sound abnormality detection. Experimental results show that Stockwell transformation can beat other methods by reaching the highest overall score of 65.2\%. The interpretable results demonstrate that Stockwell transformation does not only present more information for heart sounds, but also provides a certain noise robustness. Besides, the considered fine-tuned deep model brings an improvement of the mean accuracy over the previous state-of-the-art results by 9.0\% in subject-independent testing.",

keywords = "Computer audition, Digital phenotype, Explainable AI, Heart sound, mHealth",

author = "Zhihua Wang and Kun Qian and Houguang Liu and Bin Hu and Schuller, {Bj{\"o}rn W.} and Yoshiharu Yamamoto",

note = "Publisher Copyright: {\textcopyright} 2023 Elsevier Ltd",

year = "2023",

month = apr,

doi = "10.1016/j.bspc.2023.104569",

language = "English",

volume = "82",

journal = "Biomedical Signal Processing and Control",

issn = "1746-8094",

publisher = "Elsevier Ltd.",

}

TY - JOUR

T1 - Exploring interpretable representations for heart sound abnormality detection

AU - Wang, Zhihua

AU - Qian, Kun

AU - Liu, Houguang

AU - Hu, Bin

AU - Schuller, Björn W.

AU - Yamamoto, Yoshiharu

PY - 2023/4

Y1 - 2023/4

N2 - The advantages of non-invasive, real-time and convenient, computer audition-based heart sound abnormality detection methods have increasingly attracted efforts among the community of cardiovascular diseases. Time–frequency analyses are crucial for computer audition-based applications. However, a comprehensive investigation on discovering an optimised way for extracting time–frequency representations from heart sounds is lacking until now. To this end, we propose a comprehensive investigation on time–frequency methods for analysing the heart sound, i.e., short-time Fourier transformation, Log-Mel transformation, Hilbert–Huang transformation, wavelet transformation, Mel transformation, and Stockwell transformation. The time–frequency representations are automatically learnt via pre-trained deep convolutional neural networks. Considering the urgent need of smart stethoscopes for high robust detection algorithms in real environment, the training, verification, and testing sets employed in the extensive evaluation are subject-independent. Finally, to further understand the heart sound-based digital phenotype for cardiovascular diseases, explainable artificial intelligence approaches are used to reveal the reasons for the performance differences of four time–frequency representations in heart sound abnormality detection. Experimental results show that Stockwell transformation can beat other methods by reaching the highest overall score of 65.2%. The interpretable results demonstrate that Stockwell transformation does not only present more information for heart sounds, but also provides a certain noise robustness. Besides, the considered fine-tuned deep model brings an improvement of the mean accuracy over the previous state-of-the-art results by 9.0% in subject-independent testing.

KW - Computer audition

KW - Digital phenotype

KW - Explainable AI

KW - Heart sound

KW - mHealth

UR - http://www.scopus.com/inward/record.url?scp=85146440053&partnerID=8YFLogxK

U2 - 10.1016/j.bspc.2023.104569

DO - 10.1016/j.bspc.2023.104569

M3 - Article

AN - SCOPUS:85146440053

SN - 1746-8094

VL - 82

JO - Biomedical Signal Processing and Control

JF - Biomedical Signal Processing and Control

M1 - 104569

ER -
