TY - CONF
T1 - Learning image-based representations for heart sound classification
AU - Ren, Zhao
AU - Cummins, Nicholas
AU - Pandit, Vedhas
AU - Han, Jing
AU - Qian, Kun
AU - Schuller, Björn
N1 - Publisher Copyright:
© 2018 Association for Computing Machinery.
PY - 2018/4/23
Y1 - 2018/4/23
AB - Machine-learning-based heart sound classification is an efficient technology that can help reduce the burden of manual auscultation through the automatic detection of abnormal heart sounds. In this regard, we investigate the efficacy of using Convolutional Neural Networks (CNNs) pre-trained on large-scale image data for the classification of Phonocardiogram (PCG) signals by learning deep PCG representations. First, the PCG files are segmented into chunks of equal length. Then, we extract a scalogram image from each chunk using a wavelet transformation. Next, the scalogram images are fed into either a pre-trained CNN, or the same network fine-tuned on heart sound data. Deep representations are then extracted from a fully connected layer of each network, and classification is achieved by a static classifier. Alternatively, the scalogram images are fed into an end-to-end CNN formed by adapting a pre-trained network via transfer learning. Key results indicate that the deep PCG representations extracted from a fine-tuned CNN perform strongest on our heart sound classification task, achieving a mean accuracy of 56.2%. Compared to a baseline accuracy of 46.9%, obtained using conventional audio processing features and a support vector machine, this is a significant relative improvement of 19.8% (p < .001 by a one-tailed z-test).
KW - Convolutional neural networks
KW - Heart sound classification
KW - Phonocardiogram
KW - Scalogram
KW - Transfer learning
UR - http://www.scopus.com/inward/record.url?scp=85047210220&partnerID=8YFLogxK
DO - 10.1145/3194658.3194671
M3 - Conference contribution
AN - SCOPUS:85047210220
T3 - ACM International Conference Proceeding Series
SP - 143
EP - 147
BT - DH 2018 - Proceedings of the 2018 International Conference on Digital Health
PB - Association for Computing Machinery
T2 - 8th International Conference on Digital Health, DH 2018
Y2 - 23 April 2018 through 26 April 2018
ER -
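
Note: the pipeline summarized in the abstract (equal-length chunking, wavelet scalogram extraction, deep features from a fully connected layer of a pre-trained CNN, then a static classifier) can be sketched as below. This is a minimal illustrative sketch, not the authors' implementation: the sample rate, chunk length, Morlet wavelet, scale range, and the choice of VGG16 with ImageNet weights are all assumptions introduced here.

# Minimal sketch of the abstract's pipeline; all parameters below
# (sample rate, chunk length, wavelet, scales, VGG16 backbone) are
# illustrative assumptions, not the paper's exact configuration.
import numpy as np
import pywt
import torch
from PIL import Image
from torchvision import models, transforms

SR = 2000        # assumed PCG sample rate (Hz)
CHUNK = 4 * SR   # assumed fixed chunk length: 4 s of audio

def chunk_signal(x):
    # Segment the PCG signal into equal-length chunks; drop the remainder.
    return [x[i:i + CHUNK] for i in range(0, len(x) - CHUNK + 1, CHUNK)]

def scalogram_image(chunk):
    # Continuous wavelet transform -> magnitude scalogram as an RGB image.
    coeffs, _ = pywt.cwt(chunk, np.arange(1, 129), "morl",
                         sampling_period=1.0 / SR)
    mag = np.abs(coeffs)
    mag = 255 * (mag - mag.min()) / (mag.max() - mag.min() + 1e-9)
    return Image.fromarray(mag.astype(np.uint8)).convert("RGB")

# Pre-trained ImageNet CNN used as a fixed feature extractor; the deep
# representation is read from the first fully connected layer (fc1).
cnn = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def deep_pcg_features(signal):
    # Returns one 4096-d feature vector per chunk, to be fed to a
    # static classifier such as a support vector machine.
    feats = []
    with torch.no_grad():
        for chunk in chunk_signal(signal):
            x = preprocess(scalogram_image(chunk)).unsqueeze(0)
            z = torch.flatten(cnn.avgpool(cnn.features(x)), 1)
            feats.append(cnn.classifier[0](z).squeeze(0).numpy())
    return np.stack(feats)

The stacked per-chunk features would then be passed to a static classifier (e.g. an SVM), matching the comparison against the conventional-feature SVM baseline reported in the abstract.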