TY - JOUR
T1 - Exploring the Power of Empirical Mode Decomposition for Sensing the Sound of Silence
T2 - 26th Interspeech Conference 2025
AU - Wu, Chenhao
AU - Cai, Xiangjun
AU - Zhang, Haojie
AU - Jia, Tianrui
AU - Deng, Yilu
AU - Qian, Kun
AU - Schuller, Björn W.
AU - Yamamoto, Yoshiharu
AU - Liu, Jiang
N1 - Publisher Copyright: © 2025 International Speech Communication Association. All rights reserved.
PY - 2025
Y1 - 2025
N2 - Autism Spectrum Disorder (ASD) is a complex neurodevelopmental disorder, and mouse models have become essential for studying its genetic and behavioural aspects. Ultrasonic Vocalisations (USVs) emitted by mice provide a promising biomarker for ASD detection, but existing methods that rely on spectrogram-based features struggle to capture the complex, non-stationary, and multi-scale nature of USVs. To address this, we propose a novel multi-branch fusion model that integrates spectrogram-based features with multi-scale features extracted using Empirical Mode Decomposition (EMD), which decomposes USVs into Intrinsic Mode Functions (IMFs) to better represent their inherent complexity. Through systematic occlusion experiments, we identify high-frequency components, particularly IMF1, as critical for accurate ASD detection, highlighting the diagnostic relevance of high-frequency USV patterns. Our model achieves an Unweighted Average Recall (UAR) of 0.75 in subject-level classification, significantly outperforming existing methods. These findings provide valuable insights into the importance of multi-scale feature extraction and offer a robust framework for improving ASD diagnostics and research.
KW - autism spectrum disorder
KW - empirical mode decomposition
KW - multi-branch fusion model
KW - ultrasound vocalisations
UR - https://www.scopus.com/pages/publications/105020041914
U2 - 10.21437/Interspeech.2025-571
DO - 10.21437/Interspeech.2025-571
M3 - Conference article
AN - SCOPUS:105020041914
SN - 2308-457X
SP - 1708
EP - 1712
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Y2 - 17 August 2025 through 21 August 2025
ER -