Abstract
Snoring can originate from various regions of the upper airways, and the excitation location is closely linked to the acoustic characteristics of the snore sound, which makes it important for sleep monitoring. The Munich-Passau Snore Sound Corpus (MPSSC) is the largest database for snoring-based auxiliary diagnosis, offering valuable data for sleep disorder research and diagnosis. However, MPSSC has a small total number of samples and an uneven sample distribution. Some rare diseases are represented by only a few case samples, which is too little data for effective learning. To address these issues, we propose an end-to-end method for high-quality snoring audio generation for data augmentation. The method combines a Rectified-Flow-based 1D-signal diffusion model, which augments data across all classes, with an audio-based single diffusion model, which augments rare classes. Under our data augmentation framework, higher specificity, sensitivity, and accuracy are achieved in Automatic Snoring Sound Classification (ASSC). We also focus on the explicitness of classification strategies, aiming to demonstrate the high quality of the augmented data and its applicability to downstream tasks. Our work provides comprehensive support for ASSC, enhancing sleep disorder diagnosis assistance and offering new ideas for database research under scarce medical data conditions.
| Original language | English |
|---|---|
| Article number | 109449 |
| Journal | Biomedical Signal Processing and Control |
| Volume | 116 |
| DOIs | |
| Publication status | Published - 1 May 2026 |
Keywords
- Diffusion model
- Flow matching
- Rectified flow
- Single data diffusion
- Snore sound generation
Fingerprint
Dive into the research topics of 'Breaking through data scarcity: A novel diffusion model approach for snoring sound augmentation and classification'. Together they form a unique fingerprint.