PFGM++ Combined with Stochastic Regeneration for Speech Enhancement

Xiao Cao, Shenghui Zhao*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Diffusion models have been applied to speech enhancement because of their capability to learn complex data distributions. However, the extended Poisson flow generative model (PFGM++) outperforms diffusion models in terms of robustness. In this work, we introduce PFGM++ to speech enhancement and propose SR-PFGM++, which combines the stochastic regeneration model (StoRM) with PFGM++ and samples using an ordinary differential equation (ODE). Test results on the VoiceBank-DEMAND dataset show that SR-PFGM++ achieves higher performance with fewer sampling steps than StoRM. We also performed a mismatch test on the TIMIT+NOISE92 dataset, and the results show the strong generalization capability of SR-PFGM++.

Original language: English
Title of host publication: 2024 9th International Conference on Signal and Image Processing, ICSIP 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 267-271
Number of pages: 5
ISBN (Electronic): 9798350350920
Publication status: Published - 2024
Event: 9th International Conference on Signal and Image Processing, ICSIP 2024 - Hybrid, Nanjing, China
Duration: 12 Jul 2024 to 14 Jul 2024

Publication series

Name: 2024 9th International Conference on Signal and Image Processing, ICSIP 2024

Conference

Conference: 9th International Conference on Signal and Image Processing, ICSIP 2024
Country/Territory: China
City: Hybrid, Nanjing
Period: 12/07/24 to 14/07/24

Keywords

  • PFGM++
  • score-based generative model
  • speech enhancement
  • stochastic regeneration
