
SR-PFGM++ Based Consistency Model for Speech Enhancement

  • Beijing Institute of Technology
  • Ltd.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Diffusion models and the extended Poisson flow generative model (PFGM++) have been applied to speech enhancement. Both generate samples by solving a stochastic differential equation (SDE) or an ordinary differential equation (ODE), which usually requires a large number of sampling steps. We therefore introduce consistency models, which allow high-quality one-step generation with non-adversarial training. Specifically, building on our previous work, we distill SR-PFGM++ (PFGM++ combined with stochastic regeneration) to train a consistency model, yielding the proposed consistency model for speech enhancement. Test results on the VoiceBank-DEMAND dataset show that the proposed model significantly reduces inference time relative to SR-PFGM++ while maintaining comparable performance. In addition, mismatched-condition test results on the TIMIT+NOISE92 dataset demonstrate the generalization ability of the proposed model.
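
The trade-off between multi-step ODE sampling and one-step consistency generation can be illustrated with a toy sketch. It is not the method of the paper: it assumes a one-dimensional "dataset" consisting of a single point `MU`, a diffusion-style probability flow ODE of the form dx/dt = (x - D(x, t)) / t rather than the PFGM++ formulation, and hypothetical values for `T_MAX`, `T_MIN`, and the step schedule. In this toy setting the consistency function is available in closed form; in practice it would be a neural network trained by distilling the multi-step sampler.

```python
# Toy illustration only: the data distribution is a single point MU, so the
# ideal denoiser is known in closed form. Nothing here is taken from the paper.
MU = 0.7        # hypothetical clean value
T_MAX = 80.0    # hypothetical largest noise level
T_MIN = 0.002   # hypothetical smallest noise level


def denoiser(x, t):
    """Ideal denoiser for a point-mass data distribution: always returns MU."""
    return MU


def ode_sample(x, num_steps):
    """Multi-step Euler integration of dx/dt = (x - denoiser(x, t)) / t."""
    # Geometric noise schedule from T_MAX down to T_MIN (an assumption).
    ratio = (T_MIN / T_MAX) ** (1.0 / num_steps)
    t = T_MAX
    for _ in range(num_steps):
        t_next = t * ratio
        x = x + (t_next - t) * (x - denoiser(x, t)) / t
        t = t_next
    return x


def consistency_fn(x, t):
    """Map any point (x, t) on an ODE trajectory to its value at T_MIN."""
    return MU + (x - MU) * T_MIN / t


x_start = MU + T_MAX * 1.5                 # noisy starting point at t = T_MAX
multi_step = ode_sample(x_start, 50)       # 50 denoiser evaluations
one_step = consistency_fn(x_start, T_MAX)  # a single function evaluation
print(multi_step, one_step)
```

Both calls land on the same trajectory endpoint, but the consistency function needs one evaluation where the ODE sampler needs fifty; this is the kind of inference-time saving the abstract describes, shown here only for an idealized case.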

Original language: English
Title of host publication: IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331515669
Publication status: Published - 2024
Event: 2nd IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024 - Zhuhai, China
Duration: 22 Nov 2024 to 24 Nov 2024

Publication series

Name: IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024

Conference

Conference: 2nd IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024
Country/Territory: China
City: Zhuhai
Period: 22/11/24 to 24/11/24
