Residual Unet with Attention Mechanism for Time-Frequency Domain Speech Enhancement

Hanyu Chen, Xiwei Peng, Qiqi Jiang, Yujie Guo

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Eliminating the negative effects of background environmental noise is an interesting and challenging task in audio processing. In recent years, denoising techniques based on neural networks (NN) have achieved good performance. In particular, structures based on a convolutional encoder and decoder have been shown to achieve good enhancement. Building on this, this paper proposes a residual Unet structure combined with an attention mechanism. The proposed structure effectively reduces the impact of vanishing gradients on network training and narrows the semantic gap between encoder output and decoder output caused by the Unet shortcut connections. The experimental results show that, compared with the DNN baseline and the Unet network, the quality of the enhanced speech is significantly improved.
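
The two building blocks named in the abstract and keywords, the residual unit and attention gating on the Unet shortcut connections, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single-channel time-frequency grid (frames by frequency bins), replaces convolutions with scalar weights, and uses the commonly seen additive form of an attention gate. The names `residual_unit`, `attention_gate`, `w_skip`, `w_gate`, `psi`, and `psi_bias` are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def residual_unit(x, transform):
    """Residual unit: returns x + transform(x), element by element.

    `transform` stands in for the learned layers (an assumption here);
    it must return a grid with the same shape as `x`.
    """
    fx = transform(x)
    return [[a + b for a, b in zip(row_x, row_f)] for row_x, row_f in zip(x, fx)]


def attention_gate(skip, gate, w_skip=1.0, w_gate=1.0, bias=0.0,
                   psi=1.0, psi_bias=0.0):
    """Additive attention gate applied to a skip (shortcut) connection.

    For every time-frequency bin:
        hidden = relu(w_skip * skip + w_gate * gate + bias)
        alpha  = sigmoid(psi * hidden + psi_bias)
    The encoder feature `skip` is scaled by alpha before it is passed
    to the decoder; `gate` is the coarser decoder-side feature.
    Scalar weights replace the 1x1 convolutions of a real network.
    """
    gated = []
    for skip_row, gate_row in zip(skip, gate):
        out_row = []
        for s, g in zip(skip_row, gate_row):
            hidden = max(0.0, w_skip * s + w_gate * g + bias)  # ReLU
            alpha = sigmoid(psi * hidden + psi_bias)  # attention coefficient in (0, 1)
            out_row.append(alpha * s)
        gated.append(out_row)
    return gated


# Toy 2-frame by 2-bin encoder feature and decoder gating signal.
encoder_feature = [[1.0, 2.0], [0.5, -1.0]]
decoder_signal = [[0.2, 0.1], [0.0, 0.3]]

# Residual path: a toy transform that halves every value.
refined = residual_unit(
    encoder_feature, lambda grid: [[0.5 * v for v in row] for row in grid]
)
# Gated shortcut: each bin is attenuated by its attention coefficient.
gated_skip = attention_gate(refined, decoder_signal)
```

In a full network the gated shortcut would be concatenated with the upsampled decoder feature; the identity path in the residual unit is what lets gradients bypass the transform during training.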

Original language: English
Title of host publication: Proceedings of the 41st Chinese Control Conference, CCC 2022
Editors: Zhijun Li, Jian Sun
Publisher: IEEE Computer Society
Pages: 7007-7011
Number of pages: 5
ISBN (Electronic): 9789887581536
Publication status: Published - 2022
Event: 41st Chinese Control Conference, CCC 2022 - Hefei, China
Duration: 25 Jul 2022 to 27 Jul 2022

Publication series

Name: Chinese Control Conference, CCC
Volume: 2022-July
ISSN (Print): 1934-1768
ISSN (Electronic): 2161-2927

Conference

Conference: 41st Chinese Control Conference, CCC 2022
Country/Territory: China
City: Hefei
Period: 25/07/22 to 27/07/22

Keywords

  • Speech enhancement
  • Unet
  • attention gating
  • residual unit
