Single-channel Speech Enhancement Using Multi-Task Learning and Attention Mechanism

Jingyu Hou, Shenghui Zhao*, Yubo An

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Major breakthroughs have been made in speech enhancement with the introduction of deep learning. However, noise reduction performance under lower signal-to-noise ratio (SNR) conditions and the noise generalization ability of the model still need improvement. To address these issues, a multi-task convolutional recurrent network (MT-CRN) is proposed and applied to single-channel speech enhancement. The MT-CRN estimates the magnitude spectra of both the clean speech and the noise from the noisy speech. In addition, a weighted complementary loss function is constructed to make the multi-task training more effective, and a time-frequency attention mechanism is employed to capture the key information for each task. Experimental results show that the proposed MT-CRN clearly outperforms the baselines at the lower SNR levels with high parameter efficiency, and achieves stronger noise generalization.
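The abstract does not state the form of the weighted complementary loss, so the following is only an illustrative sketch of the general idea of weighting two spectral estimation tasks, not the authors' loss. It assumes a mean-squared-error objective per task and a single hypothetical weight `alpha`; the names `mse` and `weighted_two_task_loss` are invented for this example, and any complementary term used in the paper is not modeled here.

```python
from typing import Sequence

# A magnitude spectrum as frames x frequency bins (plain nested sequences).
Spectrum = Sequence[Sequence[float]]


def mse(estimate: Spectrum, target: Spectrum) -> float:
    """Mean squared error over all time-frequency bins."""
    total = 0.0
    count = 0
    for est_frame, tgt_frame in zip(estimate, target):
        for est_bin, tgt_bin in zip(est_frame, tgt_frame):
            total += (est_bin - tgt_bin) ** 2
            count += 1
    if count == 0:
        raise ValueError("spectra must contain at least one bin")
    return total / count


def weighted_two_task_loss(
    speech_est: Spectrum,
    speech_ref: Spectrum,
    noise_est: Spectrum,
    noise_ref: Spectrum,
    alpha: float = 0.5,
) -> float:
    """Weighted sum of the speech-task and noise-task errors.

    alpha is a hypothetical weight (assumption, not from the paper):
    alpha = 1 trains only the speech estimate, alpha = 0 only the noise.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    speech_loss = mse(speech_est, speech_ref)
    noise_loss = mse(noise_est, noise_ref)
    return alpha * speech_loss + (1.0 - alpha) * noise_loss
```

With `alpha` below 0.5 the noise task contributes more to the total, which is one way such a weighting could trade off the two estimates during training.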

Original language: English
Title of host publication: 2021 6th International Conference on Signal and Image Processing, ICSIP 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 826-830
Number of pages: 5
ISBN (Electronic): 9780738133737
Publication status: Published - 2021
Event: 6th International Conference on Signal and Image Processing, ICSIP 2021 - Nanjing, China
Duration: 22 Oct 2021 to 24 Oct 2021

Publication series

Name: 2021 6th International Conference on Signal and Image Processing, ICSIP 2021

Conference

Conference: 6th International Conference on Signal and Image Processing, ICSIP 2021
Country/Territory: China
City: Nanjing
Period: 22/10/21 to 24/10/21

Keywords

  • Attention mechanism
  • Convolutional recurrent network
  • Multi-task learning (MTL)
  • Speech enhancement
