TY - GEN
T1 - Single-channel Speech Enhancement Using Multi-Task Learning and Attention Mechanism
AU - Hou, Jingyu
AU - Zhao, Shenghui
AU - An, Yubo
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - The introduction of deep learning has brought major breakthroughs in speech enhancement. However, noise reduction performance under low signal-to-noise ratio (SNR) conditions and the noise generalization ability of the model still need improvement. To address these issues, a multi-task convolutional recurrent network (MT-CRN) is proposed and applied to single-channel speech enhancement. The MT-CRN aims to estimate the magnitude spectra of both the clean speech and the noise from the noisy speech. In addition, a weighted complementary loss function is constructed to further improve the effectiveness of multi-task training, and a time-frequency attention mechanism is employed to capture the key information of each task. Experimental results show that the proposed MT-CRN clearly outperforms the baselines at lower SNR levels with high parameter efficiency and achieves stronger noise generalization performance.
KW - Attention mechanism
KW - Convolutional recurrent network
KW - Multi-task learning (MTL)
KW - Speech enhancement
UR - http://www.scopus.com/inward/record.url?scp=85125202944&partnerID=8YFLogxK
U2 - 10.1109/ICSIP52628.2021.9688330
DO - 10.1109/ICSIP52628.2021.9688330
M3 - Conference contribution
AN - SCOPUS:85125202944
T3 - 2021 6th International Conference on Signal and Image Processing, ICSIP 2021
SP - 826
EP - 830
BT - 2021 6th International Conference on Signal and Image Processing, ICSIP 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 6th International Conference on Signal and Image Processing, ICSIP 2021
Y2 - 22 October 2021 through 24 October 2021
ER -