A Spectral-change-aware Loss Function for DNN-based Speech Separation

Xiang Li, Xihong Wu, Jing Chen

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

6 Citations (Scopus)

Abstract

Speech separation can be treated as a mask estimation problem in which supervised learning is used to construct a mapping from acoustic features to a mask. Interference can be reduced by applying the estimated mask to a time-frequency (T-F) representation of noisy speech, resulting in improved speech intelligibility. Most existing learning networks for speech separation aim to minimize the mean square error (MSE) over the training set, with the loss from each T-F unit weighted equally. In this paper, we propose a spectral-change-aware loss function in which the loss from T-F units with large spectral changes over time is assigned a higher weight than the loss from T-F units with minor spectral changes. The spectral-change-aware loss function was evaluated on speech separation performance in terms of mask estimation accuracy, short-time objective intelligibility (STOI), and SNR gain of unvoiced segments. The results indicate that the proposed loss function can further improve speech intelligibility and increase the SNR gain of unvoiced segments, even at the cost of an increased error rate in the estimated mask.
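The weighting idea in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact weighting formula, so everything specific here is an assumption made for illustration: the function names `spectral_change_weights` and `weighted_mse`, the use of the frame-to-frame log-magnitude difference as the measure of spectral change, the `alpha` scaling parameter, and the normalization by the sum of weights. It should not be read as the authors' actual loss.

```python
import numpy as np


def spectral_change_weights(clean_mag, alpha=1.0, eps=1e-8):
    """Hypothetical per-unit weights that grow with temporal spectral change.

    clean_mag: magnitude spectrogram, shape (frames, freq_bins).
    alpha:     assumed scaling of the extra weight (alpha=0 gives plain MSE).
    """
    log_mag = np.log(clean_mag + eps)
    # Absolute frame-to-frame change along time; the first frame is compared
    # with itself, so its change is zero.
    delta = np.abs(np.diff(log_mag, axis=0, prepend=log_mag[:1]))
    # Scale changes to [0, 1] so that alpha controls the maximum extra weight.
    norm = delta / (delta.max() + eps)
    return 1.0 + alpha * norm


def weighted_mse(est_mask, target_mask, weights):
    """Weighted mean square error over all T-F units."""
    sq_err = (est_mask - target_mask) ** 2
    return float(np.sum(weights * sq_err) / np.sum(weights))


if __name__ == "__main__":
    # Toy spectrogram: 4 frames, 3 bins, with an onset in bin 1 at frame 2.
    mag = np.ones((4, 3))
    mag[2:, 1] = 10.0
    w = spectral_change_weights(mag, alpha=2.0)

    target = np.zeros((4, 3))
    est_onset = target.copy()
    est_onset[2, 1] = 1.0   # error placed on the unit with a large change
    est_static = target.copy()
    est_static[0, 0] = 1.0  # same-sized error on a static unit

    print(weighted_mse(est_onset, target, w))
    print(weighted_mse(est_static, target, w))
```

In this toy case the same mask error costs more when it falls on the onset unit than on a static unit, which is the behaviour the abstract describes. Setting `alpha=0` recovers the equally weighted MSE baseline.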

Original language: English
Title of host publication: 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6870-6874
Number of pages: 5
ISBN (Electronic): 9781479981311
DOI
Publication status: Published - May 2019
Externally published
Event: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019 - Brighton, United Kingdom
Duration: 12 May 2019 to 17 May 2019

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2019-May
ISSN (Print): 1520-6149

Conference

Conference: 44th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2019
Country/Territory: United Kingdom
City: Brighton
Period: 12/05/19 to 17/05/19
