Speech bandwidth extension supported by temporal information

Yingxue Wang, Shenghui Zhao*, Jingming Kuang

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Speech bandwidth extension (BWE) aims to improve speech quality by reconstructing the missing high-frequency (HF) components from the correlation between the low-frequency (LF) and HF bands of speech. Gaussian mixture model (GMM) based methods are widely used. However, the mapping function derived by a GMM is a piecewise linear transformation and ignores the temporal information of speech. A novel BWE method is therefore proposed that estimates the HF parts of speech using conditional restricted Boltzmann machines (CRBM). The proposed method uses the CRBM to capture temporal information and to model deep non-linear relationships between the spectral envelope features of LF and HF by building high-order eigen spaces between the LF and HF of the speech signal. Objective and subjective test results show that the proposed method outperforms the conventional GMM-based method.
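
The piecewise linear character of a GMM mapping can be illustrated with a minimal sketch. It assumes a joint GMM over concatenated LF and HF feature vectors and a minimum mean-square-error (conditional mean) estimator, which is a common formulation but is not confirmed by the abstract as the exact baseline used. The function name `gmm_conditional_mean` and all parameters are hypothetical. Each mixture component contributes one linear regression from LF to HF, the components are blended by their posterior probabilities, and each frame is mapped independently of its neighbours.

```python
import numpy as np

def gmm_conditional_mean(x, weights, means, covs, d_lf):
    """Estimate HF features y from LF features x as E[y | x] under a joint GMM.

    Illustrative assumption: each component models z = [x; y], so
    means[k] has length d_lf + d_hf and covs[k] is the full joint covariance.
    """
    n_comp = len(weights)
    log_post = np.empty(n_comp)
    cond_means = []
    for k in range(n_comp):
        mu_x, mu_y = means[k][:d_lf], means[k][d_lf:]
        s_xx = covs[k][:d_lf, :d_lf]   # LF covariance block
        s_yx = covs[k][d_lf:, :d_lf]   # HF-LF cross-covariance block
        diff = x - mu_x
        solved = np.linalg.solve(s_xx, diff)
        # Log of weight_k * N(x; mu_x, s_xx), up to normalisation over k.
        _, logdet = np.linalg.slogdet(s_xx)
        log_post[k] = np.log(weights[k]) - 0.5 * (
            diff @ solved + logdet + d_lf * np.log(2.0 * np.pi)
        )
        # Per-component linear regression from LF to HF.
        cond_means.append(mu_y + s_yx @ solved)
    # Posterior probability of each component given x (numerically stable).
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    # Posterior-weighted blend of the linear pieces; no past frames are used.
    return post @ np.array(cond_means)
```

Because the estimate depends only on the current LF frame, temporal context never enters the mapping, which is the limitation the abstract attributes to GMM-based methods.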

Original language: English
Pages (from-to): 370-376
Number of pages: 7
Journal: Shengxue Xuebao/Acta Acustica
Volume: 42
Issue number: 3
Publication status: Published - 1 May 2017
