Abstract
Speech bandwidth extension (BWE) aims to improve speech quality by reconstructing the missing high-frequency (HF) components from the low-frequency (LF) components, exploiting the correlation between the two bands. Gaussian mixture model (GMM) based methods are widely used for this task. However, the mapping function derived from a GMM is a piece-wise linear transformation and ignores the temporal information of speech. This paper therefore proposes a BWE method that estimates the HF components of speech using conditional restricted Boltzmann machines (CRBMs). The CRBM captures temporal information and models the deep non-linear relationship between the LF and HF spectral envelope features by building high-order eigenspaces between the two bands. Objective and subjective test results show that the proposed method outperforms the conventional GMM-based method.
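
To illustrate why the GMM mapping is described as piece-wise linear, the following sketch gives the standard minimum mean-square-error estimator for a joint GMM over LF features $x$ and HF features $y$. The abstract does not state the exact formulation, so the notation (mixture weights $w_m$, means $\mu_m$, covariance blocks $\Sigma_m$, number of components $M$) is assumed here for illustration only.

```latex
% Assumed joint GMM over LF features x and HF features y (illustrative notation)
\hat{y}(x) = \sum_{m=1}^{M} p(m \mid x)\,
  \Bigl[\mu_m^{y} + \Sigma_m^{yx}\bigl(\Sigma_m^{xx}\bigr)^{-1}\bigl(x - \mu_m^{x}\bigr)\Bigr],
\qquad
p(m \mid x) = \frac{w_m\,\mathcal{N}\bigl(x;\,\mu_m^{x},\,\Sigma_m^{xx}\bigr)}
                   {\sum_{k=1}^{M} w_k\,\mathcal{N}\bigl(x;\,\mu_k^{x},\,\Sigma_k^{xx}\bigr)}
```

Each bracketed term is a linear regression from $x$ to $y$ for one mixture component, and the posterior $p(m \mid x)$ only blends these regressions. The estimate depends on the current frame $x$ alone, which is consistent with the observation that temporal information is ignored.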
Original language | English |
---|---|
Pages (from-to) | 370-376 |
Number of pages | 7 |
Journal | Shengxue Xuebao/Acta Acustica |
Volume | 42 |
Issue number | 3 |
Publication status | Published - 1 May 2017 |