Speech bandwidth extension supported by temporal information

Yingxue Wang, Shenghui Zhao*, Jingming Kuang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Speech Bandwidth Extension (BWE) aims to improve speech quality by reconstructing the missing High Frequency (HF) components from the correlation between the Low Frequency (LF) and HF parts of speech. Methods based on the Gaussian Mixture Model (GMM) are widely used. However, the mapping function derived from a GMM is a piece-wise linear transformation that ignores the temporal information of speech. A novel BWE method is therefore proposed that estimates the HF parts of speech using Conditional Restricted Boltzmann Machines (CRBM). The CRBM captures temporal information and models the deep non-linear relationship between the LF and HF spectral envelope features by building high-order eigenspaces between the LF and HF of the speech signal. Objective and subjective test results show that the proposed method outperforms the conventional GMM-based method.
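
The limitation of the GMM baseline described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used conditional-mean form of GMM mapping, in which each mixture component contributes one linear regression and the component posteriors weight them. This is not the authors' implementation: the scalar features, the component parameters, and the names `COMPONENTS`, `gmm_map`, and `map_sequence` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy joint GMM over (x, y): x stands in for an LF feature,
# y for an HF feature. Real systems use vector-valued spectral envelopes.
# Each component: (weight, mean_x, mean_y, var_x, cov_xy)
COMPONENTS = [
    (0.5, -1.0, 0.0, 0.5, 0.25),
    (0.5, 2.0, 1.0, 0.5, -0.25),
]


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def posteriors(x):
    """Posterior probability of each component given the LF feature x."""
    likes = [w * gaussian_pdf(x, mx, vx) for (w, mx, _my, vx, _cxy) in COMPONENTS]
    total = sum(likes)
    return [like / total for like in likes]


def gmm_map(x):
    """Conditional mean E[y | x]: posterior-weighted sum of linear regressions."""
    post = posteriors(x)
    return sum(
        p * (my + cxy / vx * (x - mx))
        for p, (_w, mx, my, vx, cxy) in zip(post, COMPONENTS)
    )


def map_sequence(frames):
    """Map each frame independently; neighbouring frames are never consulted."""
    return [gmm_map(x) for x in frames]
```

Because `map_sequence` treats every frame in isolation, reordering the input frames simply reorders the output. That frame-by-frame independence is the missing temporal information that the abstract says the CRBM-based method is meant to supply.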

Original language: English
Pages (from-to): 370-376
Number of pages: 7
Journal: Shengxue Xuebao/Acta Acustica
Volume: 42
Issue number: 3
Publication status: Published - 1 May 2017
