Speech bandwidth extension based on restricted Boltzmann machines

Yingxue Wang, Shenghui Zhao*, Yingying Yu, Jingming Kuang

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

4 Citations (Scopus)

Abstract

Speech Bandwidth Extension (BWE) is a technique that attempts to improve speech quality by recovering the missing High Frequency (HF) components using the correlation between the Low Frequency (LF) and HF parts of the wide-band speech signal. Gaussian Mixture Model (GMM) based methods are widely used, but they recover the missing HF components on the assumption that the LF and HF parts obey a Gaussian distribution, and they model only a linear relationship between the two, which leads to distortion in the reconstructed speech. This study proposes a new speech BWE method that uses two Gaussian-Bernoulli Restricted Boltzmann Machines (GBRBMs) to extract the high-order statistical characteristics of the spectral envelopes of the LF and HF parts, respectively. The high-order features of the LF are then mapped to those of the HF using a Feedforward Neural Network (FNN). By extracting the high-order statistical characteristics of the LF and HF components, the proposed method learns a deep relationship between the spectral envelopes of the LF and HF parts and can model the distribution of spectral envelopes more precisely. The objective and subjective test results show that the proposed method outperforms the conventional GMM based method.
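The pipeline described in the abstract (LF envelope, GBRBM features, FNN mapping, GBRBM reconstruction of the HF envelope) can be sketched as a forward pass. This is only an illustrative sketch, not the authors' implementation: the layer sizes, the unit-variance visible units, the single sigmoid hidden layer in the FNN, and all class and function names are assumptions, and the weights are random rather than trained, so the output values carry no meaning.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(x, weights, bias):
    """Return bias + x @ weights for a vector x and a len(x)-by-len(bias) matrix."""
    return [b + sum(xi * row[j] for xi, row in zip(x, weights))
            for j, b in enumerate(bias)]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class GBRBM:
    """Gaussian-Bernoulli RBM, assuming unit-variance visible units."""

    def __init__(self, n_visible, n_hidden, rng):
        self.W = random_matrix(n_visible, n_hidden, rng)  # untrained weights
        self.b = [0.0] * n_visible  # visible bias
        self.c = [0.0] * n_hidden   # hidden bias

    def encode(self, v):
        # p(h_j = 1 | v) = sigmoid(c_j + sum_i v_i * W_ij)
        return [sigmoid(a) for a in affine(v, self.W, self.c)]

    def decode(self, h):
        # E[v_i | h] = b_i + sum_j W_ij * h_j
        W_t = [list(col) for col in zip(*self.W)]
        return affine(h, W_t, self.b)


class FNN:
    """Feedforward net with one sigmoid hidden layer (an assumed topology)."""

    def __init__(self, n_in, n_mid, n_out, rng):
        self.W1 = random_matrix(n_in, n_mid, rng)
        self.b1 = [0.0] * n_mid
        self.W2 = random_matrix(n_mid, n_out, rng)
        self.b2 = [0.0] * n_out

    def forward(self, x):
        hidden = [sigmoid(a) for a in affine(x, self.W1, self.b1)]
        # Sigmoid output, since the targets are hidden-unit probabilities.
        return [sigmoid(a) for a in affine(hidden, self.W2, self.b2)]


def extend_envelope(lf_envelope, rbm_lf, fnn, rbm_hf):
    """Map an LF spectral envelope to an estimated HF spectral envelope."""
    lf_features = rbm_lf.encode(lf_envelope)   # high-order LF features
    hf_features = fnn.forward(lf_features)     # LF features -> HF features
    return rbm_hf.decode(hf_features)          # HF features -> HF envelope


rng = random.Random(0)
rbm_lf = GBRBM(n_visible=12, n_hidden=32, rng=rng)  # hypothetical sizes
rbm_hf = GBRBM(n_visible=8, n_hidden=32, rng=rng)
fnn = FNN(n_in=32, n_mid=64, n_out=32, rng=rng)

lf_envelope = [rng.gauss(0.0, 1.0) for _ in range(12)]  # stand-in LF features
hf_envelope = extend_envelope(lf_envelope, rbm_lf, fnn, rbm_hf)
print(len(hf_envelope))
```

In a trained system, each GBRBM would first be fitted on its own band (for example with contrastive divergence), and the FNN would then be trained on pairs of LF and HF hidden features. Neither training step is shown here.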

Original language: English
Pages (from-to): 1717-1723
Number of pages: 7
Journal: Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology
Volume: 38
Issue number: 7
DOI
Publication status: Published - 1 Jul 2016
