Abstract
Speech Bandwidth Extension (BWE) is a technique that attempts to improve the speech quality by recovering the missing High Frequency (HF) components using the correlation that exists between the Low Frequency (LF) and HF parts of the wide-band speech signal. The Gaussian Mixture Model (GMM) based methods are widely used, but it recovers the missing HF components on the assumption that the LF and HF parts obey a Gaussian distribution and gives their linear relationship, leading to the distortion of reconstructed speech. This Study proposes a new speech BWE method, which uses two Gaussian-Bernoulli Restricted Boltzmann Machines (GBRBMs) to extract the high-order statistical characteristics of spectral envelopes of the LF and HF respectively. Then, high-order features of the LF are mapped to those of the HF using a Feedforward Neural Network (FNN). The proposed method learns deep relationship between the spectral envelopes of LF and HF and can model the distribution of spectral envelopes more precisely by extracting the high-order statistical characteristics of the LF components and the HF components. The objective and subjective test results show that the proposed method outperforms the conventional GMM based method.
Original language | English |
---|---|
Pages (from-to) | 1717-1723 |
Number of pages | 7 |
Journal | Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology |
Volume | 38 |
Issue number | 7 |
DOIs | |
Publication status | Published - 1 Jul 2016 |
Keywords
- Feedforward Neural Networks (FNN)
- Gaussian mixture model
- Restricted Boltzmann machines
- Speech bandwidth extension