Using conditional restricted Boltzmann machines for spectral envelope modeling in speech bandwidth extension

Yingxue Wang, Shenghui Zhao, Dan Qu, Jingming Kuang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

14 Citations (Scopus)

Abstract

In this paper, we present a speech bandwidth extension (BWE) method based on the conditional restricted Boltzmann machine (CRBM). A CRBM is employed to capture temporal information and to model deep non-linear relationships between the spectral envelope features of the low-frequency (LF) and high-frequency (HF) bands. Two separate CRBMs are adopted to model the distributions of the LF and HF spectral envelope features, respectively. A neural network (NN) is then used to model the joint distribution of the hidden variables extracted from the two CRBMs. The proposed method takes advantage of the CRBM's strong ability to discover the temporal correlation between adjacent frames and to model deep non-linear relationships between input and output. Both objective and subjective evaluations indicate that the proposed method outperforms conventional Gaussian mixture model based methods and other NN based methods.
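As a rough illustration of how a CRBM conditions on adjacent frames, the sketch below computes hidden-unit activation probabilities from the current feature frame plus a short history of past frames. It follows the commonly used CRBM formulation with real-valued (unit-variance Gaussian) visible units, which is an assumption here; the abstract does not give the exact model equations. The function name `crbm_hidden_probs`, the layer sizes, the history length, and the random toy weights are all hypothetical, and no training is shown.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real activation to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def crbm_hidden_probs(v, history, W, B, b):
    """Return p(h_j = 1 | v, history) for every hidden unit j.

    Assumed (hypothetical) parameterization:
      v        current frame of spectral envelope features
      history  concatenation of the preceding frames
      W[i][j]  weight from visible unit i to hidden unit j
      B[k][j]  weight from history element k to hidden unit j
      b[j]     static bias of hidden unit j
    """
    probs = []
    for j in range(len(b)):
        act = b[j]
        act += sum(W[i][j] * v[i] for i in range(len(v)))
        # The history term is what makes the model "conditional":
        # past frames shift each hidden unit's effective bias.
        act += sum(B[k][j] * history[k] for k in range(len(history)))
        probs.append(sigmoid(act))
    return probs


# Toy setup: 4 features per frame, 3 hidden units, 2 past frames (all assumed).
rng = random.Random(0)
n_vis, n_hid, n_past = 4, 3, 2
W = [[rng.gauss(0, 0.1) for _ in range(n_hid)] for _ in range(n_vis)]
B = [[rng.gauss(0, 0.1) for _ in range(n_hid)] for _ in range(n_vis * n_past)]
b = [0.0] * n_hid

frames = [[rng.gauss(0, 1) for _ in range(n_vis)] for _ in range(3)]
history = frames[0] + frames[1]  # two preceding frames, concatenated
h = crbm_hidden_probs(frames[2], history, W, B, b)
print(h)
```

In the method described above, hidden representations of this kind would be extracted separately for the LF and HF features, with a neural network relating the two; that mapping step is not sketched here.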

Original language: English
Title of host publication: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5930-5934
Number of pages: 5
ISBN (Electronic): 9781479999880
Publication status: Published - 18 May 2016
Event: 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Shanghai, China
Duration: 20 Mar 2016 to 25 Mar 2016

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2016-May
ISSN (Print): 1520-6149

Conference

Conference: 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016
Country/Territory: China
City: Shanghai
Period: 20/03/16 to 25/03/16

Keywords

  • Gaussian mixture model
  • Speech bandwidth expansion
  • conditional restricted Boltzmann machine
