Using conditional restricted Boltzmann machines for spectral envelope modeling in speech bandwidth extension

Yingxue Wang, Shenghui Zhao, Dan Qu, Jingming Kuang

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

14 Citations (Scopus)

Abstract

In this paper, we present a conditional restricted Boltzmann machine (CRBM) based speech bandwidth extension (BWE) method. A CRBM is employed to capture temporal information and to model deep non-linear relationships between the spectral envelope features of the low-frequency (LF) and high-frequency (HF) bands. Two separate CRBMs are adopted to model the distributions of the LF and HF spectral envelope features, respectively. A neural network (NN) is then used to model the joint distribution of the hidden variables extracted from the two CRBMs. The proposed method takes advantage of the CRBM's strong ability to discover the temporal correlation between adjacent frames and to model deep non-linear relationships between input and output. Both objective and subjective evaluations indicate that the proposed method outperforms conventional Gaussian mixture model based methods and other NN based methods.
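
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no layer sizes, training procedure, or NN topology, so everything below is an assumption made for illustration. In particular, the class `CRBM`, the function `extend_frame`, the Gaussian-visible/binary-hidden formulation, the single sigmoid layer standing in for the NN, the direction of the mapping (LF hidden units to HF hidden units), the zero-initialised HF history, and all dimensions are hypothetical, and the weights are random rather than trained. The sketch only shows how data could flow from LF frames to estimated HF frames.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class CRBM:
    """Hypothetical CRBM with Gaussian visible and binary hidden units.

    Weights are random (untrained); only inference is sketched here.
    """

    def __init__(self, n_vis, n_hid, n_past, rng):
        n_hist = n_vis * n_past                    # concatenated past frames
        self.W = rand_matrix(n_hid, n_vis, rng)    # visible <-> hidden
        self.B = rand_matrix(n_hid, n_hist, rng)   # history -> hidden
        self.A = rand_matrix(n_vis, n_hist, rng)   # history -> visible
        self.b_hid = [0.0] * n_hid
        self.b_vis = [0.0] * n_vis

    def hidden_probs(self, v, history):
        """p(h_j = 1 | v, history) for every hidden unit j."""
        from_v = matvec(self.W, v)
        from_hist = matvec(self.B, history)
        return [sigmoid(a + c + b)
                for a, c, b in zip(from_v, from_hist, self.b_hid)]

    def visible_mean(self, h, history):
        """Mean of the Gaussian visible units given h and the history."""
        from_h = [sum(self.W[j][i] * h[j] for j in range(len(h)))
                  for i in range(len(self.b_vis))]   # W^T h
        from_hist = matvec(self.A, history)
        return [a + c + b for a, c, b in zip(from_h, from_hist, self.b_vis)]


def extend_frame(lf_crbm, hf_crbm, net, lf_frame, lf_history, hf_history):
    """Estimate one HF envelope frame from one LF envelope frame."""
    h_lf = lf_crbm.hidden_probs(lf_frame, lf_history)
    # Assumption: one sigmoid layer stands in for the NN linking the two
    # sets of hidden variables.
    h_hf = [sigmoid(a) for a in matvec(net, h_lf)]
    return hf_crbm.visible_mean(h_hf, hf_history)


# Toy dimensions, chosen arbitrarily for the demonstration.
N_LF, N_HF, N_HID, N_PAST = 6, 4, 8, 2
rng = random.Random(0)
lf_crbm = CRBM(N_LF, N_HID, N_PAST, rng)
hf_crbm = CRBM(N_HF, N_HID, N_PAST, rng)
net = rand_matrix(N_HID, N_HID, rng)

# Synthetic LF envelope frames; real features would come from speech.
lf_frames = [[rng.gauss(0.0, 1.0) for _ in range(N_LF)] for _ in range(5)]

# Assumption: the HF history starts at zero and is then filled with the
# model's own previous estimates.
hf_est = [[0.0] * N_HF for _ in range(N_PAST)]
for t in range(N_PAST, len(lf_frames)):
    lf_hist = [x for frame in lf_frames[t - N_PAST:t] for x in frame]
    hf_hist = [x for frame in hf_est[-N_PAST:] for x in frame]
    hf_est.append(extend_frame(lf_crbm, hf_crbm, net,
                               lf_frames[t], lf_hist, hf_hist))
```

Conditioning both the hidden and the visible units on the concatenated past frames is what lets a CRBM use the correlation between adjacent frames; a plain RBM would treat every frame independently.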

Original language: English
Title of host publication: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5930-5934
Number of pages: 5
ISBN (Electronic): 9781479999880
DOI
Publication status: Published - 18 May 2016
Event: 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Shanghai, China
Duration: 20 Mar 2016 to 25 Mar 2016

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2016-May
ISSN (Print): 1520-6149

Conference

Conference: 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016
Country/Territory: China
City: Shanghai
Period: 20/03/16 to 25/03/16
