Speech Bandwidth Extension Based on Codebook Mapping and GMM

Ying Xue Wang; Ying Ying Yu; Sheng Hui Zhao; Jing Ming Kuang

doi:10.15918/j.tbit1001-0645.2017.09.016

Ying Xue Wang, Ying Ying Yu, Sheng Hui Zhao^*, Jing Ming Kuang

^*Corresponding author for this work

School of Information and Electronics

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Speech bandwidth extension (BWE) based on the conventional Gaussian mixture model (GMM) often suffers from over-smoothing, mainly because the estimated covariance has low accuracy, which results in the loss of specific high-frequency features. This paper therefore proposes a speech bandwidth extension method based on codebook mapping (CM) and GMM. First, low-frequency (LF) and high-frequency (HF) features are extracted and the GMM is trained. Then, an offset vector codebook is designed based on the trained GMM parameters. In the reconstruction phase, LF offset vectors are transformed to HF offset vectors according to the trained offset vector codebook. The final HF feature parameters are obtained by adding the HF offset vectors to the GMM estimate. Subjective and objective evaluations show that CM-GMM significantly alleviates the over-smoothing problem and clearly improves the quality of the synthesized speech compared with the conventional GMM-based BWE method.
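The reconstruction step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the offset vectors or the codebook are defined, so the sketch assumes scalar LF/HF features, takes the LF offset to be the distance of the LF feature from the mean of the most likely GMM component, and uses a nearest-codeword lookup. All parameter values and the names `GMM`, `CODEBOOK`, `posteriors`, `gmm_estimate` and `cm_gmm_estimate` are hypothetical.

```python
import math

# Toy joint GMM over scalar (LF, HF) features; every number here is illustrative.
# Each component stores its weight, LF mean, HF mean, LF variance and LF-HF covariance.
GMM = [
    {"w": 0.5, "mu_lf": 0.0, "mu_hf": 1.0, "var_lf": 1.0, "cov": 0.5},
    {"w": 0.5, "mu_lf": 4.0, "mu_hf": -1.0, "var_lf": 1.0, "cov": -0.5},
]

# Hypothetical offset codebook: for each component, paired
# (LF offset, HF offset) codewords. How the paper builds it is not stated.
CODEBOOK = [
    [(-1.0, -0.8), (1.0, 0.9)],
    [(-1.0, 0.7), (1.0, -0.6)],
]


def posteriors(x_lf):
    """Posterior probability of each component given the LF feature."""
    likes = [
        c["w"]
        * math.exp(-0.5 * (x_lf - c["mu_lf"]) ** 2 / c["var_lf"])
        / math.sqrt(2.0 * math.pi * c["var_lf"])
        for c in GMM
    ]
    total = sum(likes)
    return [like / total for like in likes]


def gmm_estimate(x_lf):
    """Conventional GMM estimate: posterior-weighted conditional means of HF."""
    post = posteriors(x_lf)
    return sum(
        p * (c["mu_hf"] + c["cov"] / c["var_lf"] * (x_lf - c["mu_lf"]))
        for p, c in zip(post, GMM)
    )


def cm_gmm_estimate(x_lf):
    """GMM estimate plus an HF offset looked up in the offset codebook."""
    post = posteriors(x_lf)
    k = max(range(len(GMM)), key=post.__getitem__)  # most likely component
    lf_offset = x_lf - GMM[k]["mu_lf"]              # assumed offset definition
    # Nearest LF codeword selects the paired HF offset.
    _, hf_offset = min(CODEBOOK[k], key=lambda cw: abs(cw[0] - lf_offset))
    return gmm_estimate(x_lf) + hf_offset
```

Calling `cm_gmm_estimate(0.8)` returns the plain GMM estimate shifted by the HF offset of the nearest codeword. The averaged conditional mean alone tends to flatten such detail, and the added offset is meant to restore it.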

Original language	English
Pages (from-to)	970-974
Number of pages	5
Journal	Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology
Volume	37
Issue number	9
DOIs	https://doi.org/10.15918/j.tbit1001-0645.2017.09.016
Publication status	Published - 1 Sept 2017

Keywords

Codebook mapping
Gaussian mixture model (GMM)
Speech bandwidth extension

Cite this

Wang, Y. X., Yu, Y. Y., Zhao, S. H., & Kuang, J. M. (2017). Speech Bandwidth Extension Based on Codebook Mapping and GMM. Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology, 37(9), 970-974. https://doi.org/10.15918/j.tbit1001-0645.2017.09.016

@article{89e1bbdd27aa47a28cfb7caa62487e9a,

title = "Speech Bandwidth Extension Based on Codebook Mapping and GMM",

abstract = "Speech bandwidth extension (BWE) based on the conventional Gaussian mixture model (GMM) often suffers from over-smoothing, mainly because the estimated covariance has low accuracy, which results in the loss of specific high-frequency features. This paper therefore proposes a speech bandwidth extension method based on codebook mapping (CM) and GMM. First, low-frequency (LF) and high-frequency (HF) features are extracted and the GMM is trained. Then, an offset vector codebook is designed based on the trained GMM parameters. In the reconstruction phase, LF offset vectors are transformed to HF offset vectors according to the trained offset vector codebook. The final HF feature parameters are obtained by adding the HF offset vectors to the GMM estimate. Subjective and objective evaluations show that CM-GMM significantly alleviates the over-smoothing problem and clearly improves the quality of the synthesized speech compared with the conventional GMM-based BWE method.",

keywords = "Codebook mapping, Gaussian mixture model (GMM), Speech bandwidth extension",

author = "Wang, {Ying Xue} and Yu, {Ying Ying} and Zhao, {Sheng Hui} and Kuang, {Jing Ming}",

year = "2017",

month = sep,

day = "1",

doi = "10.15918/j.tbit1001-0645.2017.09.016",

language = "English",

volume = "37",

pages = "970--974",

journal = "Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology",

issn = "1001-0645",

publisher = "Beijing Institute of Technology",

number = "9",

}

TY - JOUR

T1 - Speech Bandwidth Extension Based on Codebook Mapping and GMM

AU - Wang, Ying Xue

AU - Yu, Ying Ying

AU - Zhao, Sheng Hui

AU - Kuang, Jing Ming

PY - 2017/9/1

Y1 - 2017/9/1

N2 - Speech bandwidth extension (BWE) based on the conventional Gaussian mixture model (GMM) often suffers from over-smoothing, mainly because the estimated covariance has low accuracy, which results in the loss of specific high-frequency features. This paper therefore proposes a speech bandwidth extension method based on codebook mapping (CM) and GMM. First, low-frequency (LF) and high-frequency (HF) features are extracted and the GMM is trained. Then, an offset vector codebook is designed based on the trained GMM parameters. In the reconstruction phase, LF offset vectors are transformed to HF offset vectors according to the trained offset vector codebook. The final HF feature parameters are obtained by adding the HF offset vectors to the GMM estimate. Subjective and objective evaluations show that CM-GMM significantly alleviates the over-smoothing problem and clearly improves the quality of the synthesized speech compared with the conventional GMM-based BWE method.

AB - Speech bandwidth extension (BWE) based on the conventional Gaussian mixture model (GMM) often suffers from over-smoothing, mainly because the estimated covariance has low accuracy, which results in the loss of specific high-frequency features. This paper therefore proposes a speech bandwidth extension method based on codebook mapping (CM) and GMM. First, low-frequency (LF) and high-frequency (HF) features are extracted and the GMM is trained. Then, an offset vector codebook is designed based on the trained GMM parameters. In the reconstruction phase, LF offset vectors are transformed to HF offset vectors according to the trained offset vector codebook. The final HF feature parameters are obtained by adding the HF offset vectors to the GMM estimate. Subjective and objective evaluations show that CM-GMM significantly alleviates the over-smoothing problem and clearly improves the quality of the synthesized speech compared with the conventional GMM-based BWE method.

KW - Codebook mapping

KW - Gaussian mixture model (GMM)

KW - Speech bandwidth extension

UR - http://www.scopus.com/inward/record.url?scp=85032463941&partnerID=8YFLogxK

U2 - 10.15918/j.tbit1001-0645.2017.09.016

DO - 10.15918/j.tbit1001-0645.2017.09.016

M3 - Article

AN - SCOPUS:85032463941

SN - 1001-0645

VL - 37

SP - 970

EP - 974

JO - Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology

JF - Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology

IS - 9

ER -