Variable rate characteristic waveform interpolation speech coder based on phonetic classification


Jing Wang^*, Jing Ming Kuang, Sheng Hui Zhao

^*Corresponding author for this work

School of Information and Electronics

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

A variable-bit-rate characteristic waveform interpolation (VBR-CWI) speech codec is proposed that integrates phonetic classification into characteristic waveform (CW) decomposition and operates at an average bit rate of about 1.8 kbit/s. Each input frame is classified into one of four phonetic classes. Non-speech frames are represented with a Bark-band noise model. For unvoiced frames the extracted CWs are treated as rapidly evolving waveforms (REWs), and for stationary voiced frames as slowly evolving waveforms (SEWs), while mixed voiced frames use the same CW decomposition as conventional CWI. Experimental results show that the proposed codec eliminates most of the buzzy and noisy artifacts present in the fixed-bit-rate characteristic waveform interpolation (FBR-CWI) speech codec at a much lower average bit rate, and that its reconstructed speech quality is much better than FS 1016 CELP at 4.8 kbit/s and similar to G.723.1 ACELP at 5.3 kbit/s.
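The per-class handling described in the abstract can be pictured as a simple dispatch on the frame's phonetic class. The following is only an illustrative sketch, not the authors' implementation: the names `PhoneticClass`, `moving_average_sew` and `decompose` are hypothetical, the moving average is an assumed stand-in for the low-pass filter that conventional CWI applies along the CW evolution axis, and returning no waveforms at all for non-speech frames is likewise an assumption (the abstract says only that those frames use a Bark-band noise model).

```python
from enum import Enum


class PhoneticClass(Enum):
    # The four classes named in the abstract (identifiers are hypothetical).
    NON_SPEECH = 0
    UNVOICED = 1
    STATIONARY_VOICED = 2
    MIXED_VOICED = 3


def moving_average_sew(cws, span=3):
    """Smooth each sample position across successive CWs.

    Assumed stand-in for the low-pass filter of conventional CWI.
    """
    n = len(cws)
    sew = []
    for i in range(n):
        lo = max(0, i - span // 2)
        hi = min(n, i + span // 2 + 1)
        window = cws[lo:hi]
        # Average column-wise over the neighbouring CWs in the window.
        sew.append([sum(col) / len(window) for col in zip(*window)])
    return sew


def decompose(cws, phonetic_class):
    """Return (sew, rew) for one frame's CWs; either part may be None."""
    if phonetic_class is PhoneticClass.NON_SPEECH:
        # Assumption: non-speech frames bypass CW decomposition entirely.
        return None, None
    if phonetic_class is PhoneticClass.UNVOICED:
        # The extracted CWs are used directly as REWs.
        return None, cws
    if phonetic_class is PhoneticClass.STATIONARY_VOICED:
        # The extracted CWs are used directly as SEWs.
        return cws, None
    # Mixed voiced: conventional split, REW is the remainder CW - SEW.
    sew = moving_average_sew(cws)
    rew = [[c - s for c, s in zip(cw, sw)] for cw, sw in zip(cws, sew)]
    return sew, rew
```

In this sketch only mixed voiced frames carry both components, which is one way to see why the average bit rate can fall below that of a fixed-rate scheme that always codes both.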

Original language	English
Pages (from-to)	187-192
Number of pages	6
Journal	Journal of Beijing Institute of Technology (English Edition)
Volume	16
Issue number	2
Publication status	Published - Jun 2007

Cite this

@article{cc7ded65be0e4796a8892a941e221145,
title = "Variable rate characteristic waveform interpolation speech coder based on phonetic classification",
abstract = "A variable-bit-rate characteristic waveform interpolation (VBR-CWI) speech codec with about 1.8 kbit/s average bit rate which integrates phonetic classification into characteristic waveform (CW) decomposition is proposed. Each input frame is classified into one of 4 phonetic classes. Non-speech frames are represented with Bark-band noise model. The extracted CWs become rapidly evolving waveforms (REWs) or slowly evolving waveforms (SEWs) in the cases of unvoiced or stationary voiced frames respectively, while mixed voiced frames use the same CW decomposition as that in the conventional CWI. Experimental results show that the proposed codec can eliminate most buzzy and noisy artifacts existing in the fixed-bit-rate characteristic waveform interpolation (FBR-CWI) speech codec, the average bit rate can be much lower, and its reconstructed speech quality is much better than FS 1016 CELP at 4.8 kbit/s and similar to G. 723.1 ACELP at 5.3 kbit/s.",
keywords = "Characteristic waveform interpolation, Phonetic classification, Variable bit rate speech coding",
author = "Jing Wang and Kuang, {Jing Ming} and Zhao, {Sheng Hui}",
year = "2007",
month = jun,
language = "English",
volume = "16",
pages = "187--192",
journal = "Journal of Beijing Institute of Technology (English Edition)",
issn = "1004-0579",
publisher = "Beijing Institute of Technology",
number = "2",
}

TY - JOUR
T1 - Variable rate characteristic waveform interpolation speech coder based on phonetic classification
AU - Wang, Jing
AU - Kuang, Jing Ming
AU - Zhao, Sheng Hui
PY - 2007/6
Y1 - 2007/6
AB - A variable-bit-rate characteristic waveform interpolation (VBR-CWI) speech codec with about 1.8 kbit/s average bit rate which integrates phonetic classification into characteristic waveform (CW) decomposition is proposed. Each input frame is classified into one of 4 phonetic classes. Non-speech frames are represented with Bark-band noise model. The extracted CWs become rapidly evolving waveforms (REWs) or slowly evolving waveforms (SEWs) in the cases of unvoiced or stationary voiced frames respectively, while mixed voiced frames use the same CW decomposition as that in the conventional CWI. Experimental results show that the proposed codec can eliminate most buzzy and noisy artifacts existing in the fixed-bit-rate characteristic waveform interpolation (FBR-CWI) speech codec, the average bit rate can be much lower, and its reconstructed speech quality is much better than FS 1016 CELP at 4.8 kbit/s and similar to G. 723.1 ACELP at 5.3 kbit/s.
KW - Characteristic waveform interpolation
KW - Phonetic classification
KW - Variable bit rate speech coding
UR - http://www.scopus.com/inward/record.url?scp=34447345113&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:34447345113
SN - 1004-0579
VL - 16
SP - 187
EP - 192
JO - Journal of Beijing Institute of Technology (English Edition)
JF - Journal of Beijing Institute of Technology (English Edition)
IS - 2
ER -
