Output-based speech quality assessment using autoencoder and support vector regression

Jing Wang*, Yahui Shan, Xiang Xie, Jingming Kuang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

Output-based speech quality assessment has been widely used and has received increasing attention because it does not need an undistorted signal as a reference. To obtain a high correlation between predicted scores and subjective results, this paper presents a new method that estimates the quality of degraded speech without the reference speech. Bottleneck features are extracted with an autoencoder, and support vector regression is chosen as the mapping model from the objective representation to subjective scores. Experiments are conducted on a dataset containing various degraded speech signals and subjective listening scores. The proposed method takes advantage of the autoencoder's ability to form a good representation of its input, which can be better mapped to the Mean Opinion Score. The experimental results show that, compared with the ITU-T P.563 standard and another deep learning-based assessment method, the proposed method achieves a higher correlation coefficient between predicted scores and subjective scores.
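
The evaluation criterion named in the abstract, the correlation coefficient between predicted and subjective scores, can be illustrated with a minimal sketch. It assumes the Pearson correlation is meant, which the abstract does not state explicitly. The function name `pearson_correlation` and the toy score lists are hypothetical and are not taken from the paper or its dataset.

```python
import math


def pearson_correlation(predicted, subjective):
    """Pearson correlation between predicted scores and subjective scores.

    Assumption: the "correlation coefficient" in the abstract is the Pearson
    coefficient; the abstract does not name a specific one.
    """
    if len(predicted) != len(subjective):
        raise ValueError("score lists must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("at least two score pairs are required")

    mean_p = sum(predicted) / n
    mean_s = sum(subjective) / n

    # Sums of centered cross-products and centered squares.
    cov = sum((p - mean_p) * (s - mean_s) for p, s in zip(predicted, subjective))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_s = sum((s - mean_s) ** 2 for s in subjective)

    if var_p == 0 or var_s == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_p * var_s)


# Hypothetical toy values on a 1-5 opinion scale, for illustration only.
toy_predicted = [1.8, 2.6, 3.1, 3.9, 4.4]
toy_subjective = [2.0, 2.4, 3.3, 3.7, 4.5]

if __name__ == "__main__":
    print(pearson_correlation(toy_predicted, toy_subjective))
```

A value close to 1 means the predicted scores rank and scale the degraded signals much as listeners did, which is the sense in which the abstract compares methods.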

Original language: English
Pages (from-to): 13-20
Number of pages: 8
Journal: Speech Communication
Volume: 110
Publication status: Published - Jul 2019

Keywords

  • Bottleneck feature
  • Speech quality assessment
  • Support vector regression
