Robust speech recognition combining cepstral and articulatory features

Zhuan Ling Zha, Jin Hu, Qing Ran Zhan, Ya Hui Shan, Xiang Xie, Jing Wang, Hao Bo Cheng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

In this paper, a nonlinear relationship between pronunciation and auditory perception is introduced into speech recognition, and superior robustness is shown in the results. An Extreme Learning Machine (ELM) modelling this mapping was trained on the MOCHA-TIMIT database. Articulatory Features (AFs) obtained from the network were fused with MFCCs to train the DNN-HMM and GMM-HMM acoustic models in this experiment. The relative increment of WER is 117.0% with MFCCs-AFs-GMM-HMM, versus 125.6% with MFCCs-GMM-HMM, and the DNN-HMM model outperforms the GMM-HMM model in both relative and absolute terms.
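
The pipeline the abstract describes lends itself to a short sketch: an ELM regressor maps cepstral frames to articulatory trajectories, and the two streams are then fused frame-wise before acoustic-model training. The sketch below is illustrative only; the hidden size, ridge regularizer, and 13/14-dimensional toy data are assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_elm(X, Y, hidden=512, reg=1e-3):
    """Fit an ELM regressor: random fixed hidden layer, closed-form output weights."""
    d = X.shape[1]
    W = rng.normal(size=(d, hidden))   # random input weights (never trained)
    b = rng.normal(size=hidden)        # random biases (never trained)
    H = np.tanh(X @ W + b)             # hidden-layer activations
    # Ridge-regularized least squares for the output layer.
    beta = np.linalg.solve(H.T @ H + reg * np.eye(hidden), H.T @ Y)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy stand-ins for parallel acoustic/articulatory frames, e.g. 13-dim MFCCs
# and 14-dim EMA coordinates as provided by MOCHA-TIMIT (dimensions assumed).
mfcc = rng.normal(size=(1000, 13))
ema = rng.normal(size=(1000, 14))

W, b, beta = train_elm(mfcc, ema)
af = elm_predict(mfcc, W, b, beta)     # predicted articulatory features

# Fuse the cepstral and articulatory streams by frame-wise concatenation;
# the fused vectors would then feed GMM-HMM / DNN-HMM acoustic-model training.
fused = np.concatenate([mfcc, af], axis=1)   # shape: (frames, 27)
```

The closed-form ridge solution for the output layer is what makes the ELM cheap to train: only `beta` is learned, while the input weights `W` and biases `b` stay random.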

Original language: English
Title of host publication: 2017 3rd IEEE International Conference on Computer and Communications, ICCC 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1401-1405
Number of pages: 5
ISBN (Electronic): 9781509063505
DOIs
Publication status: Published - 2 Jul 2017
Event: 3rd IEEE International Conference on Computer and Communications, ICCC 2017 - Chengdu, China
Duration: 13 Dec 2017 - 16 Dec 2017

Publication series

Name: 2017 3rd IEEE International Conference on Computer and Communications, ICCC 2017
Volume: 2018-January

Conference

Conference: 3rd IEEE International Conference on Computer and Communications, ICCC 2017
Country/Territory: China
City: Chengdu
Period: 13/12/17 - 16/12/17

Keywords

  • DNN
  • ELM
  • articulatory features
  • robustness
  • speech recognition
