Mandarin connected digits recognition for whispered speech

Tingting Ru*, Xiang Xie, Hui Yin, Jingming Kuang

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

Abstract

This paper discusses the acoustic characteristics and recognition of whispered speech. A Mandarin digits database is built in both normal and whispered speech. The collected speech material is analyzed to verify the characteristics of, and the differences between, the two kinds of speech. Cross recognition is carried out using normal and whispered speech as training and testing data respectively, and the detailed recognition results are analyzed using confusion matrices. The results show that models trained on normal speech are not suitable for recognizing whispered speech, and that the word correct rate for whispered speech is closely related to its acoustic characteristics. Some possible solutions are also suggested.

Original language: English
Pages (from-to): 1141-1144
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 2008
Event: INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association - Brisbane, QLD, Australia
Duration: 22 Sept 2008 to 26 Sept 2008

Keywords

  • Confusion matrix
  • Connected digits
  • Speech recognition
  • Whispered speech
