Order adaptation of the fractional Fourier transform using the intraframe pitch change rate for speech recognition

Hui Yin*, Climent Nadeu, Volker Hohmann, Xiang Xie, Jingming Kuang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Citations (Scopus)

Abstract

We propose an acoustic feature for speech recognition that combines MFCCs with the fractional Fourier transform (FrFT). The FrFT transform orders are set adaptively according to the intraframe pitch change rate. The method is motivated by the fact that speech is not stationary even over a short period of time; the idea is illustrated using an AM-FM speech model and spectrograms of an artificial periodic signal. Experiments were conducted on the intervocalic English consonants provided by the Interspeech 2008 Consonant Challenge and on a Mandarin connected-digits corpus. The performance of the proposed method is compared with an MFCC baseline system. Experimental results show that the proposed features achieve a slightly better recognition rate than MFCCs, presumably because they can better track the dynamic characteristics of the speech harmonics.
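
The order adaptation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the mapping from pitch change rate to transform order, so the code below assumes one common convention: time and frequency are normalized to a dimensionless plane, a linear chirp becomes a straight line with slope `rate * frame_len / fs**2`, and order 1 corresponds to the ordinary Fourier transform. The function names, the linear-pitch assumption within a frame, and the `harmonic` parameter are all hypothetical and are not taken from the paper.

```python
import math


def pitch_change_rate(f0_start_hz, f0_end_hz, frame_len, fs):
    """Intraframe pitch change rate in Hz/s.

    Assumes (for illustration only) that the pitch varies linearly
    from f0_start_hz to f0_end_hz across a frame of frame_len samples.
    """
    duration_s = frame_len / fs
    return (f0_end_hz - f0_start_hz) / duration_s


def frft_order(rate_hz_per_s, frame_len, fs, harmonic=1):
    """Map a chirp rate to an FrFT order under an assumed convention.

    The m-th harmonic of a voiced frame sweeps m times faster than the
    pitch itself, so its chirp rate is harmonic * rate_hz_per_s.
    In the normalized time-frequency plane the chirp has slope
    chirp_rate * frame_len / fs**2; the order is chosen so that the
    rotation angle order * pi / 2 equals pi / 2 plus the slope angle.
    A stationary frame (rate 0) therefore gets order 1.
    """
    slope = harmonic * rate_hz_per_s * frame_len / fs ** 2
    return 1.0 + (2.0 / math.pi) * math.atan(slope)


if __name__ == "__main__":
    fs = 16000          # sampling rate in Hz (illustrative value)
    frame_len = 400     # 25 ms frame at 16 kHz (illustrative value)
    rate = pitch_change_rate(100.0, 110.0, frame_len, fs)
    for m in (1, 5, 10):
        print(m, frft_order(rate, frame_len, fs, harmonic=m))
```

Under this convention a frame with constant pitch keeps order 1, and the order moves further from 1 as the pitch changes faster or as higher harmonics are targeted. Other FrFT sign and scaling conventions would change the formula.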

Original language: English
Title of host publication: Proceedings - 2008 6th International Symposium on Chinese Spoken Language Processing, ISCSLP 2008
Pages: 193-196
Number of pages: 4
Publication status: Published - 2008
Event: 2008 6th International Symposium on Chinese Spoken Language Processing, ISCSLP 2008 - Kunming, China
Duration: 16 Dec 2008 → 19 Dec 2008

Publication series

Name: Proceedings - 2008 6th International Symposium on Chinese Spoken Language Processing, ISCSLP 2008

Conference

Conference: 2008 6th International Symposium on Chinese Spoken Language Processing, ISCSLP 2008
Country/Territory: China
City: Kunming
Period: 16/12/08 → 19/12/08

Keywords

  • Consonant Challenge
  • Feature extraction
  • Fractional Fourier transform
  • Pitch
  • Speech recognition
