Tone generation by maximizing joint likelihood of syllabic HMMs for Mandarin speech synthesis

Xing Yu Na, Chao Min Wang, Xiang Xie, Jing Ming Kuang, Ya Ling He

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

A tone generation method that maximizes the joint likelihood of syllabic HMMs is proposed to improve Mandarin speech synthesis. The F0 sequence is generated by jointly maximizing the likelihood of the state-level F0 model and the syllable-level tone model, under a constraint on the mean F0 of the adjacent units. The optimal weight of the tone component is found by searching over the parameter generation error and the correlation coefficient. Objective and subjective evaluations both show that the method has a positive effect: the generation error is reduced by 26.7%, the correlation coefficient is increased by 6.5%, and prosody perception is significantly improved.
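The abstract does not give the model equations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that both the state-level F0 model and the syllable-level tone model reduce to independent per-frame Gaussians over log-F0, in which case maximizing the weighted joint log-likelihood has a closed form: a precision-weighted average of the two means. The mean-F0 constraint on adjacent units is omitted. All function names, the candidate weights, and the toy values are hypothetical.

```python
import math

def generate_f0(state_mean, state_var, tone_mean, tone_var, weight):
    """Maximize log N(f; state) + weight * log N(f; tone), frame by frame.

    Assumption: independent Gaussians per frame, so the maximizer is a
    precision-weighted average of the two means.
    """
    f0 = []
    for mu, var, m, s in zip(state_mean, state_var, tone_mean, tone_var):
        p_state = 1.0 / var       # precision of the state-level model
        p_tone = weight / s       # weighted precision of the tone model
        f0.append((p_state * mu + p_tone * m) / (p_state + p_tone))
    return f0

def rmse(generated, reference):
    """Root-mean-square generation error between two contours."""
    n = len(reference)
    return math.sqrt(sum((g - r) ** 2 for g, r in zip(generated, reference)) / n)

def correlation(x, y):
    """Pearson correlation coefficient (undefined for constant contours)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def search_weight(candidates, state_mean, state_var, tone_mean, tone_var, reference):
    """Pick the weight with the lowest RMSE, ties broken by higher correlation."""
    best_score, best_weight = None, None
    for w in candidates:
        f0 = generate_f0(state_mean, state_var, tone_mean, tone_var, w)
        score = (rmse(f0, reference), -correlation(f0, reference))
        if best_score is None or score < best_score:
            best_score, best_weight = score, w
    return best_weight

# Toy log-F0 values for one four-frame syllable (made up for illustration).
state_mean = [5.0, 5.1, 5.2, 5.1]
state_var = [0.04, 0.04, 0.04, 0.04]
tone_mean = [5.2, 5.3, 5.2, 5.0]
tone_var = [0.04, 0.04, 0.04, 0.04]
reference = [5.15, 5.25, 5.2, 5.02]

best = search_weight([0.0, 0.5, 1.0, 2.0], state_mean, state_var,
                     tone_mean, tone_var, reference)
print(best, generate_f0(state_mean, state_var, tone_mean, tone_var, best))
```

With a weight of 0 the sketch falls back to the state-level means alone, and larger weights pull the contour toward the tone model. How the paper actually combines the two search criteria is not stated in the abstract; ranking by error first and correlation second is one arbitrary choice.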

Original language: English
Title of host publication: Proceedings of the 6th International Conference on Speech Prosody, SP 2012
Publisher: Tongji University Press
Pages: 23-26
Number of pages: 4
ISBN (Print): 9787560848693
Publication status: Published - 2012
Event: 6th International Conference on Speech Prosody 2012, SP 2012 - Shanghai, China
Duration: 22 May 2012 to 25 May 2012

Publication series

Name: Proceedings of the 6th International Conference on Speech Prosody, SP 2012
Volume: 1

Conference

Conference: 6th International Conference on Speech Prosody 2012, SP 2012
Country/Territory: China
City: Shanghai
Period: 22/05/12 to 25/05/12

Keywords

  • F0 contour
  • Speech prosody
  • Speech synthesis
  • Tone generation
