An improved tone labeling and prediction method with non-uniform segmentation of F0 contour

Xingyu Na*, Xiang Xie, Jingming Kuang, Yaling He

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper proposes a tone labeling technique for speech synthesis of tonal languages. Non-uniform segmentation using Viterbi alignment is introduced to determine the boundaries from which F0 symbols are obtained; these symbols are used as tonal labels to eliminate the mismatch between tone patterns and the F0 contours of the training data. During context clustering, the tendency of adjacent F0 state distributions is captured by the state-based phonetic trees. In the synthesis stage, the means of the tone model states are directly quantized to obtain the full tonal label. Both objective and subjective experimental results show that the proposed technique can improve the perceptual prosody of synthetic speech of non-professional speakers.
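
As a rough illustration of the two ideas named in the abstract (non-uniform segmentation of an F0 contour, then quantization of segment means into discrete symbols), the following Python sketch may help. It is not the authors' method: the abstract does not specify the alignment model or the quantizer, so this sketch swaps in a plain dynamic-programming segmentation that minimizes within-segment squared error in place of HMM Viterbi alignment, and a nearest-level quantizer. The function names `segment_contour` and `quantize_means`, the number of segments, and the quantization levels are all hypothetical.

```python
from typing import List, Tuple


def segment_contour(f0: List[float], n_segments: int) -> List[Tuple[int, int]]:
    """Split f0 into n_segments contiguous pieces of unequal length.

    Assumption (not from the paper): boundaries are chosen to minimize the
    total squared deviation of each piece from its own mean. Returns
    half-open (start, end) index pairs.
    """
    n = len(f0)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(f0)")

    # Prefix sums give the squared error of any slice in constant time.
    s1, s2 = [0.0], [0.0]
    for v in f0:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(i: int, j: int) -> float:
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[k][j]: minimal cost of covering f0[:j] with k segments.
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    back = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = best[k - 1][i] + cost(i, j)
                if c < best[k][j]:
                    best[k][j] = c
                    back[k][j] = i

    # Trace the boundaries back from the end of the contour.
    bounds = []
    j = n
    for k in range(n_segments, 0, -1):
        i = back[k][j]
        bounds.append((i, j))
        j = i
    return bounds[::-1]


def quantize_means(
    f0: List[float], bounds: List[Tuple[int, int]], levels: List[float]
) -> List[int]:
    """Map each segment mean to the index of the nearest level (hypothetical quantizer)."""
    symbols = []
    for start, end in bounds:
        mean = sum(f0[start:end]) / (end - start)
        symbols.append(min(range(len(levels)), key=lambda q: abs(levels[q] - mean)))
    return symbols
```

Calling `segment_contour` on a voiced F0 track and passing the resulting bounds to `quantize_means` yields one symbol per segment, which is the general shape of an F0-symbol label sequence; the actual label inventory and alignment used in the paper would need to be taken from the full text.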

Original language: English
Title of host publication: 2012 8th International Symposium on Chinese Spoken Language Processing, ISCSLP 2012
Pages: 252-255
Number of pages: 4
Publication status: Published - 2012
Event: 2012 8th International Symposium on Chinese Spoken Language Processing, ISCSLP 2012 - Hong Kong, China
Duration: 5 Dec 2012 to 8 Dec 2012

Keywords

  • F0 generation
  • F0 modeling
  • Statistical speech synthesis
  • Tone labeling