Pretreatment for speech machine translation

Xiaofei Zhang*, Chong Feng, Heyan Huang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Natural spoken language contains many meaningless modal particles and dittographies; moreover, ASR (automatic speech recognition) often produces recognition errors, and its output contains no punctuation. The translation is therefore rather poor if ASR results are fed directly into MT (machine translation). It is thus necessary to transform the abnormal ASR results into normative text suitable for machine translation. In this paper, a pretreatment approach based on a conditional random field model is introduced to delete the meaningless modal particles and dittographies, correct the recognition errors, and punctuate the ASR results before machine translation. Experiments show that an MT BLEU score of 0.2497 is obtained, an improvement of 18.4% over the MT baseline without pretreatment.
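The abstract frames pretreatment as a sequence-labelling task handled by a conditional random field. The sketch below is not the authors' implementation; it is a minimal illustration, assuming the third-party package sklearn-crfsuite, of how a CRF tagger could mark tokens in an ASR hypothesis for deletion (fillers, dittographies) or for punctuation insertion. The label set (KEEP, DELETE, PUNC_COMMA, PUNC_PERIOD), the feature functions, and the toy training data are all illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of CRF-based ASR pretreatment (assumed package: sklearn-crfsuite).
# Labels and features are hypothetical and only illustrate the general idea.
import sklearn_crfsuite

def token_features(tokens, i):
    """Simple lexical/context features for token i (illustrative, not from the paper)."""
    return {
        "word": tokens[i],
        "prev": tokens[i - 1] if i > 0 else "<BOS>",
        "next": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
        "is_repeat": i > 0 and tokens[i] == tokens[i - 1],  # dittography cue
    }

# Tiny toy training set: DELETE marks fillers/repeats, PUNC_* inserts punctuation
# after the token, KEEP leaves the token unchanged.
train_tokens = [
    ["well", "well", "I", "think", "so", "uh"],
    ["he", "said", "yes", "then", "he", "left"],
]
train_labels = [
    ["DELETE", "KEEP", "KEEP", "KEEP", "PUNC_PERIOD", "DELETE"],
    ["KEEP", "KEEP", "PUNC_COMMA", "KEEP", "KEEP", "PUNC_PERIOD"],
]

X_train = [[token_features(s, i) for i in range(len(s))] for s in train_tokens]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
crf.fit(X_train, train_labels)

# Apply the tagger to a new ASR hypothesis and rewrite it before handing it to MT.
test = ["uh", "I", "I", "think", "it", "works"]
pred = crf.predict([[token_features(test, i) for i in range(len(test))]])[0]
out = []
for tok, tag in zip(test, pred):
    if tag == "DELETE":
        continue            # drop fillers and repeated tokens
    out.append(tok)
    if tag == "PUNC_COMMA":
        out.append(",")     # insert punctuation after the token
    elif tag == "PUNC_PERIOD":
        out.append(".")
print(" ".join(out))
```

The cleaned, punctuated string produced this way would then be passed to the MT system in place of the raw ASR output.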

Original language: English
Title of host publication: Computational Collective Intelligence
Subtitle of host publication: Technologies and Applications - Second International Conference, ICCCI 2010, Proceedings
Pages: 113-121
Number of pages: 9
Edition: PART 2
DOI
Publication status: Published - 2010
Event: 2nd International Conference on Computational Collective Intelligence - Technologies and Applications, ICCCI 2010 - Kaohsiung, Taiwan, China
Duration: 10 Nov 2010 → 12 Nov 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 6422 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 2nd International Conference on Computational Collective Intelligence - Technologies and Applications, ICCCI 2010
Country/Territory: Taiwan, China
City: Kaohsiung
Period: 10/11/10 → 12/11/10
