ASR normalization for machine translation

Heyan Huang*, Chong Feng, Jiande Wang, Xiaofei Zhang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Natural spoken language contains many meaningless modal particles and dittographs. In addition, ASR (automatic speech recognition) output often contains recognition errors and has no punctuation. As a result, translation quality is rather poor when ASR output is passed directly to MT (machine translation). This paper introduces an ASR normalization approach for machine translation based on a maximum entropy sequential labeling model. Before translation, meaningless modal particles and dittographs are deleted, recognition errors are corrected, and the ASR output is punctuated. Experiments show that the approach obtains an MT BLEU score of 0.2465, a 17.3% improvement over the MT baseline without normalization. These positive results confirm that ASR normalization effectively improves translation quality in spoken language machine translation.
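
To make the sequence-labeling framing concrete, the following is a minimal sketch of how per-token labels could drive normalization. It is not the paper's model: the label set, the filler list, the English example tokens, and the function names (`toy_labeler`, `apply_labels`, `normalize`) are all assumptions made for illustration, and a simple rule-based labeler stands in for the trained maximum entropy model. Recognition-error correction is not covered.

```python
from typing import List

# Hypothetical label set (an assumption, not taken from the paper).
KEEP = "KEEP"            # keep the token unchanged
DELETE = "DELETE"        # drop the token (filler or repetition)
KEEP_COMMA = "KEEP+,"    # keep the token and append a comma
KEEP_PERIOD = "KEEP+."   # keep the token and append a period

# Illustrative filler list; a trained model would learn this from data.
FILLERS = {"uh", "um", "er"}


def toy_labeler(tokens: List[str]) -> List[str]:
    """Rule-based stand-in for a trained sequence labeler."""
    labels = []
    for i, tok in enumerate(tokens):
        if tok.lower() in FILLERS:
            labels.append(DELETE)
        elif i + 1 < len(tokens) and tok.lower() == tokens[i + 1].lower():
            # Immediate repetition: drop this copy, keep the later one.
            labels.append(DELETE)
        elif i == len(tokens) - 1:
            labels.append(KEEP_PERIOD)
        else:
            labels.append(KEEP)
    return labels


def apply_labels(tokens: List[str], labels: List[str]) -> str:
    """Build the normalized string from per-token labels."""
    out = []
    for tok, lab in zip(tokens, labels):
        if lab == DELETE:
            continue
        if lab == KEEP_COMMA:
            out.append(tok + ",")
        elif lab == KEEP_PERIOD:
            out.append(tok + ".")
        else:
            out.append(tok)
    return " ".join(out)


def normalize(asr_text: str) -> str:
    """Label each whitespace-separated token, then apply the labels."""
    tokens = asr_text.split()
    return apply_labels(tokens, toy_labeler(tokens))


print(normalize("uh i i want to um book a room"))
```

Casting deletion and punctuation as one label per token means a single pass over the ASR output can handle both; in a trained setting, `toy_labeler` would be replaced by a classifier that predicts each label from features of the surrounding tokens.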

Original language: English
Title of host publication: Proceedings - 2010 2nd International Conference on Intelligent Human-Machine Systems and Cybernetics, IHMSC 2010
Pages: 91-94
Number of pages: 4
DOI
Publication status: Published - 2010
Event: 2010 2nd International Conference on Intelligent Human-Machine Systems and Cybernetics, IHMSC 2010 - Nanjing, China
Duration: 26 Aug 2010 to 28 Aug 2010

Publication series

Name: Proceedings - 2010 2nd International Conference on Intelligent Human-Machine Systems and Cybernetics, IHMSC 2010
Volume: 2

Conference

Conference: 2010 2nd International Conference on Intelligent Human-Machine Systems and Cybernetics, IHMSC 2010
Country/Territory: China
City: Nanjing
Period: 26/08/10 to 28/08/10
