Training set similarity based parameter selection for statistical machine translation

Xuewen Shi, Heyan Huang, Ping Jian*, Yi Kun Tang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Peer-reviewed

Abstract

Statistical machine translation (SMT) systems based on log-linear models are usually composed of multiple feature functions, each of which is assigned a weight as a model parameter. In this paper, we consider that different input source sentences may call for different model parameters. To adapt the model to different inputs, we propose a model parameter selection method for log-linear model based SMT systems. The method relies mainly on the characteristics of the feature functions themselves, without any assumption about unseen test sets. Experimental results on two language pairs (Zh-En and Ug-Zh) show that our method yields improvements of up to 2.4 and 2.2 BLEU points, respectively, and also demonstrate the good interpretability of the proposed method.

Original language: English
Title of host publication: Web and Big Data - Second International Joint Conference, APWeb-WAIM 2018, Proceedings
Editors: Jianliang Xu, Yoshiharu Ishikawa, Yi Cai
Publisher: Springer Verlag
Pages: 63-71
Number of pages: 9
ISBN (Print): 9783319968896
Publication status: Published - 2018
Event: 2nd Asia Pacific Web and Web-Age Information Management Joint Conference on Web and Big Data, APWeb-WAIM 2018 - Macau, China
Duration: 23 Jul 2018 to 25 Jul 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10987 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 2nd Asia Pacific Web and Web-Age Information Management Joint Conference on Web and Big Data, APWeb-WAIM 2018
Country/Territory: China
City: Macau
Period: 23/07/18 to 25/07/18
