TY - CONF
T1 - Reducing Length Bias in Scoring Neural Machine Translation via a Causal Inference Method
AU - Shi, Xuewen
AU - Huang, Heyan
AU - Jian, Ping
AU - Tang, Yi Kun
N1 - Publisher Copyright:
© 2021 China National Conference on Computational Linguistics. Published under Creative Commons Attribution 4.0 International License.
PY - 2021
Y1 - 2021
N2 - Neural machine translation (NMT) usually employs beam search to expand the search space and obtain more translation candidates. However, increasing the beam size often yields many short translations, resulting in a dramatic decrease in translation quality. In this paper, we handle the length bias problem from the perspective of causal inference. Specifically, we regard the model-generated translation score S as the true translation quality degraded by some noise, one confounder of which is the translation length. We apply a Half-Sibling Regression method to remove the length effect on S, thereby obtaining a debiased translation score free of length information. The proposed method is model-agnostic and unsupervised, making it applicable to any NMT model and test dataset. We conduct experiments on three translation tasks with datasets of different scales. Experimental results and further analyses show that our approaches achieve performance comparable to the empirical baseline methods.
AB - Neural machine translation (NMT) usually employs beam search to expand the search space and obtain more translation candidates. However, increasing the beam size often yields many short translations, resulting in a dramatic decrease in translation quality. In this paper, we handle the length bias problem from the perspective of causal inference. Specifically, we regard the model-generated translation score S as the true translation quality degraded by some noise, one confounder of which is the translation length. We apply a Half-Sibling Regression method to remove the length effect on S, thereby obtaining a debiased translation score free of length information. The proposed method is model-agnostic and unsupervised, making it applicable to any NMT model and test dataset. We conduct experiments on three translation tasks with datasets of different scales. Experimental results and further analyses show that our approaches achieve performance comparable to the empirical baseline methods.
UR - http://www.scopus.com/inward/record.url?scp=85123422485&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85123422485
SP - 874
EP - 885
T2 - 20th Chinese National Conference on Computational Linguistics, CCL 2021
Y2 - 13 August 2021 through 15 August 2021
ER -