TY - GEN
T1 - Improving neural machine translation by achieving knowledge transfer with sentence alignment learning
AU - Shi, Xuewen
AU - Huang, Heyan
AU - Wang, Wenguan
AU - Jian, Ping
AU - Tang, Yi Kun
N1 - Publisher Copyright:
© 2019 Association for Computational Linguistics.
PY - 2019
Y1 - 2019
N2 - Neural Machine Translation (NMT) optimized by Maximum Likelihood Estimation (MLE) lacks the guarantee of translation adequacy. To alleviate this problem, we propose an NMT approach that heightens the adequacy in machine translation by transferring the semantic knowledge learned from bilingual sentence alignment. Specifically, we first design a discriminator that learns to estimate sentence aligning score over translation candidates, and then the learned semantic knowledge is transferred to the NMT model under an adversarial learning framework. We also propose a gated self-attention based encoder for sentence embedding. Furthermore, an N-pair training loss is introduced in our framework to aid the discriminator in better capturing lexical evidence in translation candidates. Experimental results show that our proposed method outperforms baseline NMT models on Chinese-to-English and English-to-German translation tasks. Further analysis also indicates the detailed semantic knowledge transferred from the discriminator to the NMT model.
AB - Neural Machine Translation (NMT) optimized by Maximum Likelihood Estimation (MLE) lacks the guarantee of translation adequacy. To alleviate this problem, we propose an NMT approach that heightens the adequacy in machine translation by transferring the semantic knowledge learned from bilingual sentence alignment. Specifically, we first design a discriminator that learns to estimate sentence aligning score over translation candidates, and then the learned semantic knowledge is transferred to the NMT model under an adversarial learning framework. We also propose a gated self-attention based encoder for sentence embedding. Furthermore, an N-pair training loss is introduced in our framework to aid the discriminator in better capturing lexical evidence in translation candidates. Experimental results show that our proposed method outperforms baseline NMT models on Chinese-to-English and English-to-German translation tasks. Further analysis also indicates the detailed semantic knowledge transferred from the discriminator to the NMT model.
UR - http://www.scopus.com/inward/record.url?scp=85084337134&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85084337134
T3 - CoNLL 2019 - 23rd Conference on Computational Natural Language Learning, Proceedings of the Conference
SP - 260
EP - 270
BT - CoNLL 2019 - 23rd Conference on Computational Natural Language Learning, Proceedings of the Conference
PB - Association for Computational Linguistics
T2 - 23rd Conference on Computational Natural Language Learning, CoNLL 2019
Y2 - 3 November 2019 through 4 November 2019
ER -