Enhanced encoder for non-autoregressive machine translation

Shuheng Wang, Shumin Shi*, Heyan Huang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Non-autoregressive machine translation aims to speed up decoding by discarding the autoregressive model and generating the target words independently. Because non-autoregressive machine translation cannot exploit target-side information, the ability to accurately model source representations is critical. In this paper, we propose an approach that enhances the encoder's modeling ability by using a pre-trained BERT model as an extra encoder. Because they use different tokenization methods, the BERT encoder and the Raw encoder model the source input from different aspects. Furthermore, through a gate mechanism, the decoder can dynamically determine how much each representation contributes to the decoding process. Experimental results on three translation tasks show that our method significantly improves the performance of non-autoregressive machine translation and surpasses the baseline non-autoregressive models. On the WMT14 EN→DE translation task, our method achieves 27.87 BLEU with a single decoding step, comparable to the baseline autoregressive Transformer model, which obtains 27.8 BLEU.
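The abstract describes a dual-encoder design in which a gate lets the decoder weigh the pre-trained BERT encoder against the Raw (standard Transformer) encoder. Below is a minimal PyTorch sketch of one plausible gated fusion of the two source representations; it is not the paper's exact formulation. The module name `GatedEncoderFusion`, its single projection layer, and the assumption that both encoder outputs have already been aligned to the same sequence length and hidden size are illustrative only.

```python
import torch
import torch.nn as nn


class GatedEncoderFusion(nn.Module):
    """Fuse two source representations with a learned sigmoid gate.

    Hypothetical sketch: `raw_repr` comes from the standard Transformer
    encoder and `bert_repr` from a pre-trained BERT encoder. The gate
    decides, per position and per dimension, how much each contributes.
    """

    def __init__(self, d_model: int):
        super().__init__()
        self.gate_proj = nn.Linear(2 * d_model, d_model)

    def forward(self, raw_repr: torch.Tensor, bert_repr: torch.Tensor) -> torch.Tensor:
        # raw_repr, bert_repr: (batch, src_len, d_model).
        # Assumption: the BERT output has already been re-aligned to the Raw
        # encoder's tokenization and projected to the same hidden size.
        gate = torch.sigmoid(self.gate_proj(torch.cat([raw_repr, bert_repr], dim=-1)))
        return gate * raw_repr + (1.0 - gate) * bert_repr


# Toy usage: fuse two dummy encoder outputs for the decoder to attend over.
if __name__ == "__main__":
    batch, src_len, d_model = 2, 7, 512
    fusion = GatedEncoderFusion(d_model)
    raw_out = torch.randn(batch, src_len, d_model)
    bert_out = torch.randn(batch, src_len, d_model)
    fused = fusion(raw_out, bert_out)
    print(fused.shape)  # torch.Size([2, 7, 512])
```

In the paper the gating may instead act on the decoder's cross-attention outputs over each encoder rather than on the encoder states directly; the sketch only illustrates the general idea of letting a learned gate interpolate between the two source representations.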

Original language: English
Pages (from-to): 595-609
Number of pages: 15
Journal: Machine Translation
Volume: 35
Issue number: 4
DOIs
Publication status: Published - Dec 2021

Keywords

  • Machine translation
  • Non-autoregressive
  • Pre-training language model
