Improving non-autoregressive machine translation via autoregressive training

Shuheng Wang, Shumin Shi*, Heyan Huang, Wei Zhang

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

Abstract

In recent years, non-autoregressive machine translation has attracted considerable attention from researchers. Non-autoregressive translation (NAT) achieves faster decoding than autoregressive translation (AT), at the cost of translation accuracy. Since NAT and AT models share a similar architecture, a natural idea is to use the AT task to assist the NAT task. Previous works use curriculum learning or distillation to improve the performance of NAT models; however, they are complex to follow and difficult to integrate into new work. In this paper, we therefore introduce a simple multi-task framework to improve the performance of the NAT task. Specifically, we use a fully shared encoder-decoder network to train the NAT task and the AT task simultaneously. To evaluate our model, we conduct experiments on several benchmark tasks, including WMT14 EN-DE, WMT16 EN-RO, and IWSLT14 DE-EN. The experimental results demonstrate that our model achieves improvements while remaining simple.
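
The abstract describes training one fully shared encoder-decoder on the AT and NAT objectives at the same time. Below is a minimal sketch of what such joint multi-task training could look like in PyTorch; the mask-token decoder inputs for the NAT pass, the loss weight `lam`, and all hyperparameters are illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn as nn


class SharedEncDec(nn.Module):
    """Fully shared encoder-decoder used for both the AT and NAT passes.

    Positional encodings are omitted for brevity; a real model needs them.
    """

    def __init__(self, vocab_size, d_model=512, nhead=8, layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=layers, num_decoder_layers=layers,
            batch_first=True)
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, src, tgt_in, causal=False):
        mask = None
        if causal:
            # AT pass: block attention to future target positions.
            mask = self.transformer.generate_square_subsequent_mask(
                tgt_in.size(1)).to(src.device)
        h = self.transformer(self.embed(src), self.embed(tgt_in), tgt_mask=mask)
        return self.proj(h)


def joint_loss(model, src, tgt, pad_id=0, mask_id=1, lam=0.5):
    """Weighted sum of the AT and NAT losses on one shared model.

    `lam` and the NAT decoder-input choice are assumptions for this sketch.
    """
    ce = nn.CrossEntropyLoss(ignore_index=pad_id)
    # AT pass: teacher forcing with shifted targets and a causal mask.
    at_logits = model(src, tgt[:, :-1], causal=True)
    at_loss = ce(at_logits.reshape(-1, at_logits.size(-1)),
                 tgt[:, 1:].reshape(-1))
    # NAT pass: predict all target tokens in parallel from mask tokens
    # (one common NAT decoder-input choice; the paper's may differ).
    nat_in = torch.full_like(tgt, mask_id)
    nat_logits = model(src, nat_in, causal=False)
    nat_loss = ce(nat_logits.reshape(-1, nat_logits.size(-1)),
                  tgt.reshape(-1))
    return lam * at_loss + (1 - lam) * nat_loss


# Toy usage: one joint update over a random batch.
model = SharedEncDec(vocab_size=32000)
src = torch.randint(2, 32000, (4, 20))  # 4 source sentences, length 20
tgt = torch.randint(2, 32000, (4, 18))  # 4 target sentences, length 18
loss = joint_loss(model, src, tgt)
loss.backward()
```

Because the encoder and decoder parameters are fully shared across the two passes, the AT objective regularizes the same network that serves the NAT task, which is the core idea the abstract states.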

Original language: English
Article number: 012045
Journal: Journal of Physics: Conference Series
Volume: 2031
Issue number: 1
DOI: https://doi.org/10.1088/1742-6596/2031/1/012045
Publication status: Published - 30 Sept 2021
Event: 2021 2nd International Conference on Signal Processing and Computer Science, SPCS 2021 - Qingdao, Virtual, China
Duration: 20 Aug 2021 – 22 Aug 2021


Cite this

Wang, S., Shi, S., Huang, H., & Zhang, W. (2021). Improving non-autoregressive machine translation via autoregressive training. Journal of Physics: Conference Series, 2031(1), Article 012045. https://doi.org/10.1088/1742-6596/2031/1/012045