Improving non-autoregressive machine translation via autoregressive training

Shuheng Wang, Shumin Shi*, Heyan Huang, Wei Zhang

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

Abstract

In recent years, non-autoregressive machine translation has attracted considerable attention from researchers. Non-autoregressive translation (NAT) achieves faster decoding than autoregressive translation (AT), at the cost of translation accuracy. Since NAT and AT models share a similar architecture, a natural idea is to use the AT task to assist the NAT task. Previous works use curriculum learning or distillation to improve the performance of NAT models; however, they are complex to follow and difficult to integrate into new work. In this paper, we therefore introduce a simple multi-task framework to improve the performance of the NAT task. Specifically, we use a fully shared encoder-decoder network to train the NAT task and the AT task simultaneously. To evaluate our model, we conduct experiments on several benchmark tasks, including WMT14 EN-DE, WMT16 EN-RO, and IWSLT14 DE-EN. The experimental results demonstrate that our model achieves improvements while remaining simple.
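
The abstract describes training one fully shared encoder-decoder on the AT and NAT objectives at the same time. Below is a minimal sketch of what such joint multi-task training could look like in PyTorch; the mask-token decoder inputs for the NAT pass, the loss weight `lam`, and all hyperparameters are illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn as nn


class SharedEncDec(nn.Module):
    """Fully shared encoder-decoder used for both the AT and NAT passes.

    Positional encodings are omitted for brevity; a real model needs them.
    """

    def __init__(self, vocab_size, d_model=512, nhead=8, layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=layers, num_decoder_layers=layers,
            batch_first=True)
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, src, tgt_in, causal=False):
        mask = None
        if causal:
            # AT pass: block attention to future target positions.
            mask = self.transformer.generate_square_subsequent_mask(
                tgt_in.size(1)).to(src.device)
        h = self.transformer(self.embed(src), self.embed(tgt_in), tgt_mask=mask)
        return self.proj(h)


def joint_loss(model, src, tgt, pad_id=0, mask_id=1, lam=0.5):
    """Weighted sum of the AT and NAT losses on one shared model.

    `lam` and the NAT decoder-input choice are assumptions for this sketch.
    """
    ce = nn.CrossEntropyLoss(ignore_index=pad_id)
    # AT pass: teacher forcing with shifted targets and a causal mask.
    at_logits = model(src, tgt[:, :-1], causal=True)
    at_loss = ce(at_logits.reshape(-1, at_logits.size(-1)),
                 tgt[:, 1:].reshape(-1))
    # NAT pass: predict all target tokens in parallel from mask tokens
    # (one common NAT decoder-input choice; the paper's may differ).
    nat_in = torch.full_like(tgt, mask_id)
    nat_logits = model(src, nat_in, causal=False)
    nat_loss = ce(nat_logits.reshape(-1, nat_logits.size(-1)),
                  tgt.reshape(-1))
    return lam * at_loss + (1 - lam) * nat_loss


# Toy usage: one joint update over a random batch.
model = SharedEncDec(vocab_size=32000)
src = torch.randint(2, 32000, (4, 20))  # 4 source sentences, length 20
tgt = torch.randint(2, 32000, (4, 18))  # 4 target sentences, length 18
loss = joint_loss(model, src, tgt)
loss.backward()
```

Because the encoder and decoder parameters are fully shared across the two passes, the AT objective regularizes the same network that serves the NAT task, which is the core idea the abstract states.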

Original language: English
Article number: 012045
Journal: Journal of Physics: Conference Series
Volume: 2031
Issue number: 1
DOI: https://doi.org/10.1088/1742-6596/2031/1/012045
Publication status: Published - 30 Sept 2021
Event: 2021 2nd International Conference on Signal Processing and Computer Science, SPCS 2021 - Qingdao, Virtual, China
Duration: 20 Aug 2021 – 22 Aug 2021


Cite this

Wang, S., Shi, S., Huang, H., & Zhang, W. (2021). Improving non-autoregressive machine translation via autoregressive training. Journal of Physics: Conference Series, 2031(1), Article 012045. https://doi.org/10.1088/1742-6596/2031/1/012045