Improving non-autoregressive machine translation via autoregressive training

Shuheng Wang, Shumin Shi*, Heyan Huang, Wei Zhang

*Corresponding author of this work

Research output: Contribution to journal › Conference article › peer-review

Abstract

In recent years, non-autoregressive machine translation has attracted many researchers' attention. Non-autoregressive translation (NAT) achieves faster decoding speed at the cost of translation accuracy compared with autoregressive translation (AT). Since NAT and AT models share a similar architecture, a natural idea is to use the AT task to assist the NAT task. Previous works use curriculum learning or distillation to improve the performance of the NAT model. However, they are complex to follow and difficult to integrate into new works. In this paper, we therefore introduce a simple multi-task framework to improve the performance of the NAT task. Specifically, we use a fully shared encoder-decoder network to train the NAT task and the AT task simultaneously. To evaluate the performance of our model, we conduct experiments on several benchmark tasks, including WMT14 EN-DE, WMT16 EN-RO, and IWSLT14 DE-EN. The experimental results demonstrate that our model achieves improvements while remaining simple.
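The multi-task idea in the abstract can be sketched as a single shared encoder-decoder trained under both objectives, with the AT and NAT cross-entropy losses combined into one training signal. The sketch below is a minimal illustration of such a joint objective; the interpolation weight `lam` and the per-position probability inputs are assumptions for illustration, not the paper's actual formulation.

```python
import math

def cross_entropy(probs, target_idx):
    """Negative log-likelihood of the gold token under a probability vector."""
    return -math.log(probs[target_idx])

def joint_loss(at_probs, at_targets, nat_probs, nat_targets, lam=0.5):
    """Combine AT and NAT losses from a shared encoder-decoder.

    at_probs / nat_probs: per-position output distributions from the same
    decoder run autoregressively (causal mask) and non-autoregressively
    (all positions predicted in parallel). `lam` is a hypothetical
    balancing weight between the two tasks.
    """
    loss_at = sum(cross_entropy(p, t) for p, t in zip(at_probs, at_targets))
    loss_nat = sum(cross_entropy(p, t) for p, t in zip(nat_probs, nat_targets))
    return lam * loss_at + (1.0 - lam) * loss_nat

# Toy example: two target positions, vocabulary of size 3.
at_out = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
nat_out = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2]]
loss = joint_loss(at_out, [0, 1], nat_out, [0, 1], lam=0.5)
```

Because the encoder-decoder parameters are fully shared, a single backward pass through this combined loss updates one set of weights with gradients from both tasks.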

Original language: English
Article number: 012045
Journal: Journal of Physics: Conference Series
Volume: 2031
Issue: 1
DOI
Publication status: Published - 30 Sep 2021
Event: 2021 2nd International Conference on Signal Processing and Computer Science, SPCS 2021 - Qingdao, Virtual, China
Duration: 20 Aug 2021 - 22 Aug 2021
