Correcting translation for non-autoregressive transformer

Shuheng Wang; Heyan Huang; Shumin Shi; Dongbai Li; Dongen Guo

doi:10.1016/j.asoc.2024.112488

Corresponding author: Shumin Shi

School of Computer Science and Technology

Nanyang Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

The non-autoregressive transformer has shown great success in recent years. It generally employs the encoder–decoder framework, in which the encoder maps the source sentence into a hidden representation and the decoder generates all target tokens simultaneously. Because the principle of the non-autoregressive transformer is consistent with the architecture of the encoder, we argue that it is somewhat wasteful for the encoder only to map the input sentence into a hidden representation. In this study, we propose a novel non-autoregressive transformer that fully exploits the capabilities of the encoder. Specifically, in our approach, the encoder not only encodes the input sentence into a hidden representation but also generates the target tokens. Consequently, the decoder is relieved of its responsibility to generate the target tokens and instead focuses on correcting the sentence produced by the encoder. We evaluate the proposed non-autoregressive transformer on three widely used translation tasks. The experimental results show that the proposed method significantly improves the performance of the non-autoregressive transformer, achieving 27.94 BLEU on the WMT14 EN → DE task, 33.96 BLEU on the WMT16 EN → RO task, and 33.85 BLEU on the IWSLT14 DE → EN task.
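The two-stage data flow described in the abstract (the encoder drafts the target tokens, the decoder corrects the draft) can be pictured with a toy sketch. This is not the authors' model: the lexicon, the function names, and the rule of dropping repeated tokens are all hypothetical stand-ins for learned components, chosen only to show which stage does what.

```python
# Toy illustration only: hypothetical names and rules, not the authors' model.
from typing import Dict, List

# Hypothetical word-level lexicon standing in for a trained encoder's output layer.
TOY_LEXICON: Dict[str, str] = {
    "vielen": "thanks",
    "dank": "thanks",
    "das": "the",
    "haus": "house",
}


def encoder_draft(source: List[str]) -> List[str]:
    """Stage 1: emit one draft target token per source position.

    Every position is decided independently of the others, which is the
    defining property of non-autoregressive generation.
    """
    return [TOY_LEXICON.get(token, "<unk>") for token in source]


def decoder_correct(draft: List[str]) -> List[str]:
    """Stage 2: revise the draft instead of generating from scratch.

    The toy rule drops a token that repeats its left neighbour. Each
    decision reads only the fixed draft, so all positions could be
    processed in parallel. A real decoder would use learned parameters.
    """
    return [
        token
        for i, token in enumerate(draft)
        if i == 0 or token != draft[i - 1]
    ]


def translate(source: List[str]) -> List[str]:
    """Run both stages: the encoder drafts, the decoder corrects."""
    return decoder_correct(encoder_draft(source))
```

With this toy lexicon, `translate(["vielen", "dank"])` first drafts `["thanks", "thanks"]` and then corrects it to `["thanks"]`.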

Original language	English
Article number	112488
Journal	Applied Soft Computing
Volume	168
DOIs	https://doi.org/10.1016/j.asoc.2024.112488
Publication status	Published - Jan 2025

Keywords

Correction
Encoder
Non-autoregressive

Cite this

Wang, S., Huang, H., Shi, S., Li, D., & Guo, D. (2025). Correcting translation for non-autoregressive transformer. Applied Soft Computing, 168, Article 112488. https://doi.org/10.1016/j.asoc.2024.112488

@article{ac379733d96740bc850515078663e5fd,
  title = "Correcting translation for non-autoregressive transformer",
  abstract = "The non-autoregressive transformer has shown great success in recent years. It generally employs the encoder–decoder framework, in which the encoder maps the source sentence into a hidden representation and the decoder generates all target tokens simultaneously. Because the principle of the non-autoregressive transformer is consistent with the architecture of the encoder, we argue that it is somewhat wasteful for the encoder only to map the input sentence into a hidden representation. In this study, we propose a novel non-autoregressive transformer that fully exploits the capabilities of the encoder. Specifically, in our approach, the encoder not only encodes the input sentence into a hidden representation but also generates the target tokens. Consequently, the decoder is relieved of its responsibility to generate the target tokens and instead focuses on correcting the sentence produced by the encoder. We evaluate the proposed non-autoregressive transformer on three widely used translation tasks. The experimental results show that the proposed method significantly improves the performance of the non-autoregressive transformer, achieving 27.94 BLEU on the WMT14 EN → DE task, 33.96 BLEU on the WMT16 EN → RO task, and 33.85 BLEU on the IWSLT14 DE → EN task.",
  keywords = "Correction, Encoder, Non-autoregressive",
  author = "Shuheng Wang and Heyan Huang and Shumin Shi and Dongbai Li and Dongen Guo",
  note = "Publisher Copyright: {\textcopyright} 2024 Elsevier B.V.",
  year = "2025",
  month = jan,
  doi = "10.1016/j.asoc.2024.112488",
  language = "English",
  volume = "168",
  journal = "Applied Soft Computing",
  issn = "1568-4946",
  publisher = "Elsevier B.V.",
}

TY  - JOUR
T1  - Correcting translation for non-autoregressive transformer
AU  - Wang, Shuheng
AU  - Huang, Heyan
AU  - Shi, Shumin
AU  - Li, Dongbai
AU  - Guo, Dongen
PY  - 2025/1
Y1  - 2025/1
N2  - The non-autoregressive transformer has shown great success in recent years. It generally employs the encoder–decoder framework, in which the encoder maps the source sentence into a hidden representation and the decoder generates all target tokens simultaneously. Because the principle of the non-autoregressive transformer is consistent with the architecture of the encoder, we argue that it is somewhat wasteful for the encoder only to map the input sentence into a hidden representation. In this study, we propose a novel non-autoregressive transformer that fully exploits the capabilities of the encoder. Specifically, in our approach, the encoder not only encodes the input sentence into a hidden representation but also generates the target tokens. Consequently, the decoder is relieved of its responsibility to generate the target tokens and instead focuses on correcting the sentence produced by the encoder. We evaluate the proposed non-autoregressive transformer on three widely used translation tasks. The experimental results show that the proposed method significantly improves the performance of the non-autoregressive transformer, achieving 27.94 BLEU on the WMT14 EN → DE task, 33.96 BLEU on the WMT16 EN → RO task, and 33.85 BLEU on the IWSLT14 DE → EN task.
KW  - Correction
KW  - Encoder
KW  - Non-autoregressive
UR  - http://www.scopus.com/inward/record.url?scp=85210313378&partnerID=8YFLogxK
U2  - 10.1016/j.asoc.2024.112488
DO  - 10.1016/j.asoc.2024.112488
M3  - Article
AN  - SCOPUS:85210313378
SN  - 1568-4946
VL  - 168
JO  - Applied Soft Computing
JF  - Applied Soft Computing
M1  - 112488
ER  -
