TY - JOUR
T1 - Similarity-aware neural machine translation
T2 - reducing human translator efforts by leveraging high-potential sentences with translation memory
AU - Zhang, Tianfu
AU - Huang, Heyan
AU - Feng, Chong
AU - Wei, Xiaochi
N1 - Publisher Copyright:
© 2020, Springer-Verlag London Ltd., part of Springer Nature.
PY - 2020/12
Y1 - 2020/12
N2 - In computer-aided translation tasks, reducing the time human translators spend reviewing and post-editing translations is meaningful. However, existing studies mainly aim to improve overall translation quality, which reduces only post-editing time. In this work, we first identify test sentences that are highly similar to the training set (high-potential sentences) to reduce reviewing time, and then focus on greatly improving their translation quality to reduce post-editing time. To this end, we propose two novel translation memory methods that characterize similarity between sentences along syntactic and template dimensions separately. Building on these, we propose similarity-aware neural machine translation (similarity-NMT), which consists of two independent modules: (1) an Identification Module, which identifies high-potential sentences in the test set according to multi-dimensional similarity information; and (2) a Translation Module, which integrates the multi-dimensional similarity information of parallel training sentence pairs into an attention-based NMT model through posterior regularization. Experiments on two Chinese ⇒ English domains validate the effectiveness and universality of the proposed method in reducing human translator effort.
KW - High-potential sentences
KW - Human translator efforts
KW - Neural machine translation
KW - Translation memory
UR - http://www.scopus.com/inward/record.url?scp=85084360589&partnerID=8YFLogxK
DO - 10.1007/s00521-020-04939-y
M3 - Article
AN - SCOPUS:85084360589
SN - 0941-0643
VL - 32
SP - 17623
EP - 17635
JO - Neural Computing and Applications
JF - Neural Computing and Applications
IS - 23
ER -