TY - JOUR
T1 - Similarity-aware neural machine translation
T2 - reducing human translator efforts by leveraging high-potential sentences with translation memory
AU - Zhang, Tianfu
AU - Huang, Heyan
AU - Feng, Chong
AU - Wei, Xiaochi
N1 - Publisher Copyright:
© 2020, Springer-Verlag London Ltd., part of Springer Nature.
PY - 2020/12
Y1 - 2020/12
N2 - In computer-aided translation tasks, reducing the time human translators spend reviewing and post-editing translations is meaningful. However, existing studies mainly aim to improve overall translation quality, which reduces only post-editing time. In this work, we first identify test sentences that are highly similar to the training set (high-potential sentences) to reduce reviewing time, and then focus on greatly improving their translation quality to reduce post-editing time. To this end, we propose two novel translation memory methods that characterize similarity between sentences along syntactic and template dimensions separately. Building on these, we propose similarity-aware neural machine translation (similarity-NMT), which consists of two independent modules: (1) an Identification Module, which identifies high-potential sentences in the test set according to multi-dimensional similarity information; and (2) a Translation Module, which integrates the multi-dimensional similarity information of parallel training sentence pairs into an attention-based NMT model through posterior regularization. Experiments on two Chinese ⇒ English domains validate the effectiveness and universality of the proposed method in reducing human translator effort.
KW - High-potential sentences
KW - Human translator efforts
KW - Neural machine translation
KW - Translation memory
UR - http://www.scopus.com/inward/record.url?scp=85084360589&partnerID=8YFLogxK
DO - 10.1007/s00521-020-04939-y
M3 - Article
AN - SCOPUS:85084360589
SN - 0941-0643
VL - 32
SP - 17623
EP - 17635
JO - Neural Computing and Applications
JF - Neural Computing and Applications
IS - 23
ER -