Similarity-aware neural machine translation: reducing human translator efforts by leveraging high-potential sentences with translation memory

Research output: Contribution to journal › Article › peer-review

Abstract

In computer-aided translation, reducing the time human translators spend reviewing and post-editing translations is valuable. However, existing studies mainly aim to improve overall translation quality, which reduces only post-editing time. In this work, we first identify test sentences that are highly similar to the training set (high-potential sentences) to reduce reviewing time, and then focus on substantially improving the translation quality of those sentences to reduce post-editing time. To this end, we first propose two novel translation memory methods that characterize similarity between sentences along the syntactic and template dimensions, respectively. Building on these, we propose a similarity-aware neural machine translation model (similarity-NMT) consisting of two independent modules: (1) an Identification Module, which identifies high-potential sentences in the test set according to multi-dimensional similarity information; and (2) a Translation Module, which integrates multi-dimensional similarity information from parallel training sentence pairs into an attention-based NMT model by leveraging posterior regularization. Experiments on two Chinese ⇒ English domains validate the effectiveness and universality of the proposed method in reducing human translator effort.
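
As a rough illustration of the identification idea only, the sketch below flags test sentences whose best match in a translation memory exceeds a similarity threshold. It is not the paper's method: the abstract describes syntactic and template similarity, whereas this sketch substitutes a plain token-overlap ratio from Python's standard `difflib`. The function names, the 0.7 threshold, and the toy data are all hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_score(a: str, b: str) -> float:
    """Token-level similarity ratio in [0, 1].

    Stand-in metric (assumption): the paper uses syntactic and
    template similarity, not this surface-level ratio.
    """
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def best_match(sentence, memory):
    """Return (score, source, target) for the most similar memory entry."""
    return max((fuzzy_score(sentence, src), src, tgt) for src, tgt in memory)


def split_by_potential(test_sentences, memory, threshold=0.7):
    """Partition test sentences into high-potential and remaining ones.

    The threshold value is an arbitrary illustrative choice.
    """
    high, rest = [], []
    for sentence in test_sentences:
        score, _src, tgt = best_match(sentence, memory)
        bucket = high if score >= threshold else rest
        bucket.append((sentence, score, tgt))
    return high, rest


# Toy translation memory of (source, target) pairs; targets are placeholders.
memory = [
    ("please restart the server now", "TARGET_1"),
    ("the license has expired", "TARGET_2"),
]
high, rest = split_by_potential(
    ["please restart the router now", "what is the weather today"],
    memory,
)
```

In a workflow like the one the abstract describes, sentences in `high` would be the ones a reviewer could expect to need less attention, while those in `rest` would go through normal review.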

Original language: English
Pages (from-to): 17623-17635
Number of pages: 13
Journal: Neural Computing and Applications
Volume: 32
Issue number: 23
Publication status: Published - Dec 2020

Keywords

  • High-potential sentences
  • Human translator efforts
  • Neural machine translation
  • Translation memory
