Removing Input Confounder for Translation Quality Estimation via a Causal Motivated Method

Xuewen Shi, Heyan Huang, Ping Jian*, Yi Kun Tang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Most state-of-the-art quality estimation (QE) systems built on neural networks achieve promising performance on benchmark datasets. However, the performance of these methods is easily influenced by inherent features of the model input, such as the length of the input sequence or the number of unseen tokens. In this paper, we introduce a causal-inference-based method to eliminate the negative impact of these input characteristics on a QE system. Specifically, we propose an iterative denoising framework that handles multiple confounding features. The confounder elimination at each iteration step is implemented with a method based on Half-Sibling Regression. We conduct our experiments on the official datasets and submissions from the WMT 2020 Quality Estimation Shared Task on Sentence-Level Direct Assessment. Experimental results show that the denoised QE results achieve better Pearson's correlation with human assessments than the original submissions.
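The iterative denoising idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain linear regressor as the Half-Sibling Regression estimator, treats one confounding feature per iteration, and runs on synthetic data. The function names `hsr_step` and `iterative_denoise`, the feature names, and all numeric values are hypothetical.

```python
import numpy as np


def hsr_step(scores, feature):
    """One denoising step: subtract the part of `scores` that is linearly
    predictable from a single confounding input feature (assumed linear model)."""
    scores = np.asarray(scores, dtype=float)
    feature = np.asarray(feature, dtype=float)
    # Design matrix with an intercept column.
    design = np.column_stack([feature, np.ones_like(feature)])
    coef, *_ = np.linalg.lstsq(design, scores, rcond=None)
    predicted = design @ coef
    # Keep the residual and add the mean back so the score scale is preserved.
    return scores - predicted + scores.mean()


def iterative_denoise(scores, features):
    """Apply one denoising step per confounding feature, in order."""
    denoised = np.asarray(scores, dtype=float)
    for feature in features:
        denoised = hsr_step(denoised, feature)
    return denoised


# Synthetic illustration only: QE scores contaminated by two input features.
rng = np.random.default_rng(0)
length = rng.integers(5, 60, size=200).astype(float)  # hypothetical sentence lengths
unseen = rng.poisson(2.0, size=200).astype(float)     # hypothetical unseen-token counts
quality = rng.normal(size=200)                        # stand-in for the true signal
raw = quality - 0.03 * length - 0.1 * unseen          # contaminated QE output
denoised = iterative_denoise(raw, [length, unseen])
```

Because each step removes only a least-squares fit on one feature, the output is guaranteed to be linearly uncorrelated only with the feature processed last; a nonlinear regressor or a different feature ordering would give different results.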

Original language: English
Title of host publication: Web and Big Data - 5th International Joint Conference, APWeb-WAIM 2021, Proceedings
Editors: Leong Hou U, Marc Spaniol, Yasushi Sakurai, Junying Chen
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 358-364
Number of pages: 7
ISBN (Print): 9783030858957
Publication status: Published - 2021
Event: 5th International Joint Conference on Asia-Pacific Web and Web-Age Information Management, APWeb-WAIM 2021 - Guangzhou, China
Duration: 23 Aug 2021 → 25 Aug 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12858 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 5th International Joint Conference on Asia-Pacific Web and Web-Age Information Management, APWeb-WAIM 2021
Country/Territory: China
City: Guangzhou
Period: 23/08/21 → 25/08/21

Keywords

  • Causal inference
  • Machine translation
  • Quality estimation
