Trustworthy machine reading comprehension with conditional adversarial calibration

Zhijing Wu, Hua Xu*

*Corresponding author for this work

Research output: Contribution to journal, article, peer-reviewed

1 Citation (Scopus)

Abstract

Machine Reading Comprehension (MRC) has achieved impressive answer inference performance in recent years, but work in this area rarely considers the trustworthiness and reliability of deployed systems. In real-world applications, however, it is crucial to estimate predictive uncertainty, that is, how likely a prediction is to be wrong, so that the system can abstain from low-confidence predictions and remain trustworthy. Prior studies measure predictive uncertainty by post-processing, for example by using the heuristic softmax probability or by training a calibrator on top of a trained MRC model. However, they calibrate confidence without considering domain adaptation. To address these limitations, this paper presents TrustMRC, a non-postprocessing trustworthy MRC system that leverages (1) a conditional calibration strategy to obtain reliable uncertainty estimates, and (2) a conditional adversarial learning strategy to learn transferable representations under domain shift. On the one hand, to estimate predictive uncertainty, a conditional calibration module is proposed to predict whether the output of the answer prediction module is correct, and it is combined with an additional ECE (expected calibration error) constraint to make the confidence more reliable. On the other hand, to handle domain shift, TrustMRC uses a conditional adversarial learning strategy that learns transferable representations through a domain discriminator with uncertainty constraints, taking both input and uncertainty alignment into account. In addition, TrustMRC completes answer prediction and uncertainty prediction in an end-to-end framework, so that the two sub-tasks can benefit from each other via multi-task learning. Instead of the traditional EM and F1 metrics, EM-coverage and F1-coverage curves are used for trustworthiness-aware MRC evaluation. Experimental results on the SQuAD 1.1, Natural Questions, and NewsQA datasets indicate that TrustMRC can make reliable predictions under domain shift.
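
The two evaluation ideas named in the abstract, ECE and score-coverage curves, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the equal-width binning with 10 bins, and the toy inputs are assumptions made for illustration only. ECE is computed as the weighted average gap between mean confidence and accuracy per confidence bin; the coverage curve answers the most confident questions first and abstains on the rest.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted mean of |avg confidence - accuracy| over equal-width bins.

    Assumption: bins are [k/n_bins, (k+1)/n_bins), with 1.0 in the last bin.
    """
    n = len(confidences)
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for items in bins:
        if not items:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(h for _, h in items) / len(items)
        ece += (len(items) / n) * abs(avg_conf - accuracy)
    return ece


def coverage_curve(
    confidences: Sequence[float], scores: Sequence[float]
) -> List[Tuple[float, float]]:
    """Return (coverage, mean score) points.

    At coverage k/N the system answers its k most confident questions and
    abstains on the others. Pass 0/1 exact-match values for an EM-coverage
    curve, or per-question F1 values for an F1-coverage curve.
    """
    order = sorted(range(len(confidences)), key=lambda i: confidences[i], reverse=True)
    points: List[Tuple[float, float]] = []
    running = 0.0
    for k, i in enumerate(order, start=1):
        running += scores[i]
        points.append((k / len(order), running / k))
    return points


if __name__ == "__main__":
    # Toy values, invented for illustration (not results from the paper).
    conf = [0.9, 0.8, 0.6, 0.3]
    em = [1, 1, 0, 0]
    print(expected_calibration_error(conf, em))
    print(coverage_curve(conf, em))
```

A curve that stays high as coverage grows indicates that the confidence scores rank correct answers ahead of incorrect ones, which is the property an abstaining system relies on.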

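Original language: English
Pages (from-to): 14298-14315
Number of pages: 18
Journal: Applied Intelligence
Volume: 53
Issue number: 11
DOI: https://doi.org/10.1007/s10489-022-04235-3
Publication status: Published - Jun 2023
Externally published: Yes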

Cite this

Wu, Z., & Xu, H. (2023). Trustworthy machine reading comprehension with conditional adversarial calibration. Applied Intelligence, 53(11), 14298-14315. https://doi.org/10.1007/s10489-022-04235-3