Abstract
Open-domain automatic dialogue evaluation plays an important role in dialogue systems. While recent efforts focus on making learning-based evaluation metrics correlate better with human evaluation, robust metrics for parallel corpora and multiple domains remain unexplored. Parallel corpora refer to corpora that express the same idea in different ways (e.g., translation, paraphrasing, and back-translation). In this paper, we propose the Parallel Corpora Alignment Framework (PCAF), which improves the consistency and robustness of model evaluation on parallel corpora. First, parallel corpora are aligned in semantic space through parallel-corpora-aligned contrastive learning. Then, parallel-corpora-aligned distillation on multiple datasets is applied to further improve the model’s generalization ability across data domains. Our approach ranks second on the final test data of DSTC11 track 4 sub-task 1 ("Multilingual Automatic Evaluation Metrics", turn-level) and third on sub-task 2 ("Robust Automatic Evaluation Metrics", turn-level), which demonstrates the strong generalization ability and robustness of our proposed approach.
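The abstract does not include code, but the first stage it describes is a contrastive alignment of parallel utterances in semantic space. The sketch below is a minimal, hypothetical illustration of such an objective, assuming a standard InfoNCE-style loss: each utterance embedding is pulled toward the embedding of its parallel variant (paraphrase, translation, or back-translation), with the other pairs in the batch serving as negatives. The function name, temperature value, and embedding size are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of parallel-corpora-aligned contrastive learning
# (a symmetric InfoNCE objective over paired utterance embeddings).
import torch
import torch.nn.functional as F

def parallel_alignment_loss(orig_emb: torch.Tensor,
                            para_emb: torch.Tensor,
                            temperature: float = 0.05) -> torch.Tensor:
    """Align embeddings of utterances with their parallel variants.

    orig_emb, para_emb: (batch, dim) encoder outputs, where row i of
    para_emb is the parallel variant of row i of orig_emb (positive pair);
    all other rows in the batch act as in-batch negatives.
    """
    orig = F.normalize(orig_emb, dim=-1)
    para = F.normalize(para_emb, dim=-1)
    logits = orig @ para.t() / temperature          # (batch, batch) similarities
    labels = torch.arange(orig.size(0), device=orig.device)
    # Symmetric cross-entropy: each utterance must retrieve its parallel twin,
    # and vice versa.
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.t(), labels)) / 2

# Example: align a batch of 8 utterance pairs with 768-dim embeddings.
loss = parallel_alignment_loss(torch.randn(8, 768), torch.randn(8, 768))
```

The second stage (parallel-corpora-aligned distillation across multiple datasets) would analogously match a student metric's predictions to a teacher's across parallel variants of the same dialogue, but its concrete form is specific to the paper and is not sketched here.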
| Original language | English |
| --- | --- |
| Pages | 123-132 |
| Number of pages | 10 |
| Publication status | Published - 2023 |
| Event | 11th Dialog System Technology Challenge, DSTC 2023, Prague, Czech Republic. Duration: 11 Sept 2023 → … |
Conference
| Conference | 11th Dialog System Technology Challenge, DSTC 2023 |
| --- | --- |
| Country/Territory | Czech Republic |
| City | Prague |
| Period | 11/09/23 → … |