TY - GEN
T1 - Better Correlation and Robustness
T2 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
AU - Yuan, Peiwen
AU - Wang, Xinglin
AU - Shi, Jiayi
AU - Sun, Bin
AU - Li, Yiwei
AU - Li, Kan
N1 - Publisher Copyright:
© 2023 Neural information processing systems foundation. All rights reserved.
PY - 2023
Y1 - 2023
N2 - Turn-level dialogue evaluation models (TDEMs), using self-supervised learning (SSL) framework, have achieved state-of-the-art performance in open-domain dialogue evaluation. However, these models inevitably face two potential problems. First, they have low correlations with humans on medium coherence samples as the SSL framework often brings training data with unbalanced coherence distribution. Second, the SSL framework leads TDEM to nonuniform score distribution. There is a danger that the nonuniform score distribution will weaken the robustness of TDEM through our theoretical analysis. To tackle these problems, we propose Better Correlation and Robustness (BCR), a distribution-balanced self-supervised learning framework for TDEM. Given a dialogue dataset, BCR offers an effective training set reconstructing method to provide coherence-balanced training signals and further facilitate balanced evaluating abilities of TDEM. To get a uniform score distribution, a novel loss function is proposed, which can adjust adaptively according to the uniformity of score distribution estimated by kernel density estimation. Comprehensive experiments on 17 benchmark datasets show that vanilla BERT-base using BCR outperforms SOTA methods significantly by 11.3% on average. BCR also demonstrates strong generalization ability as it can lead multiple SOTA methods to attain better correlation and robustness. Code and datasets: https://github.com/ypw0102/Better-Correlation-and-Robustness.
AB - Turn-level dialogue evaluation models (TDEMs), using self-supervised learning (SSL) framework, have achieved state-of-the-art performance in open-domain dialogue evaluation. However, these models inevitably face two potential problems. First, they have low correlations with humans on medium coherence samples as the SSL framework often brings training data with unbalanced coherence distribution. Second, the SSL framework leads TDEM to nonuniform score distribution. There is a danger that the nonuniform score distribution will weaken the robustness of TDEM through our theoretical analysis. To tackle these problems, we propose Better Correlation and Robustness (BCR), a distribution-balanced self-supervised learning framework for TDEM. Given a dialogue dataset, BCR offers an effective training set reconstructing method to provide coherence-balanced training signals and further facilitate balanced evaluating abilities of TDEM. To get a uniform score distribution, a novel loss function is proposed, which can adjust adaptively according to the uniformity of score distribution estimated by kernel density estimation. Comprehensive experiments on 17 benchmark datasets show that vanilla BERT-base using BCR outperforms SOTA methods significantly by 11.3% on average. BCR also demonstrates strong generalization ability as it can lead multiple SOTA methods to attain better correlation and robustness. Code and datasets: https://github.com/ypw0102/Better-Correlation-and-Robustness.
UR - http://www.scopus.com/inward/record.url?scp=85183291176&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85183291176
T3 - Advances in Neural Information Processing Systems
BT - Advances in Neural Information Processing Systems 36 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
A2 - Oh, A.
A2 - Neumann, T.
A2 - Globerson, A.
A2 - Saenko, K.
A2 - Hardt, M.
A2 - Levine, S.
PB - Neural information processing systems foundation
Y2 - 10 December 2023 through 16 December 2023
ER -