TY - GEN
T1 - Towards Faithful Dialogs via Focus Learning
AU - Deng, Yifan
AU - Zhang, Xingsheng
AU - Huang, Heyan
AU - Hu, Yue
N1 - Publisher Copyright:
© 2023 Association for Computational Linguistics.
PY - 2023
Y1 - 2023
N2 - Maintaining faithfulness between responses and knowledge is an important research topic for building reliable knowledge-grounded dialogue systems. Existing models rely heavily on elaborate data engineering and larger model parameters, while neglecting to track the tokens that most influence the loss, which is decisive for the model's optimization direction in each iteration. To address this issue, we propose Focus Learning (FocusL), a novel learning approach that adjusts the contribution of each token to the optimization direction by directly scaling its objective loss. Specifically, we first introduce a positioning method that uses relevance distributions between the knowledge and each response token to locate knowledge-aware tokens. We then design a relevance-to-weight transformation that provides dynamic token-level weights for adjusting the cross-entropy loss. Finally, we use the weighted loss to encourage the model to pay special attention to knowledge utilization. Experimental results demonstrate that our method achieves new state-of-the-art results and generates more reliable responses while maintaining training stability.
AB - Maintaining faithfulness between responses and knowledge is an important research topic for building reliable knowledge-grounded dialogue systems. Existing models rely heavily on elaborate data engineering and larger model parameters, while neglecting to track the tokens that most influence the loss, which is decisive for the model's optimization direction in each iteration. To address this issue, we propose Focus Learning (FocusL), a novel learning approach that adjusts the contribution of each token to the optimization direction by directly scaling its objective loss. Specifically, we first introduce a positioning method that uses relevance distributions between the knowledge and each response token to locate knowledge-aware tokens. We then design a relevance-to-weight transformation that provides dynamic token-level weights for adjusting the cross-entropy loss. Finally, we use the weighted loss to encourage the model to pay special attention to knowledge utilization. Experimental results demonstrate that our method achieves new state-of-the-art results and generates more reliable responses while maintaining training stability.
UR - http://www.scopus.com/inward/record.url?scp=85174388434&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85174388434
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 4554
EP - 4566
BT - Long Papers
PB - Association for Computational Linguistics (ACL)
T2 - 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023
Y2 - 9 July 2023 through 14 July 2023
ER -