TY - JOUR
T1 - Boosting question answering over knowledge graph with reward integration and policy evaluation under weak supervision
AU - Bi, Xin
AU - Nie, Haojie
AU - Zhang, Guoliang
AU - Hu, Lei
AU - Ma, Yuliang
AU - Zhao, Xiangguo
AU - Yuan, Ye
AU - Wang, Guoren
N1 - Publisher Copyright:
© 2022 Elsevier Ltd
PY - 2023/3
Y1 - 2023/3
N2 - Among existing knowledge graph-based question answering (KGQA) methods, relation supervision methods require labeled intermediate relations for stepwise reasoning. To avoid this enormous labeling cost on large-scale knowledge graphs, weak supervision methods, which use only the answer entity to evaluate rewards as supervision, have been introduced. However, the lack of intermediate supervision raises the issue of sparse rewards, which may result in two types of incorrect reasoning paths: (1) incorrectly reasoned relations, even if the final answer entity is correct; and (2) correctly reasoned relations in the wrong order, which leads to an incorrect answer entity. To address these issues, this paper formulates the multi-hop KGQA task as a Markov decision process and proposes a model based on Reward Integration and Policy Evaluation (RIPE). In this model, an integrated reward function is designed to evaluate the reasoning process by leveraging both terminal and instant rewards. The intermediate supervision for each reasoning hop is constructed with regard to both the fitness of the action taken and the unreasoned information remaining in the updated question embeddings. In addition, to guide the agent to the answer entity along the correct reasoning path, an evaluation network is designed to assess the action taken at each hop. Extensive ablation studies and comparative experiments are conducted on four KGQA benchmark datasets. The results demonstrate that the proposed model outperforms state-of-the-art approaches in terms of answering accuracy.
AB - Among existing knowledge graph-based question answering (KGQA) methods, relation supervision methods require labeled intermediate relations for stepwise reasoning. To avoid this enormous labeling cost on large-scale knowledge graphs, weak supervision methods, which use only the answer entity to evaluate rewards as supervision, have been introduced. However, the lack of intermediate supervision raises the issue of sparse rewards, which may result in two types of incorrect reasoning paths: (1) incorrectly reasoned relations, even if the final answer entity is correct; and (2) correctly reasoned relations in the wrong order, which leads to an incorrect answer entity. To address these issues, this paper formulates the multi-hop KGQA task as a Markov decision process and proposes a model based on Reward Integration and Policy Evaluation (RIPE). In this model, an integrated reward function is designed to evaluate the reasoning process by leveraging both terminal and instant rewards. The intermediate supervision for each reasoning hop is constructed with regard to both the fitness of the action taken and the unreasoned information remaining in the updated question embeddings. In addition, to guide the agent to the answer entity along the correct reasoning path, an evaluation network is designed to assess the action taken at each hop. Extensive ablation studies and comparative experiments are conducted on four KGQA benchmark datasets. The results demonstrate that the proposed model outperforms state-of-the-art approaches in terms of answering accuracy.
KW - Augmented intelligence for decision-making
KW - Knowledge graph-based question answering
KW - Multi-hop reasoning
KW - Weak supervision
UR - http://www.scopus.com/inward/record.url?scp=85145661976&partnerID=8YFLogxK
U2 - 10.1016/j.ipm.2022.103242
DO - 10.1016/j.ipm.2022.103242
M3 - Article
AN - SCOPUS:85145661976
SN - 0306-4573
VL - 60
JO - Information Processing and Management
JF - Information Processing and Management
IS - 2
M1 - 103242
ER -
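
Note: the abstract describes an integrated reward that combines terminal and instant rewards to supervise each reasoning hop. The LaTeX sketch below is illustrative only; the mixing weight lambda, the similarity-based instant reward, and the symbol names are assumptions for exposition, not the exact formulation of RIPE (see DOI 10.1016/j.ipm.2022.103242 for the definitions used in the paper).

% Illustrative sketch (assumption, not the paper's exact formulas):
% an integrated reward for hop t that mixes a terminal reward R_T
% (answer-entity match at the final hop T) with an instant reward r_t
% scoring the chosen action a_t against the unreasoned part of the
% updated question embedding q_t.
\begin{align}
  R_{\mathrm{int}}(s_t, a_t) &= \lambda\, R_T(s_T) + (1 - \lambda)\, r_t(s_t, a_t),
      \qquad \lambda \in [0, 1] \\
  r_t(s_t, a_t) &= \mathrm{sim}\!\left(\mathbf{a}_t, \mathbf{q}_t\right)
\end{align}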