TY - JOUR
T1 - Boosting question answering over knowledge graph with reward integration and policy evaluation under weak supervision
AU - Bi, Xin
AU - Nie, Haojie
AU - Zhang, Guoliang
AU - Hu, Lei
AU - Ma, Yuliang
AU - Zhao, Xiangguo
AU - Yuan, Ye
AU - Wang, Guoren
N1 - Publisher Copyright:
© 2022 Elsevier Ltd
PY - 2023/3
Y1 - 2023/3
N2 - Among existing knowledge graph-based question answering (KGQA) methods, relation supervision methods require labeled intermediate relations for stepwise reasoning. To avoid this enormous labeling cost on large-scale knowledge graphs, weak supervision methods, which use only the answer entity to evaluate rewards as supervision, have been introduced. However, the lack of intermediate supervision raises the issue of sparse rewards, which may result in two types of incorrect reasoning paths: (1) incorrectly reasoned relations, even if the final answer entity is correct; and (2) correctly reasoned relations in the wrong order, which leads to an incorrect answer entity. To address these issues, this paper formulates the multi-hop KGQA task as a Markov decision process and proposes a model based on Reward Integration and Policy Evaluation (RIPE). In this model, an integrated reward function is designed to evaluate the reasoning process by leveraging both terminal and instant rewards. The intermediate supervision for each reasoning hop is constructed with regard to both the fitness of the action taken and the unreasoned information remaining in the updated question embeddings. In addition, to guide the agent to the answer entity along the correct reasoning path, an evaluation network is designed to assess the action taken at each hop. Extensive ablation studies and comparative experiments are conducted on four KGQA benchmark datasets. The results demonstrate that the proposed model outperforms state-of-the-art approaches in terms of answering accuracy.
AB - Among existing knowledge graph-based question answering (KGQA) methods, relation supervision methods require labeled intermediate relations for stepwise reasoning. To avoid this enormous labeling cost on large-scale knowledge graphs, weak supervision methods, which use only the answer entity to evaluate rewards as supervision, have been introduced. However, the lack of intermediate supervision raises the issue of sparse rewards, which may result in two types of incorrect reasoning paths: (1) incorrectly reasoned relations, even if the final answer entity is correct; and (2) correctly reasoned relations in the wrong order, which leads to an incorrect answer entity. To address these issues, this paper formulates the multi-hop KGQA task as a Markov decision process and proposes a model based on Reward Integration and Policy Evaluation (RIPE). In this model, an integrated reward function is designed to evaluate the reasoning process by leveraging both terminal and instant rewards. The intermediate supervision for each reasoning hop is constructed with regard to both the fitness of the action taken and the unreasoned information remaining in the updated question embeddings. In addition, to guide the agent to the answer entity along the correct reasoning path, an evaluation network is designed to assess the action taken at each hop. Extensive ablation studies and comparative experiments are conducted on four KGQA benchmark datasets. The results demonstrate that the proposed model outperforms state-of-the-art approaches in terms of answering accuracy.
KW - Augmented intelligence for decision-making
KW - Knowledge graph-based question answering
KW - Multi-hop reasoning
KW - Weak supervision
UR - http://www.scopus.com/inward/record.url?scp=85145661976&partnerID=8YFLogxK
U2 - 10.1016/j.ipm.2022.103242
DO - 10.1016/j.ipm.2022.103242
M3 - Article
AN - SCOPUS:85145661976
SN - 0306-4573
VL - 60
JO - Information Processing and Management
JF - Information Processing and Management
IS - 2
M1 - 103242
ER -
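
Note: the abstract describes an integrated reward that combines terminal and instant rewards to supervise each reasoning hop. The LaTeX sketch below is illustrative only; the mixing weight lambda, the similarity-based instant reward, and the symbol names are assumptions for exposition, not the exact formulation of RIPE (see DOI 10.1016/j.ipm.2022.103242 for the definitions used in the paper).

% Illustrative sketch (assumption, not the paper's exact formulas):
% an integrated reward for hop t that mixes a terminal reward R_T
% (answer-entity match at the final hop T) with an instant reward r_t
% scoring the chosen action a_t against the unreasoned part of the
% updated question embedding q_t.
\begin{align}
  R_{\mathrm{int}}(s_t, a_t) &= \lambda\, R_T(s_T) + (1 - \lambda)\, r_t(s_t, a_t),
      \qquad \lambda \in [0, 1] \\
  r_t(s_t, a_t) &= \mathrm{sim}\!\left(\mathbf{a}_t, \mathbf{q}_t\right)
\end{align}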