TY - GEN
T1 - Off-Policy Differentiable Logic Reinforcement Learning
AU - Zhang, Li
AU - Li, Xin
AU - Wang, Mingzhong
AU - Tian, Andong
N1 - Publisher Copyright:
© 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - In this paper, we propose an Off-Policy Differentiable Logic Reinforcement Learning (OPDLRL) framework that inherits the interpretability and generalization ability of Differentiable Inductive Logic Programming (DILP) while resolving its weaknesses in execution efficiency, stability, and scalability. The key contributions include the use of approximate inference to significantly reduce the number of logic rules in the deduction process, an off-policy training method to enable approximate inference, and a distributed and hierarchical training framework. Extensive experiments, specifically playing real-time video games in Rabbids against human players, show that OPDLRL achieves better or similar performance compared with other DILP-based methods while being far more practical in terms of sample efficiency and execution efficiency, making it applicable to complex and (near) real-time domains.
KW - Deep reinforcement learning
KW - Interpretable reinforcement learning
KW - Neural-Symbolic AI
UR - http://www.scopus.com/inward/record.url?scp=85115693688&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-86520-7_38
DO - 10.1007/978-3-030-86520-7_38
M3 - Conference contribution
AN - SCOPUS:85115693688
SN - 9783030865191
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 617
EP - 632
BT - Machine Learning and Knowledge Discovery in Databases. Research Track - European Conference, ECML PKDD 2021, Proceedings
A2 - Oliver, Nuria
A2 - Pérez-Cruz, Fernando
A2 - Kramer, Stefan
A2 - Read, Jesse
A2 - Lozano, Jose A.
PB - Springer Science and Business Media Deutschland GmbH
T2 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2021
Y2 - 13 September 2021 through 17 September 2021
ER -