Conference paper
Off-Policy Differentiable Logic Reinforcement Learning
Machine Learning and Knowledge Discovery in Databases. Research Track, pp.1-16
European Conference on Machine Learning and Knowledge Discovery in Databases, 2021 (Virtual, 13-Sep-2021 - 17-Sep-2021)
Lecture Notes in Computer Science, 12976, Springer
2021
Abstract
In this paper, we propose an Off-Policy Differentiable Logic Reinforcement Learning (OPDLRL) framework that inherits the interpretability and generalization ability of Differentiable Inductive Logic Programming (DILP) while resolving its weaknesses in execution efficiency, stability, and scalability. The key contributions include the use of approximate inference to significantly reduce the number of logic rules in the deduction process, an off-policy training method to enable approximate inference, and a distributed and hierarchical training framework. Extensive experiments, specifically playing real-time video games in Rabbids against human players, show that OPDLRL performs as well as or better than other DILP-based methods while being far more practical in terms of sample efficiency and execution efficiency, making it applicable to complex and (near) real-time domains.
Details
- Title
- Off-Policy Differentiable Logic Reinforcement Learning
- Authors
- Li Zhang (Author) - Beijing Institute of Technology
- Xin Li (Corresponding Author) - Beijing Institute of Technology
- Mingzhong Wang (Author) - University of the Sunshine Coast, Queensland, School of Science, Technology and Engineering
- Andong Tian (Author) - Ubisoft China AI & Data Lab
- Publication details
- Machine Learning and Knowledge Discovery in Databases. Research Track, pp.1-16
- Conference details
- European Conference on Machine Learning and Knowledge Discovery in Databases, 2021 (Virtual, 13-Sep-2021 - 17-Sep-2021)
- Series
- Lecture Notes in Computer Science; 12976
- Publisher
- Springer
- DOI
- 10.1007/978-3-030-86520-7_38; 10.1007/978-3-030-86520-7
- ISSN
- 1611-3349
- ISBN
- 9783030865207
- Organisation Unit
- University of the Sunshine Coast, Queensland; School of Science, Technology and Engineering
- Language
- English
- Record Identifier
- 99571605202621
- Output Type
- Conference paper