Journal article
Differentiable Logic Policy for Interpretable Deep Reinforcement Learning: A Study from an Optimization Perspective
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.45(10), pp.11654-11667
2023
PMID: 37310843
Abstract
The interpretability of policies remains an important challenge in Deep Reinforcement Learning (DRL). This paper explores interpretable DRL by representing the policy with Differentiable Inductive Logic Programming (DILP) and provides a theoretical and empirical study of DILP-based policy learning from an optimization perspective. We first identified a fundamental fact: DILP-based policy learning should be solved as a constrained policy optimization problem. We then proposed using Mirror Descent for policy optimization (MDPO) to deal with the constraints of DILP-based policies. We derived a closed-form regret bound for MDPO with function approximation, which is helpful for the design of DRL frameworks. Moreover, we studied the convexity of the DILP-based policy to further verify the benefits gained from MDPO. Empirically, we experimented with MDPO, its on-policy variant, and three mainstream policy learning methods, and the results verified our theoretical analysis.
Details
- Title
- Differentiable Logic Policy for Interpretable Deep Reinforcement Learning: A Study from an Optimization Perspective
- Authors
- Xin Li (Author) - Beijing Institute of Technology
- Haojie Lei (Author) - Beijing Institute of Technology
- Li Zhang (Author) - Beijing Institute of Technology
- Mingzhong Wang (Author) - University of the Sunshine Coast, Queensland, School of Science, Technology and Engineering
- Publication details
- IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.45(10), pp.11654-11667
- Publisher
- Institute of Electrical and Electronics Engineers
- DOI
- 10.1109/TPAMI.2023.3285634
- ISSN
- 1939-3539
- PMID
- 37310843
- Organisation Unit
- School of Science, Technology and Engineering
- Language
- English
- Record Identifier
- 99735697702621
- Output Type
- Journal article
Metrics
InCites Highlights
These are selected metrics from the InCites Benchmarking & Analytics tool related to this output.
- Collaboration types
- Domestic collaboration
- International collaboration
- Web Of Science research areas
- Computer Science, Artificial Intelligence
- Engineering, Electrical & Electronic