Differentiable Logic Policy for Interpretable Deep Reinforcement Learning: A Study from an Optimization Perspective

Xin Li; Haojie Lei; Li Zhang; Mingzhong Wang

doi:10.1109/TPAMI.2023.3285634

Journal article

Peer reviewed

IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.45(10), pp.11654-11667

2023

PMID: 37310843

Files and links (1)

Published Version: https://doi.org/10.1109/TPAMI.2023.3285634

Keywords: deep reinforcement learning; policy optimisation; interpretable reinforcement learning; machine learning

Abstract

The interpretability of policies remains an important challenge in Deep Reinforcement Learning (DRL). This paper explores interpretable DRL by representing the policy with Differentiable Inductive Logic Programming (DILP) and provides a theoretical and empirical study of DILP-based policy learning from an optimization perspective. We first identified the fundamental fact that DILP-based policy learning should be solved as a constrained policy optimization problem. We then proposed using Mirror Descent for policy optimization (MDPO) to handle the constraints of DILP-based policies. We derived a closed-form regret bound for MDPO with function approximation, which is helpful for the design of DRL frameworks. Moreover, we studied the convexity of DILP-based policies to further verify the benefits gained from MDPO. Empirically, we experimented with MDPO, its on-policy variant, and three mainstream policy learning methods, and the results verified our theoretical analysis.
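Illustrative note (not taken from the paper): the abstract's point that mirror descent suits constrained policies can be seen in the textbook single-state case. With a KL-divergence proximity term, maximizing the expected advantage under the new policy has the closed-form solution "new probability proportional to old probability times exp(step size times advantage)", so the result stays on the probability simplex without any projection. The sketch below assumes a tabular policy over a handful of actions with strictly positive probabilities; the function name, the advantage values, and the step size are hypothetical and chosen only for illustration. It is not the paper's MDPO algorithm, which works with function approximation and DILP-based policies.

```python
import math


def mirror_descent_step(policy, advantages, step_size):
    """One KL-regularized mirror descent step for a single-state policy.

    Solves  max_p  <p, advantages> - (1 / step_size) * KL(p || policy)
    over the probability simplex, whose closed form is
    p_i proportional to policy_i * exp(step_size * advantages_i).
    Assumes every entry of `policy` is strictly positive (hypothetical setup).
    """
    # Work in log space and subtract the maximum for numerical stability.
    logits = [math.log(p) + step_size * a for p, a in zip(policy, advantages)]
    max_logit = max(logits)
    weights = [math.exp(l - max_logit) for l in logits]
    total = sum(weights)
    # Normalizing keeps the update on the simplex, so no projection is needed.
    return [w / total for w in weights]


# Hypothetical example: four actions, uniform starting policy.
policy = [0.25, 0.25, 0.25, 0.25]
advantages = [1.0, 0.0, -1.0, 0.0]
new_policy = mirror_descent_step(policy, advantages, step_size=0.5)
print(new_policy)
```

The updated probabilities still sum to one and remain positive, and actions with higher advantage gain probability mass. A plain gradient step on the probabilities would offer neither guarantee without an extra projection.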

Details

Title: Differentiable Logic Policy for Interpretable Deep Reinforcement Learning: A Study from an Optimization Perspective
Authors: Xin Li (Author) - Beijing Institute of Technology
Haojie Lei (Author) - Beijing Institute of Technology
Li Zhang (Author) - Beijing Institute of Technology
Mingzhong Wang (Author) - University of the Sunshine Coast, Queensland, School of Science, Technology and Engineering
Publication details: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.45(10), pp.11654-11667
Publisher: Institute of Electrical and Electronics Engineers
DOI: 10.1109/TPAMI.2023.3285634
ISSN: 1939-3539
PMID: 37310843
Organisation Unit: School of Science, Technology and Engineering
Language: English
Record Identifier: 99735697702621
Output Type: Journal article

InCites Highlights

These are selected metrics from the InCites Benchmarking & Analytics tool, related to this output.

Collaboration types: Domestic collaboration; International collaboration
Web of Science research areas: Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic

Source: InCites