Explainable Reinforcement Learning for Adaptive Autonomous Agents: A Framework for Interpretable Decision-Making in Dynamic Environments
Keywords:
Explainable Reinforcement Learning, Autonomous Agents, Counterfactual Explanations, Attention Mechanisms, Interpretability, Trustworthy AI
Abstract
This study proposes a novel Explainable Reinforcement Learning (XRL) framework that integrates attention mechanisms and counterfactual explanations to make the decisions of adaptive autonomous agents more interpretable. The framework addresses transparency and trust challenges in dynamic, uncertain environments, including robotics and multi-agent decision systems. Empirical evaluations on MuJoCo and Atari benchmarks demonstrate a 25% improvement in human trust scores, with performance metrics comparable to state-of-the-art black-box models. These findings contribute to the development of trustworthy, explainable AI and establish a foundation for scalable, human-centered autonomous decision-making.
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.