Backtracking Restarts for Deep Reinforcement Learning

Zaid Khalil Marji; John Licato

doi:10.32473/flairs.v34i1.128557

Authors

Zaid Khalil Marji, University of South Florida
John Licato

DOI:

https://doi.org/10.32473/flairs.v34i1.128557

Keywords:

Deep Reinforcement Learning, Backtracking Restarts, Restart Distributions

Abstract

Manipulating the starting states of a Markov Decision Process to accelerate the learning of a deep reinforcement learning agent is an idea that has been proposed in several forms in the literature. Examples include starting from random states to improve exploration, taking random walks from desired goal states, and using performance-based metrics as a policy for selecting starting states. In this paper, we explore the idea of using states from the trajectories the RL agent generates during training as starting states. The main intuition behind this proposal is to focus training on the agent's current weaknesses: by resetting the environment to a state from the agent's recent past, the agent gets to practice overcoming the failure states it has just encountered. We call the idea of starting a fixed (or variable) number of steps back from recent terminal or failure states "backtracking restarts". Our empirical findings show that this modification yields tangible speedups in the learning process.
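The restart mechanism described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the environment can be reset to an arbitrary stored state and that a fixed backtracking distance is used. The class `BacktrackingRestarts` and its parameters `k`, `capacity`, and `restart_prob` are hypothetical names introduced only for this example.

```python
import random
from collections import deque


class BacktrackingRestarts:
    """Hypothetical helper: remembers the state k steps before each recent failure."""

    def __init__(self, k=3, capacity=100, restart_prob=0.5, seed=None):
        self.k = k                          # assumed fixed number of steps to back up
        self.restart_prob = restart_prob    # assumed chance of using a stored state
        self.restart_states = deque(maxlen=capacity)  # keeps only recent failures
        self.rng = random.Random(seed)

    def record_failure(self, trajectory):
        """Store a restart state from an episode; the terminal state comes last."""
        if not trajectory:
            return
        # Step back k states from the terminal state, clamping at the episode start.
        index = max(0, len(trajectory) - 1 - self.k)
        self.restart_states.append(trajectory[index])

    def sample_start(self, default_start):
        """Return a backtracked state with probability restart_prob, else the default."""
        if self.restart_states and self.rng.random() < self.restart_prob:
            return self.rng.choice(list(self.restart_states))
        return default_start


# Toy usage: a 1-D walk where reaching position 0 counts as a failure.
restarts = BacktrackingRestarts(k=2, restart_prob=1.0, seed=0)
trajectory = [5, 4, 3, 2, 1, 0]          # episode that ends in the failure state
restarts.record_failure(trajectory)
start = restarts.sample_start(default_start=5)  # state two steps before the failure
```

In a training loop, `sample_start` would be called before each episode and its result passed to whatever state-setting reset the environment provides. Mixing in the default start through `restart_prob` is a design choice of this sketch, meant to keep the agent practicing from the original start distribution as well.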
