Backtracking Restarts for Deep Reinforcement Learning

Authors

  • Zaid Khalil Marji, University of South Florida
  • John Licato

DOI:

https://doi.org/10.32473/flairs.v34i1.128557

Keywords:

Deep Reinforcement Learning, Backtracking Restarts, Restart Distributions

Abstract

Manipulating the starting states of a Markov Decision Process to accelerate the learning of a deep reinforcement learning agent is an idea that has been proposed in several forms in the literature. Examples include starting from random states to improve exploration, taking random walks backward from desired goal states, and using performance-based metrics to guide the selection of starting states. In this paper, we explore the idea of exploiting the trajectories the RL agent generates during training as a source of starting states. The main intuition behind this proposal is to focus training on the agent's current weaknesses: by resetting the environment to a state from its recent past, the agent repeatedly practices overcoming the states that led to failure. We call the idea of starting from a fixed (or variable) number of steps back from recent terminal or failure states `backtracking restarts'. Our empirical findings show that this modification yields tangible speedups in the learning process.
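The abstract describes the mechanism only at a high level. The following is a minimal sketch of how backtracking restarts could be wired into a training loop, not the authors' implementation; it assumes a hypothetical environment that exposes get_state(), set_state(), and observe() methods for snapshotting and restoring simulator state, and the names k, restart_prob, and backtrack_states are illustrative choices.

```python
# Minimal sketch of backtracking restarts (assumed interface, not the paper's code).
# Assumes env.get_state()/env.set_state()/env.observe() exist, e.g. a simulator
# whose full internal state can be snapshotted and restored.
import random
from collections import deque


def run_with_backtracking_restarts(env, policy, num_episodes=100, k=10,
                                   restart_prob=0.5, max_steps=1000):
    """Run episodes while sometimes restarting k steps before a recent failure."""
    backtrack_states = deque(maxlen=100)  # states k steps before recent terminal states

    for _ in range(num_episodes):
        obs = env.reset()
        # With some probability, restart from a state shortly before a past failure.
        if backtrack_states and random.random() < restart_prob:
            env.set_state(random.choice(backtrack_states))  # assumed API
            obs = env.observe()                             # assumed API

        trajectory = []  # snapshots of the current episode's states
        for _ in range(max_steps):
            trajectory.append(env.get_state())              # assumed API
            action = policy(obs)
            obs, reward, done, info = env.step(action)
            if done:
                # Remember the state k steps before this terminal/failure state
                # so future episodes can practice from it.
                if len(trajectory) > k:
                    backtrack_states.append(trajectory[-k])
                break
```

In this sketch the backtracking depth k is fixed; the paper's formulation also allows a variable number of steps back from the failure state.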

Downloads

Published

2021-04-18

How to Cite

Marji, Z. K., & Licato, J. (2021). Backtracking Restarts for Deep Reinforcement Learning. The International FLAIRS Conference Proceedings, 34. https://doi.org/10.32473/flairs.v34i1.128557

Issue

Section

Main Track Proceedings