Model-Free Reinforcement Learning with Continuous Action in Practice - INRIA - Institut National de Recherche en Informatique et en Automatique
Conference paper. Year: 2012

Model-Free Reinforcement Learning with Continuous Action in Practice

Thomas Degris
  • Role: Author
  • PersonId: 934007
Patrick M. Pilarski
  • Role: Author
Richard S. Sutton
  • Role: Author

Abstract

Reinforcement learning methods are often considered a potential solution for enabling a robot to adapt in real time to changes in an unpredictable environment. However, with continuous actions, only a few existing algorithms are practical for real-time learning. In such a setting, the most effective methods have used a parameterized policy structure, often with a separate parameterized value function. The goal of this paper is to assess such actor-critic methods to form a fully specified practical algorithm. Our specific contributions include 1) developing the extension of existing incremental policy-gradient algorithms to use eligibility traces, 2) an empirical comparison of the resulting algorithms using continuous actions, and 3) the evaluation of a gradient-scaling technique that can significantly improve performance. Finally, we apply our actor-critic algorithm to learn on a robotic platform with a fast sensorimotor cycle (10 ms). Overall, these results constitute an important step towards practical real-time learning control with continuous action.
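
To make the ingredients named in the abstract concrete, the following is a minimal sketch of an incremental actor-critic update with eligibility traces and a Gaussian policy over a continuous action. It is not the paper's algorithm. The toy environment `step_env`, the features, the step sizes, the clipping of the log standard deviation, and the `scale_by_variance` option (multiplying the policy gradient by the variance, one simple form of gradient scaling) are all assumptions made for illustration.

```python
import math
import random


def dot(w, phi):
    return sum(wi * pi for wi, pi in zip(w, phi))


def features(x):
    # Hypothetical linear features: a bias term and the raw state.
    return [1.0, x]


def step_env(x, action):
    # Toy dynamics (assumption): the action pushes the state, which is kept
    # in [-1, 1]; the reward is higher the closer the next state is to 0.
    x_next = max(-1.0, min(1.0, x + 0.1 * action))
    return x_next, -x_next * x_next


def gaussian_log_grads(action, mu, sigma, phi):
    # Gradients of log N(action; mu, sigma) with mu = u_mu . phi and
    # sigma = exp(u_sigma . phi), taken w.r.t. u_mu and u_sigma.
    z = (action - mu) / sigma
    grad_mu = [z / sigma * p for p in phi]
    grad_sigma = [(z * z - 1.0) * p for p in phi]
    return grad_mu, grad_sigma


def decay_and_add(trace, grad, decay):
    # Accumulating eligibility trace: e <- decay * e + grad.
    return [decay * e + g for e, g in zip(trace, grad)]


def run(steps=2000, gamma=0.9, lam=0.7, alpha_v=0.05, alpha_u=0.01,
        scale_by_variance=True, seed=0):
    rng = random.Random(seed)
    v, u_mu, u_sigma = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]
    e_v, e_mu, e_sigma = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]
    x = rng.uniform(-1.0, 1.0)
    for _ in range(steps):
        phi = features(x)
        mu = dot(u_mu, phi)
        # Clip the log standard deviation for numerical safety (assumption).
        sigma = math.exp(max(-3.0, min(1.0, dot(u_sigma, phi))))
        action = rng.gauss(mu, sigma)
        x_next, reward = step_env(x, action)

        # Critic: TD error and TD(lambda) update of the value weights.
        delta = reward + gamma * dot(v, features(x_next)) - dot(v, phi)
        e_v = decay_and_add(e_v, phi, gamma * lam)
        v = [vi + alpha_v * delta * ei for vi, ei in zip(v, e_v)]

        # Actor: policy-gradient update with its own eligibility traces.
        g_mu, g_sigma = gaussian_log_grads(action, mu, sigma, phi)
        if scale_by_variance:
            g_mu = [g * sigma ** 2 for g in g_mu]
            g_sigma = [g * sigma ** 2 for g in g_sigma]
        e_mu = decay_and_add(e_mu, g_mu, gamma * lam)
        e_sigma = decay_and_add(e_sigma, g_sigma, gamma * lam)
        u_mu = [u + alpha_u * delta * e for u, e in zip(u_mu, e_mu)]
        u_sigma = [u + alpha_u * delta * e for u, e in zip(u_sigma, e_sigma)]
        x = x_next
    return v, u_mu, u_sigma


if __name__ == "__main__":
    v, u_mu, u_sigma = run()
    print("value weights:", v)
    print("policy mean weights:", u_mu)
    print("policy log-std weights:", u_sigma)
```

Each update touches every weight once per time step, so the per-step cost is linear in the number of features, which is the property that makes this family of methods a candidate for short control cycles. Setting `scale_by_variance=False` gives the unscaled update for comparison.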
Main file
DegrisACC2012.pdf (772.59 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00764281 , version 1 (12-12-2012)

Identifiers

  • HAL Id : hal-00764281 , version 1

Cite

Thomas Degris, Patrick M. Pilarski, Richard S. Sutton. Model-Free Reinforcement Learning with Continuous Action in Practice. American Control Conference, Jun 2012, Montreal, Canada. ⟨hal-00764281⟩
304 views
5685 downloads
