Toward a data efficient neural actor-critic - INRIA - Institut National de Recherche en Informatique et en Automatique
Conference paper, Year: 2016

Toward a data efficient neural actor-critic

Matthieu Zimmer
Yann Boniface
Alain Dutech

Abstract

A new off-policy, offline, model-free actor-critic reinforcement learning algorithm for environments with continuous states and actions is presented. It addresses discrete-time problems where the goal is to maximize the discounted sum of rewards using stationary policies. Our algorithm allows a trade-off between data efficiency and scalability. The amount of a priori knowledge is kept low by (1) using neural networks to learn both the critic and the actor, (2) not relying on initial trajectories provided by an expert, and (3) not depending on known goal states. Experimental results show better data efficiency than four state-of-the-art algorithms on two benchmark environments.
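
The abstract describes the algorithm only at a high level. As a rough illustration of what a neural actor-critic for continuous states and actions can look like, here is a minimal Python/PyTorch sketch of a generic update step. It uses an assumed DDPG-style deterministic-policy update; the network sizes, hyperparameters, and update rule are illustrative assumptions and are not the algorithm proposed in the paper.

# Minimal, illustrative sketch of a generic neural actor-critic for
# continuous states and actions (discounted return, stationary policy).
# NOT the paper's algorithm: the DDPG-style deterministic-policy update,
# network sizes, and hyperparameters below are assumptions for illustration.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, GAMMA = 3, 1, 0.99  # assumed dimensions and discount factor

# Actor network: maps a state to a continuous action in [-1, 1].
actor = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.Tanh(),
                      nn.Linear(64, ACTION_DIM), nn.Tanh())
# Critic network: estimates the action-value Q(s, a).
critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.Tanh(),
                       nn.Linear(64, 1))
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def update(batch):
    """One off-policy update from a batch of stored transitions."""
    s, a, r, s2, done = batch  # states, actions, rewards, next states, terminal flags
    # Critic step: regress Q(s, a) toward the one-step bootstrapped target.
    with torch.no_grad():
        q_next = critic(torch.cat([s2, actor(s2)], dim=1)).squeeze(-1)
        target = r + GAMMA * (1 - done) * q_next
    q = critic(torch.cat([s, a], dim=1)).squeeze(-1)
    critic_loss = nn.functional.mse_loss(q, target)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()
    # Actor step: ascend the critic's estimate of Q(s, actor(s)).
    actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

# Usage with a dummy batch of 32 random transitions (placeholder data only):
B = 32
batch = (torch.randn(B, STATE_DIM), torch.rand(B, ACTION_DIM) * 2 - 1,
         torch.randn(B), torch.randn(B, STATE_DIM), torch.zeros(B))
update(batch)

Because the transitions come from a stored batch rather than the current policy, the update above is off-policy and offline in the same broad sense used in the abstract; the actual trade-off between data efficiency and scalability studied in the paper depends on details not shown here.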
Main file
ewrl13-2016-submission_7.pdf (783.25 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-01413885, version 1 (11-12-2016)

Identifiers

  • HAL Id: hal-01413885, version 1

Cite

Matthieu Zimmer, Yann Boniface, Alain Dutech. Toward a data efficient neural actor-critic. EWRL 2016 - The 13th European Workshop on Reinforcement Learning, Dec 2016, Barcelona, Spain. ⟨hal-01413885⟩
354 Views
210 Downloads
