Conference paper, Year: 2018

Training Dialogue Systems With Human Advice

Merwan Barlier
Romain Laroche

Abstract

One major drawback of Reinforcement Learning (RL) Spoken Dialogue Systems is that they inherit the general exploration requirements of RL, which makes them hard to deploy from an industry perspective. On the other hand, industrial systems rely on human expertise and hand-written rules to prevent irrelevant behavior and maintain an acceptable experience from the user's point of view. In this paper, we attempt to bridge the gap between these two worlds by providing an easy way to incorporate all kinds of human expertise into the training phase of a Reinforcement Learning Dialogue System. Our approach, based on the TAMER framework, enables safe and efficient policy learning by combining the traditional Reinforcement Learning reward signal with an additional reward encoding expert advice. Experimental results show that our method leads to substantial improvements over more traditional Reinforcement Learning methods.
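The following is a minimal, hypothetical sketch (not the authors' implementation) of the idea the abstract describes: a Q-learning update whose reward combines the environment's reward signal with an additional reward encoding expert advice, in the spirit of the TAMER framework. The function names, the weight beta, and the toy dialogue states are illustrative assumptions.

    from collections import defaultdict

    def combined_reward(env_reward, advice_reward, beta=0.5):
        # Weighted sum of the task reward and the expert-advice reward (assumed form).
        return env_reward + beta * advice_reward

    def q_learning_step(Q, state, action, next_state, env_reward, advice_reward,
                        actions, alpha=0.1, gamma=0.95, beta=0.5):
        # Standard Q-learning target, computed on the combined (shaped) reward.
        r = combined_reward(env_reward, advice_reward, beta)
        best_next = max(Q[(next_state, a)] for a in actions)
        Q[(state, action)] += alpha * (r + gamma * best_next - Q[(state, action)])

    # Toy usage: a two-action dialogue decision where an expert signals +1 approval.
    Q = defaultdict(float)
    actions = ["ask_slot", "confirm"]
    q_learning_step(Q, "greet", "ask_slot", "slot_filled",
                    env_reward=0.0, advice_reward=1.0, actions=actions)
    print(Q[("greet", "ask_slot")])  # 0.05 with the default alpha and beta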
Main file
1-4-4-AAMAS-Training-Dialogue-Systems-with-Human-Advice.pdf (693.05 KB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-01945831, version 1 (11-12-2018)

Identifiers

  • HAL Id: hal-01945831, version 1

Cite

Merwan Barlier, Romain Laroche, Olivier Pietquin. Training Dialogue Systems With Human Advice. AAMAS 2018 - the 17th International Conference on Autonomous Agents and Multiagent Systems, Jul 2018, Stockholm, Sweden. pp.9. ⟨hal-01945831⟩
