Anytime many-armed bandits - INRIA - Institut National de Recherche en Informatique et en Automatique Accéder directement au contenu
Communication Dans Un Congrès Année : 2007

Anytime many-armed bandits

Résumé

This paper introduces the many-armed bandit problem (ManAB), where the number of arms is large comparatively to the relevant number of time steps. While the ManAB framework is relevant to many real-world applications, the state of the art does not offer anytime algorithms handling ManAB problems. Both theory and practice suggest that two problem categories must be distinguished; the easy category includes those problems where good arms have reward probability close to 1; the difficult category includes other problems. Two algorithms termed FAILURE and MUCBT are proposed for the ManAB framework. FAILURE and its variants extend the non-anytime approach proposed for the denumerable-armed bandit and non-asymptotic bounds are shown; it works very efficiently for easy ManAB problems. Meanwhile, MUCBT efficiently deals with difficult ManAB problems.
Fichier principal
Vignette du fichier
mabcap2.pdf (972.9 Ko) Télécharger le fichier
Origine : Fichiers produits par l'(les) auteur(s)
Loading...

Dates et versions

inria-00173263 , version 1 (19-09-2007)

Identifiants

  • HAL Id : inria-00173263 , version 1

Citer

Olivier Teytaud, Sylvain Gelly, Michèle Sebag. Anytime many-armed bandits. CAP07, 2007, Grenoble, France. ⟨inria-00173263⟩
303 Consultations
392 Téléchargements

Partager

Gmail Facebook X LinkedIn More