On ergodic two-armed bandits

Pierre Tarrès; Pierre Vandekerkhove

doi:10.1214/10-AAP751

Journal article, The Annals of Applied Probability, Year: 2012


Pierre Tarrès

Role: Author

Institut de Mathématiques de Toulouse UMR5219

Pierre Vandekerkhove

Role: Author
PersonId : 832223
IdHAL : pierre-vandekerkhove
ORCID : 0000-0003-3907-7657
IdRef : 272731641

Laboratoire d'Analyse et de Mathématiques Appliquées

Abstract

A device has two arms with unknown deterministic payoffs, and the aim is to asymptotically identify the best one without spending too much time on the other. The Narendra algorithm offers a stochastic procedure to this end. We show, under weak ergodic assumptions on these deterministic payoffs, that the procedure eventually chooses the best arm (i.e., the one with the greatest Cesàro limit) with probability one for appropriate step sequences of the algorithm. In the case of i.i.d. payoffs, this implies a "quenched" version of the "annealed" result of Lamberton, Pagès and Tarrès [Ann. Appl. Probab. 14 (2004) 1424–1454] by the law of the iterated logarithm, thus generalizing it.
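The procedure described in the abstract can be illustrated with a minimal sketch. It assumes the usual linear reward-inaction form of the Narendra update with payoffs in {0, 1}: the probability `x` of playing arm A increases only when A is played and pays off, and decreases only when B is played and pays off. The function names, the periodic payoff sequences, and the step sequence are hypothetical choices made for illustration; in particular, the step sequence is not claimed to satisfy the conditions of the paper.

```python
import random


def narendra_bandit(payoff_a, payoff_b, steps, x0=0.5, seed=0):
    """Linear reward-inaction sketch (assumed form of the Narendra update).

    payoff_a, payoff_b: deterministic sequences of 0/1 payoffs for arms A and B.
    steps: step sizes gamma_n, each assumed to lie in (0, 1).
    Returns the trajectory of x, the probability of playing arm A.
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for a, b, gamma in zip(payoff_a, payoff_b, steps):
        if rng.random() <= x:
            # Arm A is played: move x toward 1 only if A pays off.
            x += gamma * a * (1.0 - x)
        else:
            # Arm B is played: move x toward 0 only if B pays off.
            x -= gamma * b * x
        path.append(x)
    return path


def periodic_payoffs(period, wins, n):
    """Deterministic 0/1 sequence that pays `wins` times in every `period` steps."""
    return [1 if (k % period) < wins else 0 for k in range(n)]


n = 2000
payoff_a = periodic_payoffs(10, 7, n)  # Cesàro limit 0.7
payoff_b = periodic_payoffs(10, 4, n)  # Cesàro limit 0.4
# Illustrative step sequence only (an assumption, not taken from the paper).
steps = [1.0 / (k + 2) ** 0.5 for k in range(n)]

path = narendra_bandit(payoff_a, payoff_b, steps)
print("final probability of playing arm A:", path[-1])
```

Because each step size lies in (0, 1) and the payoffs are 0 or 1, `x` stays in [0, 1] throughout. The only randomness is in the choice of arm, which matches the setting of deterministic payoffs and a stochastic procedure.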

Keywords

two-armed bandit, convergence, ergodicity, stochastic algorithms

Domains

Probability [math.PR]


https://hal.science/hal-00796207

Submitted on: Friday, March 1, 2013, 22:01:34

Last modified on: Thursday, March 14, 2024, 03:09:57

Dates and versions

hal-00796207 , version 1 (01-03-2013)

Identifiers

HAL Id : hal-00796207 , version 1
ARXIV : 0905.0463
DOI : 10.1214/10-AAP751

Cite

Pierre Tarrès, Pierre Vandekerkhove. On ergodic two-armed bandits. The Annals of Applied Probability, 2012, 22 (2), pp.457-476. ⟨10.1214/10-AAP751⟩. ⟨hal-00796207⟩


Collections

UNIV-TLSE2 CNRS UNIV-MLV INSA-TOULOUSE IMT LAMA_UMR8050 CV_LAMA_UMR8050 LAMA_PS UPEC UT1-CAPITOLE INSA-GROUPE UNIV-EIFFEL UNIV-UT3 UT3-TOULOUSEINP

