Sharing Information in Adversarial Bandit

David L. Saint-Pierre; Olivier Teytaud

doi:10.1007/978-3-662-45523-4_32

Conference paper, Year: 2014



David L. Saint-Pierre

Role: Author

Laboratoire de Recherche en Informatique

Machine Learning and Optimisation

Olivier Teytaud

Role: Author

Machine Learning and Optimisation

Laboratoire de Recherche en Informatique

Abstract

2-Player games in general provide a popular platform for research in Artificial Intelligence (AI). One of the main challenges coming from this platform is approximating a Nash Equilibrium (NE) over zero-sum matrix games. While the problem of computing such a Nash Equilibrium is solvable in polynomial time using Linear Programming (LP), it rapidly becomes infeasible to solve as the size of the matrix grows; a situation commonly encountered in games. This paper focuses on improving the approximation of a NE for matrix games such that it outperforms the state-of-the-art algorithms given a finite (and rather small) number T of oracle requests to rewards. To reach this objective, we propose to share information between the different relevant pure strategies. We show both theoretically by improving the bound and empirically by experiments on artificial matrices and on a real-world game that information sharing leads to an improvement of the approximation of the NE.
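To make the setting in the abstract concrete, the sketch below approximates a NE of a small zero-sum matrix game using exactly T oracle requests to rewards. It uses plain Exp3 in self-play, a standard adversarial-bandit baseline, and is not the information-sharing method proposed in the paper. The matrix `RPS`, the function names `exp3_self_play` and `exploitability`, and the exploration-rate formula are illustrative assumptions, not taken from the source.

```python
import math
import random

# Rock-paper-scissors payoffs for the row player, rescaled to [0, 1]
# (1 = win, 0.5 = draw, 0 = loss). Illustrative matrix, not from the paper.
RPS = [
    [0.5, 0.0, 1.0],
    [1.0, 0.5, 0.0],
    [0.0, 1.0, 0.5],
]


def _gamma(k, T):
    # A common Exp3 exploration rate (assumption: horizon T known in advance).
    return min(1.0, math.sqrt(k * math.log(k) / ((math.e - 1) * T)))


def _probs(scores, gamma):
    # Softmax of the cumulative scores, mixed with uniform exploration.
    m = max(scores)
    w = [math.exp(s - m) for s in scores]  # subtract max for stability
    total = sum(w)
    k = len(scores)
    return [(1 - gamma) * wi / total + gamma / k for wi in w]


def exp3_self_play(M, T, seed=0):
    """Run Exp3 for both players; return empirical mixed strategies."""
    rng = random.Random(seed)
    k_row, k_col = len(M), len(M[0])
    g_row, g_col = _gamma(k_row, T), _gamma(k_col, T)
    row_scores, col_scores = [0.0] * k_row, [0.0] * k_col
    row_counts, col_counts = [0] * k_row, [0] * k_col
    for _ in range(T):
        p = _probs(row_scores, g_row)
        q = _probs(col_scores, g_col)
        i = rng.choices(range(k_row), weights=p)[0]
        j = rng.choices(range(k_col), weights=q)[0]
        r = M[i][j]  # one oracle request to the reward
        # Importance-weighted updates: only the played arms are updated.
        row_scores[i] += g_row * (r / p[i]) / k_row
        col_scores[j] += g_col * ((1 - r) / q[j]) / k_col
        row_counts[i] += 1
        col_counts[j] += 1
    return [c / T for c in row_counts], [c / T for c in col_counts]


def exploitability(M, x, y):
    """Gap between best-response values; 0 exactly at a Nash Equilibrium."""
    row_best = max(sum(M[i][j] * y[j] for j in range(len(y)))
                   for i in range(len(x)))
    col_best = min(sum(x[i] * M[i][j] for i in range(len(x)))
                   for j in range(len(y)))
    return row_best - col_best
```

Calling `exp3_self_play(RPS, T=2000)` returns two frequency vectors, and `exploitability(RPS, x, y)` measures how far they are from equilibrium. In this baseline each reward updates only the arm that was played, which is the limitation that sharing information between pure strategies is meant to address.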

Keywords

Bandit Problem; Games; Monte-Carlo; Nash Equilibrium

Domains

Optimization and Control [math.OC]

Main file

sharinginfo (1).pdf (282.28 KB)

Origin: Files produced by the author(s)

Contributor: Olivier Teytaud

https://inria.hal.science/hal-01116716

Submitted on: Tuesday, 17 February 2015, 09:23:50

Last modified on: Monday, 12 February 2024, 09:48:04

Long-term archiving on: Monday, 18 May 2015, 10:05:56

Dates and versions

hal-01116716, version 1 (17-02-2015)

Identifiers

HAL Id: hal-01116716, version 1
DOI: 10.1007/978-3-662-45523-4_32

Cite

David L. Saint-Pierre, Olivier Teytaud. Sharing Information in Adversarial Bandit. EvoGames 2014, Apr 2014, Granada, Spain. ⟨10.1007/978-3-662-45523-4_32⟩. ⟨hal-01116716⟩


Collections

EC-PARIS CNRS INRIA INSMI UMR8623 INRIA2 LRI-AO TDS-MACS UNIV-PARIS-SACLAY

182 views

179 downloads
