A multiple-play bandit algorithm applied to recommender systems

Jonathan Louëdec; Max Chevalier; Josiane Mothe; Aurélien Garivier; Sébastien Gerchinovitz

Conference paper, Year: 2015

Jonathan Louëdec

Role: Author
PersonId : 1100022

Systèmes d’Informations Généralisées

Université Toulouse III - Paul Sabatier

Max Chevalier

Role: Author
PersonId : 735509
IdHAL : max-chevalier
ORCID : 0000-0001-5402-6255
IdRef : 069989699

Systèmes d’Informations Généralisées

Université Toulouse III - Paul Sabatier

Josiane Mothe

Role: Author
PersonId : 735149
IdHAL : josianemothe
ORCID : 0000-0001-9273-2193
IdRef : 087097222

Systèmes d’Informations Généralisées

Université Toulouse - Jean Jaurès

Aurélien Garivier

Role: Author

Institut de Mathématiques de Toulouse UMR5219

Sébastien Gerchinovitz

Role: Author

Institut de Mathématiques de Toulouse UMR5219

Abstract

For several web tasks such as ad placement or e-commerce, recommender systems must recommend multiple items to their users; such problems can be modeled as bandits with multiple plays. State-of-the-art methods require running as many single-play bandit algorithms as there are items to recommend. In contrast, recent theoretical work in the machine learning literature has designed new algorithms that address the multiple-play case directly, and these algorithms have been proved to have strong theoretical guarantees. In this paper we compare one such multiple-play algorithm with previous methods. We show on two real-world datasets that the multiple-play algorithm we use converges to equivalent values but learns about three times faster than state-of-the-art methods. We also show that carefully adapting these earlier methods can improve their performance.
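The abstract does not name the algorithms being compared. As a rough illustration of the baseline it describes (one single-play bandit per item to recommend), the Python sketch below assumes an epsilon-greedy learner for each recommendation slot and a simulated user with fixed per-item click probabilities. The names `EpsilonGreedySlot`, `recommend`, and `simulate`, the click model, and the "at least one click" success measure are all hypothetical choices made for this sketch and are not taken from the paper.

```python
import random


class EpsilonGreedySlot:
    """A single-play bandit controlling one recommendation slot (assumed learner)."""

    def __init__(self, n_items, epsilon, rng):
        self.counts = [0] * n_items    # number of times each item was shown in this slot
        self.values = [0.0] * n_items  # running mean click rate of each item in this slot
        self.epsilon = epsilon
        self.rng = rng

    def select(self, excluded):
        # Never repeat an item already chosen by an earlier slot.
        candidates = [i for i in range(len(self.counts)) if i not in excluded]
        if self.rng.random() < self.epsilon:
            return self.rng.choice(candidates)                 # explore
        return max(candidates, key=lambda i: self.values[i])   # exploit

    def update(self, item, reward):
        # Incremental update of the running mean.
        self.counts[item] += 1
        self.values[item] += (reward - self.values[item]) / self.counts[item]


def recommend(slots):
    """Ask each slot in turn for one item, so the list has no duplicates."""
    chosen = []
    for slot in slots:
        chosen.append(slot.select(excluded=set(chosen)))
    return chosen


def simulate(click_probs, n_slots, n_rounds, epsilon=0.1, seed=0):
    """Return the fraction of rounds in which the simulated user clicked at least once."""
    rng = random.Random(seed)
    slots = [EpsilonGreedySlot(len(click_probs), epsilon, rng) for _ in range(n_slots)]
    satisfied = 0
    for _ in range(n_rounds):
        items = recommend(slots)
        # Hypothetical user model: independent clicks with fixed probabilities.
        clicks = [1 if rng.random() < click_probs[i] else 0 for i in items]
        for slot, item, click in zip(slots, items, clicks):
            slot.update(item, click)
        satisfied += 1 if any(clicks) else 0
    return satisfied / n_rounds


if __name__ == "__main__":
    print(simulate([0.1, 0.5, 0.2, 0.05, 0.3], n_slots=2, n_rounds=2000))
```

In this sketch the number of learners grows with the number of slots, which is the cost the abstract attributes to earlier methods. A multiple-play algorithm would instead keep a single learner that selects the whole set of items at once.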

Keywords

Information retrieval; Reinforcement learning; Recommender systems; Bandit problems

Domains

Information Theory [cs.IT]; Information Retrieval [cs.IR]

Main file

louedec_18744.pdf (314.65 KB)

Origin: Files produced by the author(s)

Contributor: Open Archive Toulouse Archive Ouverte (OATAO)

https://hal.science/hal-04077707

Submitted on: Friday, April 21, 2023, 16:05:49

Last modified on: Monday, November 20, 2023, 11:44:19

Long-term archiving on: Saturday, July 22, 2023, 19:02:13

Dates and versions

hal-04077707 , version 1 (21-04-2023)

Identifiers

HAL Id : hal-04077707 , version 1
OATAO : 18744

Cite

Jonathan Louëdec, Max Chevalier, Josiane Mothe, Aurélien Garivier, Sébastien Gerchinovitz. A multiple-play bandit algorithm applied to recommender systems. 28th International Florida Artificial Intelligence Research Society (FLAIRS 2015), May 2015, Hollywood, United States. pp.67-72. ⟨hal-04077707⟩

Collections

UNIV-TLSE2 CNRS INSA-TOULOUSE IMT SMS UT1-CAPITOLE INSA-GROUPE IRIT IRIT-SIG IRIT-GD TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP
