The price of unfairness in linear bandits with biased feedback

Solenne Gaucher; Alexandra Carpentier; Christophe Giraud

Pré-Publication, Document De Travail Année : 2022

The price of unfairness in linear bandits with biased feedback

(1, 2) , (3) , (1, 2)

1
2
3

Solenne Gaucher

Fonction : Auteur
PersonId : 1079870

Laboratoire de Mathématiques d'Orsay

Statistique mathématique et apprentissage

Alexandra Carpentier

Fonction : Auteur

University of Potsdam = Universität Potsdam

Christophe Giraud

Fonction : Auteur
PersonId : 1102992
IdHAL : cgiraud
ORCID : 0009-0004-1836-5742

Laboratoire de Mathématiques d'Orsay

Statistique mathématique et apprentissage

Résumé

Artificial intelligence is increasingly used in a wide range of decision making scenarios with higher and higher stakes. At the same time, recent work has highlighted that these algorithms can be dangerously biased, and that their results often need to be corrected to avoid leading to unfair decisions. In this paper, we study the problem of sequential decision making with biased linear bandit feedback. At each round, a player selects an action described by a covariate and by a sensitive attribute. She receives a reward corresponding to the covariates of the action that she has chosen, but only observe a biased evaluation of this reward, where the bias depends on the sensitive attribute. To tackle this problem, we design a Fair Phased Elimination algorithm. We establish an upper bound on its worst-case regret, showing that it is smaller than Cκ 1/3 * log(T) 1/3 T 2/3 , where C is a numerical constant and κ * an explicit geometrical constant characterizing the difficulty of bias estimation. The worst case regret is higher than the dT 1/2 log(T) regret rate obtained under unbiased feedback. We show that this rate cannot be improved for all instances : we obtain lower bounds on the worst-case regret for some sets of actions showing that this rate is tight up to a sub-logarithmic factor. We also obtain gap-dependent upper bounds on the regret, and establish matching lower bounds for some problem instance. Interestingly, the gap-dependent rates reveal the existence of non-trivial instances where the problem is no more difficult than its unbiased counterpart.

Domaines

Statistiques [math.ST] Machine Learning [stat.ML]

Fichier principal

main.pdf (718.52 Ko)

Origine : Fichiers produits par l'(les) auteur(s)

Solenne Gaucher : Connectez-vous pour contacter le contributeur

https://hal.science/hal-03611628

Soumis le : jeudi 17 mars 2022-11:43:13

Dernière modification le : mardi 12 mars 2024-07:10:03

Dates et versions

hal-03611628 , version 1 (17-03-2022)

hal-03611628 , version 2 (01-06-2022)

hal-03611628 , version 3 (29-11-2022)

Identifiants

HAL Id : hal-03611628 , version 1
ARXIV : 2203.09784

Citer

Solenne Gaucher, Alexandra Carpentier, Christophe Giraud. The price of unfairness in linear bandits with biased feedback. 2022. ⟨hal-03611628v1⟩

Exporter

BibTeX XML-TEI Dublin Core DC Terms EndNote DataCite

217 Consultations

106 Téléchargements

The price of unfairness in linear bandits with biased feedback

Résumé

Domaines

Dates et versions

Identifiants

Citer

Exporter

Altmetric

Partager