Judging competitions and benchmarks: a candidate election approach - INRIA - Institut National de Recherche en Informatique et en Automatique
Conference paper, Year: 2021

Judging competitions and benchmarks: a candidate election approach

Abstract

Machine learning progress relies on algorithm benchmarks. We study the problem of declaring a winner, or ranking "candidate" algorithms, based on results obtained by "judges" (scores on various tasks). Inspired by social science and game theory on fair elections, we compare various ranking functions, ranging from simple score averaging to Condorcet methods. We devise novel empirical criteria to assess the quality of ranking functions, including the generalization to new tasks and the stability under judge or candidate perturbation. We conduct an empirical comparison on the results of 5 competitions and benchmarks (one artificially generated). While prior theoretical analyses indicate that no single ranking function satisfies all desired properties, our empirical study reveals that the classical "average rank" method fares well. However, some pairwise comparison methods can get better empirical results.
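To make the contrast between score averaging, average rank, and pairwise (Condorcet-style) methods concrete, here is a minimal sketch. It is not the authors' code: the score matrix is invented for illustration, the function names are hypothetical, and Copeland's method is used only as one familiar example of a Condorcet-consistent rule.

```python
from itertools import combinations

# Hypothetical data (not from the paper): scores[c][j] is the score that
# judge (task) j gives to candidate (algorithm) c. Higher is better.
scores = {
    "A": [0.99, 0.50, 0.52],
    "B": [0.60, 0.55, 0.56],
    "C": [0.55, 0.40, 0.50],
}

def mean_score(scores):
    """Simple score averaging: higher mean is better."""
    return {c: sum(s) / len(s) for c, s in scores.items()}

def average_rank(scores):
    """Mean rank across judges (1 = best); lower is better.
    Tied candidates share the mean of the ranks they occupy."""
    n_judges = len(next(iter(scores.values())))
    totals = {c: 0.0 for c in scores}
    for j in range(n_judges):
        for c in scores:
            better = sum(scores[o][j] > scores[c][j] for o in scores)
            tied = sum(scores[o][j] == scores[c][j] for o in scores) - 1
            totals[c] += 1 + better + 0.5 * tied
    return {c: t / n_judges for c, t in totals.items()}

def copeland(scores):
    """Copeland score: pairwise majority wins minus losses; higher is better."""
    result = {c: 0 for c in scores}
    for a, b in combinations(scores, 2):
        a_wins = sum(x > y for x, y in zip(scores[a], scores[b]))
        b_wins = sum(y > x for x, y in zip(scores[a], scores[b]))
        if a_wins > b_wins:
            result[a] += 1
            result[b] -= 1
        elif b_wins > a_wins:
            result[b] += 1
            result[a] -= 1
    return result

means = mean_score(scores)
ranks = average_rank(scores)
winner_by_mean = max(means, key=means.get)
winner_by_rank = min(ranks, key=ranks.get)
winner_by_copeland = max(copeland(scores), key=copeland(scores).get)
print(winner_by_mean, winner_by_rank, winner_by_copeland)
```

On this toy matrix, score averaging favors A because of one very high score, while average rank and Copeland both favor B, which a majority of judges prefer to each rival. This illustrates why the choice of ranking function can change the declared winner.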
Main file
Judging_Competitions_ESANN_HAL.pdf (362.45 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03367857, version 1 (06-10-2021)
hal-03367857, version 2 (02-12-2021)
hal-03367857, version 3 (06-01-2022)

Identifiers

  • HAL Id: hal-03367857, version 2

Cite

Adrien Pavao, Michael Vaccaro, Isabelle Guyon. Judging competitions and benchmarks: a candidate election approach. ESANN 2021 - 29th European Symposium on Artificial Neural Networks, Oct 2021, Bruges, Belgium. ⟨hal-03367857v2⟩
184 views
188 downloads