Judging competitions and benchmarks: a candidate election approach

Adrien Pavao; Michael Vaccaro; Isabelle Guyon

Conference paper, Year: 2021

Adrien Pavao

Role: Author
PersonId: 1049181
IdHAL: adrien-pavao
ORCID: 0000-0001-7374-5095

TAckling the Underspecified

Michael Vaccaro

Role: Author

Laboratoire Interdisciplinaire des Sciences du Numérique

Isabelle Guyon

Role: Author

TAckling the Underspecified

Abstract

Machine learning progress relies on algorithm benchmarks. We study the problem of declaring a winner, or ranking "candidate" algorithms, based on results obtained by "judges" (scores on various tasks). Inspired by social science and game theory on fair elections, we compare various ranking functions, ranging from simple score averaging to Condorcet methods. We devise novel empirical criteria to assess the quality of ranking functions, including the generalization to new tasks and the stability under judge or candidate perturbation. We conduct an empirical comparison on the results of 5 competitions and benchmarks (one artificially generated). While prior theoretical analyses indicate that no single ranking function satisfies all desired properties, our empirical study reveals that the classical "average rank" method fares well. However, some pairwise comparison methods can get better empirical results.
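To make the setting concrete, here is a minimal sketch of three ranking functions of the kind the abstract mentions: mean score, average rank, and one pairwise comparison method. The score matrix is invented for illustration, and the pairwise method shown is Copeland's rule, chosen here only as a simple Condorcet-consistent example; it is an assumption, not necessarily one of the methods the authors evaluate. All function names are hypothetical.

```python
from typing import List

# Hypothetical score matrix: rows are judges (tasks), columns are candidates
# (algorithms). Higher is better. These numbers are invented for illustration.
SCORES = [
    [0.90, 0.80, 0.70],
    [0.60, 0.85, 0.65],
    [0.75, 0.70, 0.72],
]


def ranks_within_judge(row: List[float]) -> List[float]:
    """Rank candidates for one judge (1 = best); ties share the average rank."""
    ranks = []
    for x in row:
        better = sum(1 for y in row if y > x)
        equal = sum(1 for y in row if y == x)  # includes x itself
        ranks.append(better + (equal + 1) / 2)
    return ranks


def mean_score(scores: List[List[float]]) -> List[float]:
    """Average each candidate's raw score over all judges (higher is better)."""
    return [sum(col) / len(scores) for col in zip(*scores)]


def average_rank(scores: List[List[float]]) -> List[float]:
    """Average each candidate's per-judge rank (lower is better)."""
    per_judge = [ranks_within_judge(row) for row in scores]
    return [sum(col) / len(scores) for col in zip(*per_judge)]


def copeland(scores: List[List[float]]) -> List[float]:
    """Copeland score: pairwise majority wins, with 0.5 per pairwise tie."""
    n = len(scores[0])
    wins = [0.0] * n
    for a in range(n):
        for b in range(n):
            if a == b:
                continue
            a_pref = sum(1 for row in scores if row[a] > row[b])
            b_pref = sum(1 for row in scores if row[b] > row[a])
            if a_pref > b_pref:
                wins[a] += 1.0
            elif a_pref == b_pref:
                wins[a] += 0.5
    return wins


if __name__ == "__main__":
    print("mean score  :", mean_score(SCORES))
    print("average rank:", average_rank(SCORES))
    print("Copeland    :", copeland(SCORES))
```

On this toy matrix the mean score favours the second candidate, while average rank and Copeland both favour the first, which illustrates why the choice of ranking function matters when declaring a winner.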

Subject areas

Artificial Intelligence [cs.AI]; Statistics [math.ST]; Machine Learning [stat.ML]

Main file

Judging_Competitions_ESANN_HAL.pdf (362.45 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-03367857

Submitted on: Thursday, December 2, 2021, 12:48:47

Last modified on: Friday, March 24, 2023, 14:53:24

Dates and versions

hal-03367857 , version 1 (06-10-2021)

hal-03367857 , version 2 (02-12-2021)

hal-03367857 , version 3 (06-01-2022)

Identifiers

HAL Id : hal-03367857 , version 2

Cite

Adrien Pavao, Michael Vaccaro, Isabelle Guyon. Judging competitions and benchmarks: a candidate election approach. ESANN 2021 - 29th European Symposium on Artificial Neural Networks, Oct 2021, Bruges, Belgium. ⟨hal-03367857v2⟩

184 views

188 downloads
