Journal article, Machine Learning, 2013

Tune and mix: learning to rank using ensembles of calibrated multi-class classifiers

Abstract

In subset ranking, the goal is to learn a ranking function that approximates a gold standard partial ordering of a set of objects (in our case, a set of documents retrieved for the same query). The partial ordering is given by relevance labels representing the relevance of documents with respect to the query on an absolute scale. Our approach consists of three simple steps. First, we train standard multi-class classifiers (AdaBoost.MH and multi-class SVM) to discriminate between the relevance labels. Second, the posteriors of multi-class classifiers are calibrated using probabilistic and regression losses in order to estimate the Bayes-scoring function which optimizes the Normalized Discounted Cumulative Gain (NDCG). In the third step, instead of selecting the best multi-class hyperparameters and the best calibration, we mix all the learned models in a simple ensemble scheme. Our extensive experimental study is itself a substantial contribution. We compare most of the existing learning-to-rank techniques on all of the available large-scale benchmark data sets using a standardized implementation of the NDCG score. We show that our approach is competitive with conceptually more complex listwise and pairwise methods, and clearly outperforms them as the data size grows. As a technical contribution, we clarify some of the confusing results related to the ambiguities of the evaluation tools, and propose guidelines for future studies.
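The following is a minimal sketch of the tune-and-mix idea described above, not the authors' implementation: scikit-learn's AdaBoostClassifier and SVC stand in for AdaBoost.MH and the paper's multi-class SVM, the data are synthetic, the hyperparameter grid is purely illustrative, and documents are scored by their expected relevance under the calibrated posteriors before the scores of all models are averaged.

```python
# Sketch only: stand-in models, synthetic data, illustrative hyperparameters.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC
from sklearn.calibration import CalibratedClassifierCV
from sklearn.metrics import ndcg_score

# Synthetic "documents" with graded relevance labels 0..3, treated as one query.
X, y = make_classification(n_samples=400, n_informative=8, n_classes=4, random_state=0)
X_train, y_train, X_test, y_test = X[:300], y[:300], X[300:], y[300:]

# Step 1: train several multi-class classifiers over a (toy) hyperparameter grid.
base_models = [AdaBoostClassifier(n_estimators=n, random_state=0) for n in (50, 100)]
base_models += [SVC(C=c, probability=True, random_state=0) for c in (0.1, 1.0)]

# Step 2: calibrate the class posteriors (here: sigmoid and isotonic calibration).
calibrated = [
    CalibratedClassifierCV(m, method=method, cv=3).fit(X_train, y_train)
    for m in base_models
    for method in ("sigmoid", "isotonic")
]

# Score a document by its expected relevance under the calibrated posterior,
# one simple way to approximate a Bayes-optimal scoring function for NDCG.
relevance_levels = np.arange(4)
def expected_relevance(model, X):
    return model.predict_proba(X) @ relevance_levels

# Step 3: mix -- average the scores of all calibrated models instead of
# selecting the single best hyperparameter/calibration combination.
scores = np.mean([expected_relevance(m, X_test) for m in calibrated], axis=0)
print("NDCG@10:", ndcg_score(y_test.reshape(1, -1), scores.reshape(1, -1), k=10))
```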

Dates and versions

in2p3-00869803, version 1 (04-10-2013)

Identifiers

Cite

Róbert Busa-Fekete, Balázs Kégl, Tamás Éltető, György Szarvas. Tune and mix: learning to rank using ensembles of calibrated multi-class classifiers. Machine Learning, 2013, 93 (2-3), pp. 261-292. ⟨10.1007/s10994-013-5360-9⟩. ⟨in2p3-00869803⟩