Report (Research Report) Year: 2015

A Sequential Nonparametric Two-Sample Test

Un Test Non-paramétrique d'Homogénéité Séquentiel

Alix Lhéritier
Frédéric Cazals

Abstract

Given samples from two distributions, a nonparametric two-sample test aims to determine whether the two distributions are equal, based on a test statistic. This statistic may be computed on the whole dataset, or on a subset of the dataset by a function trained on its complement. We propose a third tier, consisting of functions that exploit a sequential framework to learn the differences while incrementally processing the data. Sequential processing naturally allows optional stopping, which makes our test the first truly sequential nonparametric two-sample test. We show that any sequential predictor can be turned into a sequential two-sample test for which a valid $p$-value can be computed, yielding controlled type I error. We further show that pointwise universal predictors yield consistent tests, and that such predictors can be built, in particular, from a nonparametric regressor based on $k$-nearest neighbors. Finally, we show that mixtures and switch distributions can be used to increase power while preserving consistency.
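The claim that a sequential predictor can be turned into a sequential test with a valid $p$-value can be illustrated with a minimal sketch. It is not taken from the report, and it rests on several assumptions: each observation arrives with a label (0 or 1) naming the sample it came from; under the null hypothesis the labels are i.i.d. Bernoulli with a known prior and independent of the observations; and the $p$-value is the running minimum of the likelihood ratio between the null label model and the predictor. The function names (`knn_label_prob`, `sequential_two_sample_test`, `demo_stream`), the smoothing rule, and the one-dimensional data are all hypothetical choices made for illustration.

```python
import math
import random


def knn_label_prob(history, x, k, prior=0.5):
    """Smoothed k-NN estimate of P(label = 1 | x) from past (x, label) pairs."""
    if not history:
        return prior
    nearest = sorted(history, key=lambda pair: abs(pair[0] - x))[:k]
    ones = sum(label for _, label in nearest)
    # Additive smoothing (an assumption of this sketch) keeps the estimate
    # strictly inside (0, 1), so its logarithm is always finite.
    return (ones + prior) / (len(nearest) + 1.0)


def sequential_two_sample_test(stream, alpha=0.05, k=5, prior=0.5):
    """Process (x, label) pairs one at a time; stop as soon as p <= alpha.

    Returns (rejected, n_used, p_value).
    """
    history = []
    log_ratio = 0.0  # log of (null likelihood / predictor likelihood) of labels
    p_value = 1.0
    n = 0
    for n, (x, label) in enumerate(stream, start=1):
        # Predict the label of x using only previously seen pairs.
        q1 = knn_label_prob(history, x, k, prior)
        q = q1 if label == 1 else 1.0 - q1
        p0 = prior if label == 1 else 1.0 - prior
        log_ratio += math.log(p0) - math.log(q)
        # Running minimum of the likelihood ratio, capped at 1.
        p_value = min(p_value, math.exp(min(log_ratio, 0.0)))
        history.append((x, label))
        if p_value <= alpha:
            return True, n, p_value
    return False, n, p_value


def demo_stream(n, shift, seed=0):
    """Hypothetical data: label 0 ~ N(0, 1), label 1 ~ N(shift, 1)."""
    rng = random.Random(seed)
    for _ in range(n):
        label = rng.randint(0, 1)
        yield rng.gauss(shift * label, 1.0), label


if __name__ == "__main__":
    print(sequential_two_sample_test(demo_stream(400, shift=4.0)))
    print(sequential_two_sample_test(demo_stream(400, shift=0.0)))
```

Under the stated null model, the reciprocal of the likelihood ratio is a nonnegative martingale with expectation 1, so by Ville's inequality the running minimum falls below `alpha` with probability at most `alpha`. This is why stopping at the first crossing does not inflate the type I error in this sketch. The brute-force neighbor search is quadratic overall and is only meant for small illustrative streams.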
Main file
RR-8704-v2.pdf (768.06 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01135608 , version 1 (25-03-2015)
hal-01135608 , version 2 (02-06-2015)

Identifiers

  • HAL Id : hal-01135608 , version 2

Cite

Alix Lhéritier, Frédéric Cazals. A Sequential Nonparametric Two-Sample Test. [Research Report] RR-8704, Inria. 2015, pp.18. ⟨hal-01135608v2⟩
