A note on replacing uniform subsampling by random projections in MCMC for linear regression of tall datasets - INRIA - Institut National de Recherche en Informatique et en Automatique
Preprint, working paper. Year: 2015

A note on replacing uniform subsampling by random projections in MCMC for linear regression of tall datasets

Abstract

New Markov chain Monte Carlo (MCMC) methods have been proposed to tackle inference with tall datasets, i.e., when the number n of data items is intractably large. A large class of these new MCMC methods is based on randomly subsampling the dataset at each MCMC iteration. We investigate whether random projections can replace this random subsampling for linear regression of big streaming data. In the latter setting, random projections have indeed become standard for non-Bayesian treatments. We isolate two issues that must be resolved for MCMC to apply to streaming regression: 1) a resampling issue: MCMC should access the same random projections across iterations to avoid keeping the whole dataset in memory; and 2) a budget issue: making individual MCMC acceptance decisions should require o(n) random projections. While the resampling issue can be satisfactorily tackled, current techniques in random projections and MCMC for tall data do not solve the budget issue, and may well end up showing that it cannot be solved.
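
To make the resampling issue concrete, here is a minimal Python sketch; it is an illustration under stated assumptions, not the authors' method. It assumes a Gaussian random projection S with i.i.d. N(0, 1/m) entries and uses hypothetical helper names (`sketch_stream`, `sketched_rss`, `full_rss`). The projected data S X and S y are accumulated in a single pass over the stream, so later MCMC iterations could reuse the same projections without storing the n original items.

```python
import math
import random


def sketch_stream(stream, m, seed=0):
    """Accumulate S @ X and S @ y over a stream of (x_i, y_i) pairs.

    S is an m x n matrix with i.i.d. N(0, 1/m) entries (an assumption of this
    sketch). Column i of S is drawn when item i arrives, so no item is stored.
    """
    rng = random.Random(seed)  # fixed seed: the same projection on every call
    SX, Sy = None, [0.0] * m
    for x_i, y_i in stream:
        if SX is None:
            SX = [[0.0] * len(x_i) for _ in range(m)]
        for k in range(m):
            s_ki = rng.gauss(0.0, 1.0) / math.sqrt(m)
            Sy[k] += s_ki * y_i
            for j, x_ij in enumerate(x_i):
                SX[k][j] += s_ki * x_ij
    return SX, Sy


def sketched_rss(theta, SX, Sy):
    """||S y - S X theta||^2, an unbiased estimate of ||y - X theta||^2."""
    total = 0.0
    for row, sy_k in zip(SX, Sy):
        r = sy_k - sum(a * t for a, t in zip(row, theta))
        total += r * r
    return total


def full_rss(theta, data):
    """Exact residual sum of squares; needs the whole dataset."""
    return sum((y - sum(a * t for a, t in zip(x, theta))) ** 2 for x, y in data)


# Synthetic data (hypothetical): n = 500 items, 2 covariates, unit noise.
data_rng = random.Random(1)
theta_true = [2.0, -1.0]
data = []
for _ in range(500):
    x = [data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)]
    y = sum(a * t for a, t in zip(x, theta_true)) + data_rng.gauss(0.0, 1.0)
    data.append((x, y))

SX, Sy = sketch_stream(data, m=100, seed=0)  # one pass, then the data can go
theta = [1.5, -0.5]                          # a candidate parameter value
exact = full_rss(theta, data)
approx = sketched_rss(theta, SX, Sy)
print(f"exact RSS: {exact:.1f}, sketched RSS: {approx:.1f}")
```

Here `sketched_rss` would stand in for the residual sum of squares inside a Gaussian log-likelihood. The example uses a fixed number m of projections and says nothing about how many are needed for a reliable acceptance decision, which is the budget issue the abstract leaves open.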
Main file
arxiv.pdf (508.84 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01248841, version 1 (29-12-2015)

Identifiers

  • HAL Id: hal-01248841, version 1

Cite

Rémi Bardenet, Odalric-Ambrym Maillard. A note on replacing uniform subsampling by random projections in MCMC for linear regression of tall datasets. 2015. ⟨hal-01248841⟩
