Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly - INRIA - Institut National de Recherche en Informatique et en Automatique
Preprint, Working Paper. Year: 2018

Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly

Benjamin Guedj
Le Li
  • Role: Author
  • PersonId: 975837

Abstract

When confronted with massive data streams, summarizing data with dimension reduction methods such as PCA raises theoretical and algorithmic pitfalls. Principal curves act as a nonlinear generalization of PCA and the present paper proposes a novel algorithm to automatically and sequentially learn principal curves from data streams. We show that our procedure is supported by regret bounds with optimal sublinear remainder terms. A greedy local search implementation that incorporates both sleeping experts and multi-armed bandit ingredients is presented, along with its regret bound and performance on a toy example and seismic data.
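To make the idea of sequentially choosing a curve and measuring regret concrete, here is a minimal sketch. It is not the authors' algorithm: it assumes a small, fixed set of candidate polygonal lines (hypothetical `candidates`), a squared-distance loss, and a plain exponentially weighted forecaster with full feedback, whereas the paper's implementation relies on a greedy local search with sleeping experts and bandit ingredients. All function names and the learning rate `eta` are illustrative assumptions.

```python
import math
import random


def sq_dist_to_segment(p, a, b):
    """Squared Euclidean distance from the 2-D point p to the segment [a, b]."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        # Projection parameter of p onto the line (a, b), clamped to the segment.
        t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
        t = max(0.0, min(1.0, t))
    cx, cy = ax + t * dx, ay + t * dy
    return (px - cx) ** 2 + (py - cy) ** 2


def sq_dist_to_polyline(p, vertices):
    """Squared distance from p to a polygonal line with at least two vertices."""
    return min(sq_dist_to_segment(p, a, b) for a, b in zip(vertices, vertices[1:]))


def exp_weights_stream(stream, candidates, eta=1.0, seed=0):
    """Illustrative exponentially weighted forecaster over candidate curves.

    At each round a candidate is drawn with probability proportional to
    exp(-eta * cumulative loss), then the new point is revealed and every
    candidate's loss is updated (full-information feedback, an assumption).
    Returns (index of best fixed candidate, learner's loss, cumulative losses).
    """
    rng = random.Random(seed)
    cum_loss = [0.0] * len(candidates)
    learner_loss = 0.0
    for x in stream:
        shift = min(cum_loss)  # subtract the minimum for numerical stability
        weights = [math.exp(-eta * (loss - shift)) for loss in cum_loss]
        choice = rng.choices(range(len(candidates)), weights=weights)[0]
        losses = [sq_dist_to_polyline(x, c) for c in candidates]
        learner_loss += losses[choice]
        cum_loss = [total + loss for total, loss in zip(cum_loss, losses)]
    best = min(range(len(candidates)), key=cum_loss.__getitem__)
    return best, learner_loss, cum_loss
```

In this sketch the regret after the stream ends is `learner_loss - min(cum_loss)`, that is, the learner's cumulative loss minus that of the best fixed candidate in hindsight. A regret bound with a sublinear remainder term means this gap grows more slowly than the number of observations.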
Main file
main-pcurves.pdf (1.23 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01796011, version 1 (18-05-2018)
hal-01796011, version 2 (08-05-2019)

Identifiers

  • HAL Id: hal-01796011, version 1

Cite

Benjamin Guedj, Le Li. Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly. 2018. ⟨hal-01796011v1⟩