High-dimensional approximate nearest neighbor: k -d Generalized Randomized Forests - INRIA - Institut National de Recherche en Informatique et en Automatique Accéder directement au contenu
Pré-Publication, Document De Travail Année : 2016

High-dimensional approximate nearest neighbor: k -d Generalized Randomized Forests

Résumé

We propose a new data-structure, the generalized randomized k-d forest, or k-d GeRaF, for approximate nearest neighbor searching in high dimensions. In particular, we introduce new ran-domization techniques to specify a set of independently constructed trees where search is performed simultaneously, hence increasing accuracy. We omit backtracking, and we optimize distance computations , thus accelerating queries. We release public domain software GeRaF and we compare it to existing implementations of state-of-the-art methods including BBD-trees, Locality Sensitive Hashing, randomized k-d forests, and product quantization. Experimental results indicate that our method would be the method of choice in dimensions around 1,000, and probably up to 10,000, and pointsets of cardinality up to a few hundred thousands or even one million; this range of inputs is encountered in many critical applications today. For instance, we handle a real dataset of 10 6 images represented in 960 dimensions with a query time of less than 1sec on average and 90% responses being true nearest neighbors.
Fichier principal
Vignette du fichier
1603.09596.pdf (199.23 Ko) Télécharger le fichier
Origine : Fichiers produits par l'(les) auteur(s)
Loading...

Dates et versions

hal-02370318 , version 1 (19-11-2019)

Identifiants

Citer

Yannis Avrithis, Ioannis Z. Emiris, Giorgos Samaras. High-dimensional approximate nearest neighbor: k -d Generalized Randomized Forests. 2016. ⟨hal-02370318⟩
41 Consultations
93 Téléchargements

Altmetric

Partager

Gmail Facebook X LinkedIn More