A Comprehensive Survey and Experimental Comparison of Graph-Based Approximate Nearest Neighbor Search
@article{Wang2021ACS,
  title={A Comprehensive Survey and Experimental Comparison of Graph-Based Approximate Nearest Neighbor Search},
  author={Mengzhao Wang and Xiaoliang Xu and Qiang Yue and Yuxiang Wang},
  journal={Proc. VLDB Endow.},
  year={2021},
  volume={14},
  pages={1964-1978}
}
Approximate nearest neighbor search (ANNS) constitutes an important operation in a multitude of applications, including recommendation systems, information retrieval, and pattern recognition. In the past decade, graph-based ANNS algorithms have been the leading paradigm in this domain, with dozens of graph-based ANNS algorithms proposed. Such algorithms aim to provide effective, efficient solutions for retrieving the nearest neighbors for a given query. Nevertheless, these efforts focus on…
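As a rough illustration of what retrieving nearest neighbors over a graph index can look like, the following minimal sketch builds a brute-force k-NN graph and runs a best-first search over it. It is not taken from the survey or from any specific algorithm it compares: the function names (`build_knn_graph`, `best_first_search`), the `ef` pool-size parameter, and the choice of a plain k-NN graph are all assumptions made for this example.

```python
import heapq
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_knn_graph(points, k):
    """Brute-force k-NN graph (illustrative stand-in for a real index):
    each node gets directed edges to its k closest other nodes."""
    graph = {}
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: euclidean(p, points[j]))
        graph[i] = others[:k]
    return graph


def best_first_search(points, graph, query, entry, ef, k):
    """Best-first routing from `entry`: repeatedly expand the closest
    unexpanded candidate, keeping a pool of the `ef` best nodes seen.
    Stops when the closest candidate cannot improve a full pool."""
    d0 = euclidean(query, points[entry])
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap ordered by distance
    results = [(-d0, entry)]     # max-heap via negated distance
    while candidates:
        d, node = heapq.heappop(candidates)
        if len(results) >= ef and d > -results[0][0]:
            break
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = euclidean(query, points[nb])
            if len(results) < ef or dn < -results[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(results, (-dn, nb))
                if len(results) > ef:
                    heapq.heappop(results)  # evict the current worst
    ranked = sorted((-nd, n) for nd, n in results)
    return [n for _, n in ranked[:k]]
```

With `points` as a list of coordinate tuples, a call such as `best_first_search(points, build_knn_graph(points, 4), query, entry=0, ef=5, k=3)` returns up to three node indices ordered by distance to `query`. The result is approximate in general: on a sparse or poorly connected graph the search can stop at a local optimum.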
30 Citations
Two-stage routing with optimized guided search and greedy algorithm on proximity graph
- Computer Science
- Knowl. Based Syst.
- 2021
FreshDiskANN: A Fast and Accurate Graph-Based ANN Index for Streaming Similarity Search
- Computer Science
- ArXiv
- 2021
This paper presents the first graph-based ANNS index that reflects corpus updates into the index in real-time without compromising on search performance, and designs FreshDiskANN, a system that can index over a billion points on a workstation with an SSD and limited memory.
Tao: A Learning Framework for Adaptive Nearest Neighbor Search using Static Features Only
- Computer Science
- ArXiv
- 2021
Tao, a general learning framework for Terminating ANN queries Adaptively using Only static features, is developed; it achieves up to a 2.69x speedup even compared with its counterpart, at the same high accuracy targets.
A new compressed cover tree guarantees a near linear parameterized complexity for all $k$-nearest neighbors search in metric spaces
- Computer Science, Mathematics
- ArXiv
- 2021
This paper describes typical examples in which past cover trees need O(n) iterations, so that the overall worst-case time complexity remains quadratic, as for a brute-force search.
A Survey on Deep Reinforcement Learning for Data Processing and Analytics
- Computer Science
- IEEE Transactions on Knowledge and Data Engineering
- 2022
This work provides a comprehensive review of recent works focusing on utilizing DRL to improve data processing and analytics, and presents an introduction to key concepts, theories, and methods in DRL.
VStore: in-storage graph based vector search accelerator
- Computer Science
- DAC
- 2022
VStore is presented, a graph-based vector search solution that collaboratively optimizes accuracy, latency, memory, and data movement on large-scale vector data based on in-storage computing and exhibits significant search efficiency improvement and energy reduction.
Navigable Proximity Graph-Driven Native Hybrid Queries with Structured and Unstructured Constraints
- Computer Science
- ArXiv
- 2022
This paper proposes a native hybrid query (NHQ) framework based on proximity graph (PG), which provides the specialized composite index and joint pruning modules for hybrid queries, and presents two novel navigable PGs with optimized edge selection and routing strategies, which obtain better overall performance than existing PGs.
LAN: Learning-based Approximate k-Nearest Neighbor Search in Graph Databases
- Computer Science
- 2022 IEEE 38th International Conference on Data Engineering (ICDE)
- 2022
This paper proposes a learning-based k-ANN search method to reduce NDC and proposes a compressed GNN-graph to accelerate the neighbor ranking model and the initial node selection model, and proves that learning efficiency is improved without degrading the accuracy.
Survey on Exact kNN Queries over High-Dimensional Data Space
- Computer Science
- Sensors
- 2023
This paper focuses on exact kNN queries and presents a comprehensive survey of exact approaches over high-dimensional data space, which covers 20 kNN search methods and 9 kNN join methods and categorises the algorithms by indexing strategy, data and space partitioning technique, and computing paradigm.
Automating Nearest Neighbor Search Configuration with Constrained Optimization
- Computer Science
- ArXiv
- 2023
The approximate nearest neighbor (ANN) search problem is fundamental to efficiently serving many real-world machine learning applications. A number of techniques have been developed for ANN search…
References
SHOWING 1-10 OF 111 REFERENCES
Approximate Nearest Neighbor Search on High Dimensional Data — Experiments, Analyses, and Improvement
- Computer Science
- IEEE Transactions on Knowledge and Data Engineering
- 2020
This paper presents a comprehensive experimental evaluation of many state-of-the-art methods for approximate nearest neighbor search and proposes a new method that empirically achieves both high query efficiency and high recall on the majority of the datasets under a wide range of settings.
Fast Approximate Nearest Neighbor Search With The Navigating Spreading-out Graph
- Computer Science
- Proc. VLDB Endow.
- 2019
A novel graph structure called the Monotonic Relative Neighborhood Graph (MRNG) is proposed, which guarantees very low search complexity (close to logarithmic time); the Navigating Spreading-out Graph (NSG), which approximates the MRNG, is proposed to further lower the indexing complexity and make the approach practical for billion-node ANNS problems.
A scalable solution to the nearest neighbor search problem through local-search methods on neighbor graphs
- Computer Science
- Pattern Anal. Appl.
- 2021
This paper introduces an algorithm to solve a nearest-neighbor query q by minimizing a kernel function defined by the distance from q to each object in the database, and provides two approaches to select edges in the graph's construction stage that limit memory footprint and reduce the number of free parameters simultaneously.
EFANNA : An Extremely Fast Approximate Nearest Neighbor Search Algorithm Based on kNN Graph
- Computer Science
- ArXiv
- 2016
EFANNA is described as the fastest algorithm so far for both approximate nearest neighbor graph construction and approximate nearest neighbor search; it combines the advantages of hierarchical-structure-based methods and nearest-neighbor-graph-based methods.
Hierarchical Clustering-Based Graphs for Large Scale Approximate Nearest Neighbor Search
- Computer Science
- Pattern Recognit.
- 2019
Satellite System Graph: Towards the Efficiency Up-Boundary of Graph-Based Approximate Nearest Neighbor Search
- Computer Science
- ArXiv
- 2019
Inspired by the message transfer mechanism of the communication satellite system, a new family of MSNETs is found, namely the Satellite System Graphs (SSG), which inherit the superior ANNS properties of the MSNET and try to ensure that the angles between edges are no smaller than a given value.
Improving Approximate Nearest Neighbor Search through Learned Adaptive Early Termination
- Computer Science
- SIGMOD Conference
- 2020
This work builds and trains gradient boosting decision tree models to learn and predict when to stop searching for a given query, applies the learned adaptive early termination to state-of-the-art ANN approaches, and evaluates the end-to-end performance on three million- to billion-scale datasets.
Graph based Nearest Neighbor Search: Promises and Failures
- Computer Science
- 2019
The hierarchical structure could not achieve "much better logarithmic complexity scaling" as it was claimed in the original paper, particularly on high dimensional cases, and it is found that similar high search speed efficiency could be achieved with the support of flat k-NN graph after graph diversification.
Multiattribute approximate nearest neighbor search based on navigable small world graph
- Computer Science
- Concurr. Comput. Pract. Exp.
- 2020
A novel approach for multiattribute ANNS based on the navigable small world (NSW) graph, called MA‐NSW, which guarantees efficiency and is defined in terms of arbitrary metric spaces (e.g., Euclidean distance and cosine similarity).
Query-driven iterated neighborhood graph search for large scale indexing
- Computer Science
- ACM Multimedia
- 2012
This paper presents a criterion to check if the local search over a neighborhood graph arrives at the local solution, and follows the iterated local search (ILS) strategy, widely-used in combinatorial optimization, to find a solution beyond a local optimum.