A Parallel Similarity Search in High Dimensional Metric Space Using M-Tree

Adil Alpkocak, Taner Danisman, Tuba Ulker
In this study, a parallel implementation of the M-tree for indexing high-dimensional metric spaces is elaborated and an optimal declustering technique is proposed. First, we define optimal declustering and develop an algorithm based on this definition. The proposed declustering algorithm considers both object proximity and the data load on disks/processors by executing a k-NN or a range query for each newly inserted object. We have tested our algorithm in a database containing randomly… 
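The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it describes: for each newly inserted object, run a range query, then place the object on the disk that holds the fewest of its neighbors, breaking ties by current load. The names `choose_disk`, `decluster`, and `radius`, the brute-force range query (a real system would query the M-tree), and the tie-breaking order are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Example metric; any function satisfying the metric axioms would do."""
    return math.dist(a, b)


def choose_disk(new_obj, placed, num_disks, radius, distance=euclidean):
    """Pick a disk for new_obj (hypothetical helper, not from the paper).

    placed is a list of (object, disk_id) pairs already assigned.
    The range query is brute force here purely to keep the sketch short.
    """
    # Proximity: how many range-query neighbors already live on each disk.
    neighbors_per_disk = Counter(
        disk for obj, disk in placed if distance(new_obj, obj) <= radius
    )
    # Load: how many objects each disk currently stores.
    load_per_disk = Counter(disk for _, disk in placed)
    # Prefer the disk with the fewest neighbors, then the lightest load,
    # then the lowest disk id so the result is deterministic.
    return min(
        range(num_disks),
        key=lambda d: (neighbors_per_disk[d], load_per_disk[d], d),
    )


def decluster(objects, num_disks, radius, distance=euclidean):
    """Assign objects to disks one by one, in insertion order."""
    placed = []
    for obj in objects:
        disk = choose_disk(obj, placed, num_disks, radius, distance)
        placed.append((obj, disk))
    return placed


if __name__ == "__main__":
    points = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5)]
    for obj, disk in decluster(points, num_disks=2, radius=1.0):
        print(obj, "-> disk", disk)
```

Under these assumptions, nearby objects tend to be spread across different disks, so a similarity query that touches one neighborhood can read from several disks in parallel, while the load tie-break keeps the disks roughly balanced.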
Parallel M-tree Based on Declustering Metric Objects using K-medoids Clustering
A new data declustering algorithm based on k-medoids clustering is proposed that can achieve static and dynamic load balance across multiple disks; the resulting parallel M-tree shows better k-NN query performance than the sequential version.
Efficient Processing of Nearest Neighbor Queries in Parallel Multimedia Databases
A data allocation method that achieves $O(\sqrt{n})$ query processing time in parallel settings is proposed, based on a complexity analysis of content-based retrieval when a clustering method is used.
Similarity search implementations for multi-core and many-core processors
Two new parallel implementations for range queries on Spaghettis data structures have been carried out: one of them on a many-core processor and the other one on a multi-core processor.
A Data Allocation Method for Efficient Content-Based Retrieval in Parallel Multimedia Databases
This paper proposes a data allocation method with an optimal number of clusters and nodes based on a complexity analysis of CBR, validates the method through experiments with different high-dimensional synthetic databases, and implements a query processing algorithm for full k nearest neighbors.
Large-Scale Similarity-Based Join Processing in Multimedia Databases
This paper presents efficient parallelization strategies for processing large-scale multimedia database operations by extending and parallelizing the GiST (Generalized Search Tree)-framework and integrating the parallelized framework into an Oracle 11g Multimedia Database using its extension mechanisms.
A GPU-Based Implementation for Range Queries on Spaghettis Data Structure
This paper adapts the Spaghettis structure to a GPU-based platform, showing significant time reductions with speed-up values close to 10.0%, and compares the sequential and GPU-based implementations to analyse performance.
A Shared Memory Parallel k-NN Query Algorithm for M-tree
A shared-memory parallel k-NN algorithm for the M-tree index structure, called SMP k-nn, is introduced in this paper; it takes full advantage of the SMP architecture and keeps good load balancing between threads.
Efficiency and Scalability Issues in Metric Access Methods
This chapter explains and proves by experiments that similarity searching is typically an expensive process which does not easily scale to very large volumes of data, thus distributed architectures able to exploit parallelism must be employed.
Towards an efficient static scheduling scheme for delivering queries to heterogeneous clusters in the similarity search problem
This paper addresses the problem of how to distribute the queries over the whole system in order to obtain the best performance, under the assumption of defining a heuristic that automatically provides the best distribution.


M-tree: An Efficient Access Method for Similarity Search in Metric Spaces
The results demonstrate that the M-tree indeed extends the domain of applicability beyond the traditional vector spaces, performs reasonably well in high-dimensional data spaces, and scales well in case of growing files.
Fast parallel similarity search in multimedia databases
This paper presents a new parallel method for fast nearest-neighbor search in high-dimensional feature spaces, which provides an almost linear speed-up and a constant scale-up, and outperforms the Hilbert approach by a factor of up to 5.
Similarity indexing with the SS-tree
  • David A. White, R. Jain
  • Computer Science
    Proceedings of the Twelfth International Conference on Data Engineering
  • 1996
This work describes the fundamental types of "similarity queries" that should be supported and proposes a new dynamic structure for similarity indexing called the similarity search tree or SS-tree, which performs better than the R*-tree in nearly every test.
R-trees: a dynamic index structure for spatial searching
A dynamic index structure called an R-tree is described which meets this need, and algorithms for searching and updating it are given and it is concluded that it is useful for current database systems in spatial applications.
The R*-tree: an efficient and robust access method for points and rectangles
The R*-tree is designed to incorporate a combined optimization of area, margin and overlap of each enclosing rectangle in the directory, and it clearly outperforms the existing R-tree variants.
Processing M-Tree with Parallel Resources
  • Proceedings of the 6th EDBT International Conference
  • 1998