An Indexing Approach for Representing Multimedia Objects in High-Dimensional Spaces Based on Expectation Maximization Algorithm

@inproceedings{Boccignone2005AnIA,
  title={An Indexing Approach for Representing Multimedia Objects in High-Dimensional Spaces Based on Expectation Maximization Algorithm},
  author={Giuseppe Boccignone and Vittorio Caggiano and Carmine Cesarano and Vincenzo Moscato and Lucio Sansone},
  booktitle={Multimedia Information Systems},
  year={2005}
}
In this paper we introduce a new indexing approach for representing multimedia object classes, generated by the Expectation Maximization (EM) clustering algorithm, in a balanced and dynamic tree structure. To this end, the EM algorithm has been modified so that each step of its recursive application yields balanced clusters. In this manner our tree provides a simple and practical solution for indexing clustered data and supports efficient retrieval of the nearest neighbors in high dimensional object…
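
The idea in the abstract (recursive EM clustering that yields balanced clusters, arranged as a tree for nearest-neighbor retrieval) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not say how EM was modified, so the balancing step here is an assumed stand-in that fits a two-component, unit-variance spherical Gaussian mixture and then cuts the points at the median responsibility. The names `em_two_way`, `build`, `nearest`, and `leaf_size`, and the centroid/radius pruning rule, are hypothetical choices made only for illustration.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def em_two_way(points, iters=20, seed=0):
    """Run EM for a 2-component spherical Gaussian mixture (assumed unit
    variance, equal weights); return each point's responsibility for
    component 0."""
    rng = random.Random(seed)
    means = [list(p) for p in rng.sample(points, 2)]
    resp = [0.5] * len(points)
    for _ in range(iters):
        # E-step: posterior of component 0, 1 / (1 + exp((d0 - d1) / 2)).
        for i, p in enumerate(points):
            d0 = sum((a - b) ** 2 for a, b in zip(p, means[0]))
            d1 = sum((a - b) ** 2 for a, b in zip(p, means[1]))
            resp[i] = 1.0 / (1.0 + math.exp(min(50.0, 0.5 * (d0 - d1))))
        # M-step: responsibility-weighted means.
        for k, w in enumerate((resp, [1.0 - r for r in resp])):
            total = sum(w) or 1.0
            means[k] = [sum(wi * p[j] for wi, p in zip(w, points)) / total
                        for j in range(len(points[0]))]
    return resp


def build(points, leaf_size=4):
    """Recursively split the points into two equal-sized clusters."""
    centroid = [sum(c) / len(points) for c in zip(*points)]
    radius = max(dist(centroid, p) for p in points)
    node = {"centroid": centroid, "radius": radius}
    if len(points) <= leaf_size:
        node["points"] = points
        return node
    resp = em_two_way(points)
    # Assumed balancing rule: rank by responsibility and cut at the median,
    # so both children receive (almost) the same number of points.
    order = sorted(range(len(points)), key=lambda i: resp[i], reverse=True)
    half = len(points) // 2
    left = [points[i] for i in order[:half]]
    right = [points[i] for i in order[half:]]
    node["children"] = [build(left, leaf_size), build(right, leaf_size)]
    return node


def nearest(node, q, best=None):
    """Nearest neighbor of q, pruning subtrees by centroid and radius."""
    if best is None:
        best = (math.inf, None)
    # Triangle inequality: no point in this subtree is closer to q than
    # dist(q, centroid) - radius.
    if dist(q, node["centroid"]) - node["radius"] >= best[0]:
        return best
    if "points" in node:
        for p in node["points"]:
            d = dist(q, p)
            if d < best[0]:
                best = (d, p)
        return best
    # Visit the child with the closer centroid first to tighten the bound.
    for child in sorted(node["children"],
                        key=lambda c: dist(q, c["centroid"])):
        best = nearest(child, q, best)
    return best
```

Calling `build(points)` on a list of equal-length tuples returns the root node, and `nearest(root, query)` returns a `(distance, point)` pair. Because every split is a median cut, the tree depth grows logarithmically with the number of points, which is the property a balanced index relies on.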

References

Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases
TLDR
An overview of the current state of the art in querying multimedia databases is provided, describing the index structures and algorithms for efficient query processing in high-dimensional spaces.
Content-Based Indexing of Multimedia Databases
  • Jian-Kang Wu
  • Computer Science
  • IEEE Trans. Knowl. Data Eng.
  • 1997
TLDR
ContIndex, the context-based indexing technique presented in this paper, is proposed to meet challenges and special requirements of content-based indexing and brings into the index the capability of self-organizing nodes with respect to certain context and frames of reference.
ClusterTree: Integration of Cluster Representation and Nearest-Neighbor Search for Large Data Sets with High Dimensions
TLDR
The ClusterTree provides a practical solution to index clustered data sets and supports the retrieval of the nearest-neighbors effectively without having to linearly scan the high-dimensional data set.
A Data Structure and an Algorithm for the Nearest Point Problem
TLDR
A tree structure for storing points from a normed space whose norm is effectively computable and an algorithm for finding the nearest point from the tree to a given query point is given.
M-tree: An Efficient Access Method for Similarity Search in Metric Spaces
TLDR
The results demonstrate that the M-tree indeed extends the domain of applicability beyond the traditional vector spaces, performs reasonably well in high-dimensional data spaces, and scales well in case of growing files.
Clustering large datasets in arbitrary metric spaces
TLDR
Two scalable algorithms designed for clustering very large datasets in distance spaces are presented, one of which is, to the authors' knowledge, the first scalable clustering algorithm for data in a distance space and the second improves upon BUBBLE by reducing the number of calls to the distance function, which may be computationally very expensive.
Data structures and algorithms for nearest neighbor search in general metric spaces
TLDR
The vp-tree (vantage point tree) is introduced in several forms, together with associated algorithms, as an improved method for these difficult search problems in general metric spaces.
Near Neighbor Search in Large Metric Spaces
TLDR
A data structure to solve the problem of finding approximate matches in a large database, called a GNAT (Geometric Near-neighbor Access Tree), is introduced based on the philosophy that the data structure should act as a hierarchical geometrical model of the data as opposed to a simple decomposition of the data that does not use its intrinsic geometry.
Searching in metric spaces
TLDR
Presents a unified view of all the known proposals to organize metric spaces, so that they can be understood under a common framework, and gives a quantitative definition of the elusive concept of "intrinsic dimensionality".
Proximity Matching Using Fixed-Queries Trees
TLDR
This work presents a new data structure, called the fixed-queries tree, for the problem of finding all elements of a fixed set that are close to a query element under some distance function.
...