Data mining and information retrieval

  • Selwyn Piramuthu, H. Michael Chung
  • Published 7 August 2002
  • Computer Science
  • Proceedings of the 36th Annual Hawaii International Conference on System Sciences, 2003
The minitrack covers broad theory and application issues related to data mining, machine learning, knowledge acquisition, knowledge discovery, information retrieval, databases, and inductive decision making. Both structured and unstructured data repositories, including human expert decisions, environmental/normative datasets, large document collections, and web databases, are considered. Theoretical and methodological exploration in the previous years motivates us to further investigate the…
Research on knowledge retrieval by leveraging data mining techniques
  • Yan Hao, Yu-feng Zhang
  • Computer Science
    2010 International Conference on Future Information Technology and Management Engineering
  • 2010
In this model, data mining is integrated into the whole retrieval procedure (query optimization, searching, result analysis, and resource construction), which significantly improves the level and efficiency of knowledge retrieval.
Privacy-preserving association rule mining based on electronic medical system
A new PPDARM scheme with fewer interactions is proposed to avoid the shortcomings of the scheme of Domadiya et al., using the homomorphic properties of the distributed Paillier cryptosystem to accomplish the cooperative computation.
Data Mining and Information Retrieval in the 21st century: A bibliographic review


Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases
An overview of the current state of the art in querying multimedia databases is provided, describing the index structures and algorithms for an efficient query processing in high-dimensional spaces.
Efficient Similarity Search In Sequence Databases
An indexing method for time sequences is proposed that uses R*-trees to index the sequences and efficiently answer similarity queries; experimental results show that the method is superior to search based on sequential scanning.
Some approaches to best-match file searching
Three file structures are presented together with their corresponding search algorithms, which are intended to reduce the number of comparisons required to achieve the desired result.
Using Signature Files for Querying Time-Series Data
The principal idea of the proposed time-series data indexing method is to encode the shape of time series into an alphabet of characters and then to treat them as text.
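The idea of turning shape into text can be illustrated with a minimal sketch. The three-letter alphabet (`u`, `d`, `s`), the `flat` threshold, and the function name `encode_shape` are assumptions made for illustration only; the paper's actual alphabet and signature-file layout are not given in this summary.

```python
def encode_shape(series, flat=0.5):
    """Encode each step of a series as 'u' (up), 'd' (down) or 's' (steady).

    Hypothetical encoding: a step counts as steady when the absolute
    change does not exceed `flat`.
    """
    symbols = []
    for prev, cur in zip(series, series[1:]):
        delta = cur - prev
        if delta > flat:
            symbols.append("u")
        elif delta < -flat:
            symbols.append("d")
        else:
            symbols.append("s")
    return "".join(symbols)


# Once encoded, a shape query becomes an ordinary substring search.
stored = encode_shape([0, 1, 3, 3, 1, 1])
pattern = encode_shape([1, 3, 3, 1])  # rise, plateau, fall
found = pattern in stored
```

Because the encoded series is plain text, any text index (a signature file in the paper's case) could in principle be used to filter candidates before the raw series are compared.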
Proximity Matching Using Fixed-Queries Trees
This work presents a new data structure, called the fixed-queries tree, for the problem of finding all elements of a fixed set that are close to a query element under some distance function.
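As a rough sketch of how such a tree might work, the code below assumes a discrete metric (Hamming distance on equal-length strings) and a fixed list of pivots, one per level, so that every node on the same level compares against the same pivot. The class name `FQTree`, the `bucket_size` parameter, and the choice of pivots are illustrative assumptions, not details taken from the paper.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


class FQTree:
    """Tree in which all nodes at depth i branch on distance to pivots[i]."""

    def __init__(self, items, pivots, dist, bucket_size=2, level=0):
        self.dist = dist
        self.pivots = pivots
        self.level = level
        self.children = {}   # distance to this level's pivot -> subtree
        self.bucket = None   # set only on leaves
        if len(items) <= bucket_size or level == len(pivots):
            self.bucket = list(items)
            return
        groups = {}
        for x in items:
            groups.setdefault(dist(pivots[level], x), []).append(x)
        for d, group in groups.items():
            self.children[d] = FQTree(group, pivots, dist, bucket_size, level + 1)

    def range_search(self, q, r, qdists=None):
        """Return all stored items within distance r of q."""
        if qdists is None:
            # One distance per level, reused across every node of that level.
            qdists = [self.dist(p, q) for p in self.pivots]
        if self.bucket is not None:
            return [x for x in self.bucket if self.dist(q, x) <= r]
        dq = qdists[self.level]
        out = []
        for d, child in self.children.items():
            # Triangle inequality: |d(p, x) - d(p, q)| > r implies d(q, x) > r.
            if abs(d - dq) <= r:
                out.extend(child.range_search(q, r, qdists))
        return out


words = ["cat", "cot", "cut", "dog", "dot", "dig"]
tree = FQTree(words, pivots=["cat", "dog"], dist=hamming)
close = sorted(tree.range_search("cat", 1))
```

The point of fixing the pivot per level is that the query needs at most one distance computation per level, regardless of how many branches are followed.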
Using Dynamic Time Warping to Find Patterns in Time Series
Preliminary experiments with a dynamic programming approach to pattern detection in databases, based on the dynamic time warping technique used in the speech recognition field, are described.
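For reference, the textbook dynamic programming recurrence behind dynamic time warping can be sketched as follows. This is the standard unconstrained formulation with an absolute-difference step cost, chosen here as an assumption; the paper's own cost function and any windowing constraints are not stated in this summary.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two numeric sequences.

    Runs in O(len(a) * len(b)) time using the classic recurrence.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats its element
                cost[i][j - 1],      # b advances, a repeats its element
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Two series that differ only by local stretching, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align at zero cost, which is what makes the measure useful for pattern detection across varying speeds.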
Boosting Interval-Based Literals: Variable Length and Early Classification
A system for supervised time series classification is presented that can learn from series of different lengths and can provide a classification when only part of a series is presented to the classifier, so it can be used to identify partial time series.
Pivot selection techniques for proximity searching in metric spaces
An efficiency measure for comparing two pivot sets is proposed, combined with an optimization technique that allows good sets of pivots to be selected; it is shown that good pivot sets are outliers, but that selecting outliers does not ensure that good pivots are selected.
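One plausible way to compare pivot sets, sketched below, is to score each set by the mean of the pivot-based lower bound over sample pairs: the tighter the bound, the more candidates a search can discard. Whether this matches the paper's exact measure is an assumption; the function names and the one-dimensional toy data are hypothetical.

```python
def pivot_lower_bound(x, y, pivots, dist):
    """Lower bound on dist(x, y) implied by the triangle inequality."""
    return max(abs(dist(x, p) - dist(y, p)) for p in pivots)


def mean_lower_bound(pivots, pairs, dist):
    """Average lower bound over sample pairs; higher means better pruning."""
    total = sum(pivot_lower_bound(x, y, pivots, dist) for x, y in pairs)
    return total / len(pairs)


def abs_dist(a, b):
    """Toy metric: absolute difference on the real line."""
    return abs(a - b)


pairs = [(0, 10), (2, 8)]
score_outlier = mean_lower_bound([0], pairs, abs_dist)  # pivot at the edge
score_central = mean_lower_bound([5], pairs, abs_dist)  # pivot in the middle
```

In this toy case the edge pivot recovers the true distances exactly, while the central pivot cannot distinguish either pair, which is consistent with the observation that good pivots tend to be outliers.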
Indexing spatio-temporal trajectories with Chebyshev polynomials
Chebyshev polynomials are explored as a basis for approximating and indexing d-dimensional trajectories. The key analytic result is the Lower Bounding Lemma, which shows that the Euclidean distance between two d-dimensional trajectories is lower bounded by the weighted Euclidean distance between the two vectors of Chebyshev coefficients.
An optimal algorithm for approximate nearest neighbor searching in fixed dimensions
It is shown that a set S of data points in real d-dimensional space can be preprocessed in O(kd) time and in additional space so that, given a query point q, the closest point of S to q can be reported quickly.