Learn More
Large-scale data analysis lies in the core of modern enterprises and scientific research. With the emergence of cloud computing, the use of an analytical query processing infrastructure (e.g., Amazon EC2) can be directly mapped to monetary value. MapReduce has been a popular framework in the context of cloud computing, designed to serve long running queries(More)
Large-scale data analysis lies in the core of modern enterprises and scientific research. With the emergence of cloud computing, the use of an analytical query processing infrastructure can be directly associated with monetary cost. MapReduce has been a popular framework in the context of cloud computing, designed to serve long-running queries (jobs) which(More)
Word meaning disambiguation has always been an important problem in many computer science tasks, such as information retrieval and extraction. One of the problems, faced in automatic word sense discovery, is the number of different senses a word can have. Often, senses are dominated by some other, more frequent ones. Discovering such dominated meanings can(More)
Similarity search over time-series data is a useful, but expensive , application. Sequence data can be transformed via an orthonormal transformation, which is then dimensionally reduced, to be indexed. R-trees and R*-trees have been used for this purpose. These do not scale well, however, for very large datasets. We propose a new indexing data-structure(More)
  • 1