It is infeasible for a sensor database to contain the exact value of each sensor at all points in time. This uncertainty is inherent in these systems due to measurement and sampling errors, and resource limitations. In order to avoid drawing erroneous conclusions based upon stale data, the use of uncertainty intervals that model each data item as a range...
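As a minimal sketch of the interval idea described above, the following Python models each sensor reading as a range and answers a range query with a probability threshold. The uniform distribution inside each interval, and the names `UncertainValue`, `prob_in_range`, and `probabilistic_range_query`, are assumptions made for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainValue:
    """A sensor reading modelled as a closed interval [lo, hi] (hypothetical type)."""
    lo: float
    hi: float

    def prob_in_range(self, a: float, b: float) -> float:
        """Probability that the true value lies in [a, b].

        Assumes the value is uniformly distributed over [lo, hi];
        the source does not state which pdf is used.
        """
        if self.hi == self.lo:
            # Degenerate interval: the value is known exactly.
            return 1.0 if a <= self.lo <= b else 0.0
        overlap = min(self.hi, b) - max(self.lo, a)
        return max(0.0, overlap) / (self.hi - self.lo)


def probabilistic_range_query(readings, a, b, threshold):
    """Return ids of sensors whose probability of lying in [a, b] meets the threshold."""
    return [sid for sid, r in readings.items()
            if r.prob_in_range(a, b) >= threshold]


readings = {
    "s1": UncertainValue(20.0, 30.0),
    "s2": UncertainValue(28.0, 40.0),
    "s3": UncertainValue(25.0, 25.0),
}
print(probabilistic_range_query(readings, 22.0, 32.0, 0.5))
```

A reading that only partly overlaps the query range is reported only when the overlapping fraction of its interval reaches the threshold, which is one simple way to avoid treating a stale value as exact.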
Given a set D = {d1, d2, ..., dD} of D strings of total length n, our task is to report the "most relevant" strings for a given query pattern P. This involves somewhat more advanced query functionality than the usual pattern matching, as some notion of "most relevant" is involved. In information retrieval literature, this task is best achieved...
We analyze an architecture based on mobility to address the problem of energy-efficient data collection in a sensor network. Our approach exploits mobile nodes present in the sensor field as forwarding agents. As a mobile node moves in close proximity to sensors, data is transferred to the mobile node for later depositing at the destination. We present an...
Ranking is an important property that needs to be fully supported by current relational query engines. Recently, several rank-join query operators have been proposed based on rank aggregation algorithms. Rank-join operators progressively rank the join results while performing the join operation. The new operators have a direct impact on traditional query...
Let D = {d1, d2, ..., dD} be a given set of D string documents of total length n. Our task is to index D such that the k most relevant documents for an online query pattern P of length p can be retrieved efficiently. We propose an index of size |CSA| + n log D(2 + o(1)) bits and O(ts(p) + k log log n + poly log log n) query time for the basic relevance metric...
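To make the query itself concrete, here is a naive Python baseline for top-k document retrieval. It scans every document, so it has none of the space or time guarantees of the proposed index. Using the number of occurrences of P in a document as the relevance score is an assumption for illustration, and the names `count_occurrences` and `top_k_documents` are hypothetical.

```python
import heapq


def count_occurrences(text, pattern):
    """Count (possibly overlapping) occurrences of a non-empty pattern in text."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def top_k_documents(docs, pattern, k):
    """Return up to k (doc_index, frequency) pairs, most frequent first.

    Relevance is assumed to be term frequency; ties are broken by
    the smaller document index. Documents without a match are skipped.
    """
    hits = []
    for i, doc in enumerate(docs):
        freq = count_occurrences(doc, pattern)
        if freq > 0:
            hits.append((freq, i))
    best = heapq.nsmallest(k, hits, key=lambda t: (-t[0], t[1]))
    return [(i, freq) for freq, i in best]


docs = ["banana", "bandana", "cabana", "apple"]
print(top_k_documents(docs, "ana", 2))
```

The scan costs time proportional to the total length n for every query, which is the cost a dedicated index is meant to avoid.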
... the cleansed value directly is highly desirable. Data cleansing applications often result in uncertainty in the "cleaned" value of an attribute. Many cleansing tools provide alternative corrections with associated likelihood. Uncertainty in categorical data is commonplace in many applications, including data cleaning, database integration, and biological...
We introduce a new variant of the popular Burrows-Wheeler transform (BWT) called Geometric Burrows-Wheeler Transform (GBWT). Unlike BWT, which merely permutes the text, GBWT converts the text into a set of points in 2-dimensional geometry. Using this transform, we can answer many open questions in compressed text indexing: (1) Can compressed data...
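For readers unfamiliar with the baseline, the sketch below shows the standard BWT (not the GBWT) and illustrates the remark that it merely permutes the text. It uses the textbook sorted-rotations construction, which is far from space-efficient; the `$` sentinel and the function names `bwt` and `inverse_bwt` are choices made for this example.

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform via sorted rotations (quadratic space, for illustration)."""
    if sentinel in text:
        raise ValueError("text must not contain the sentinel")
    s = text + sentinel
    # Sort all cyclic rotations; the sentinel sorts before lowercase letters in ASCII.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation matrix.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, sentinel="$"):
    """Invert the transform by repeatedly prepending the last column and re-sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The original text is the rotation that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


transformed = bwt("banana")
print(transformed)
print(inverse_bwt(transformed))
```

The output of `bwt` contains exactly the characters of the input plus the sentinel, only reordered, which is what makes the transform invertible.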
Orion is a state-of-the-art uncertain database management system with built-in support for probabilistic data as first-class data types. In contrast to other uncertain databases, Orion supports both attribute and tuple uncertainty with arbitrary correlations. This enables the database engine to handle both discrete and continuous pdfs in a natural and...
Rank-aware query processing has emerged as a key requirement in modern applications. In these applications, efficient and adaptive evaluation of top-k queries is an integral part of the application semantics. In this article, we introduce a rank-aware query optimization framework that fully integrates rank-join operators into relational query...
A new trend in the field of pattern matching is to design indexing data structures that take space very close to that required by the indexed text (in entropy-compressed form) while also achieving good query performance. Two popular indexes, namely the FM-index [Ferragina and Manzini 2005] and the CSA [Grossi and Vitter 2005], achieve this goal...