Timely and cost-effective analytics over “Big Data” is now a key ingredient for success in many businesses, scientific and engineering disciplines, and government endeavors. The Hadoop software…
We present a method of generating test sequences for concurrent programs and communication protocols that are modeled as communicating nondeterministic finite state machines (CNFSMs). A conformance…
Many modern software systems provide progress indicators for long-running tasks. These progress indicators make systems more user-friendly by helping the user quickly estimate how much of the task…
Extracting named entities in text and linking extracted names to a given knowledge base are fundamental tasks in applications for text understanding. Existing systems typically run a named entity…
21st International Conference on Data Engineering…
2005
Recently, progress indicators have been proposed for long-running SQL queries in RDBMSs. Although the proposed techniques work well for a subset of SQL queries, they are preliminary in the sense that…
In a document streaming environment, online detection of the first documents that mention previously unseen events is an open challenge. For this online new event detection (ONED) task, existing…
In this paper, the fitness of estimating vessel profiles with Gaussian function is evaluated and an amplitude-modified second-order Gaussian filter is proposed for the detection and measurement of…
Recently, Haas and Hellerstein proposed the hash ripple join algorithm in the context of online aggregation. Although the algorithm rapidly gives a good estimate for many join-aggregate problem…