This paper investigates the discovery of conditional functional dependencies (CFDs). CFDs extend functional dependencies (FDs) by supporting patterns of semantically related constants, and can be used as rules for cleaning relational data. However, finding quality CFDs is an expensive process that involves intensive manual effort. To ...
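To make the notion of a CFD concrete, the following minimal sketch checks a small relation against a single CFD. It illustrates validation only, not the discovery algorithms the paper studies. The function names, the `_` wildcard convention, and the sample rows are all assumptions made for illustration.

```python
from itertools import combinations

WILDCARD = "_"  # hypothetical marker for "any value" in a pattern


def matches(value, pattern):
    """A value matches a pattern cell if the cell is a wildcard or the same constant."""
    return pattern == WILDCARD or value == pattern


def cfd_violations(rows, lhs, rhs, pattern):
    """Check the CFD (lhs -> rhs, pattern) over a list of dict rows.

    Returns (single, pairs):
      single: indices of rows that match the LHS pattern but not the RHS pattern
      pairs:  index pairs that match the LHS pattern, agree on lhs, and differ on rhs
    """
    # Only rows matching the LHS pattern fall under the condition of the CFD.
    scope = [i for i, r in enumerate(rows)
             if all(matches(r[a], pattern[a]) for a in lhs)]
    single = [i for i in scope if not matches(rows[i][rhs], pattern[rhs])]
    pairs = [(i, j) for i, j in combinations(scope, 2)
             if all(rows[i][a] == rows[j][a] for a in lhs)
             and rows[i][rhs] != rows[j][rhs]]
    return single, pairs


# Illustrative data: country code, area code, city.
rows = [
    {"CC": "44", "AC": "131", "city": "EDI"},
    {"CC": "44", "AC": "131", "city": "LDN"},
    {"CC": "01", "AC": "908", "city": "MH"},
    {"CC": "01", "AC": "908", "city": "NYC"},
]

# Constant pattern: if CC = 44 and AC = 131, then city must be EDI.
constant_result = cfd_violations(
    rows, ["CC", "AC"], "city", {"CC": "44", "AC": "131", "city": "EDI"})

# Variable pattern: among rows with CC = 01, AC determines city.
variable_result = cfd_violations(
    rows, ["CC", "AC"], "city", {"CC": "01", "AC": WILDCARD, "city": WILDCARD})
```

With a constant on the right-hand side, a single row can violate the rule on its own. With a wildcard there, violations only show up as pairs of rows, as with an ordinary FD.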
Graph pattern matching is typically defined in terms of subgraph isomorphism, which makes it an NP-complete problem. Moreover, it requires bijective functions, which are often too restrictive to characterize patterns in emerging applications. We propose a class of graph patterns, in which an edge denotes the connectivity in a data graph within a predefined ...
The database research community has recently recognized the usefulness of skyline queries. As an extension of existing database operators, the skyline query is valuable for multi-criteria decision making. However, current research tends to assume that the skyline operator is applied to a single table, which does not hold for many applications on web databases. In ...
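For readers unfamiliar with the skyline operator, the sketch below computes the skyline of a single in-memory table with a naive quadratic scan. It assumes that smaller values are better in every dimension. The function names and the hotel data are illustrative assumptions, and the multi-table setting the abstract alludes to is not covered.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every dimension and strictly better in one.

    Assumes smaller is better in all dimensions.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points):
    """Return the points not dominated by any other point (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative data: (price, distance to beach) for five hotels.
hotels = [(50, 8), (60, 2), (80, 1), (70, 5), (55, 9)]
result = skyline(hotels)
```

Here `(70, 5)` is dominated by `(60, 2)` and `(55, 9)` by `(50, 8)`, so `result` keeps only the three remaining hotels, each of which is a best trade-off for some preference between price and distance.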
Data aggregation is an essential operation in wireless sensor network applications. This paper focuses on the data aggregation scheduling problem. Based on maximal independent sets, a distributed algorithm to generate a collision-free schedule for data aggregation in wireless sensor networks is proposed. The time latency of the aggregation schedule ...
Providing scalable database services is an essential requirement for extending many existing applications of the Cloud platform. Due to the diversity of applications, database services on the Cloud must support both large-scale data analytical jobs and highly concurrent OLTP queries. Most existing work focuses on one specific type of application. To provide an ...
In this paper, we propose four peer-to-peer models for content-based music information retrieval (CBMIR) and carefully evaluate them, both qualitatively and quantitatively, on network load, retrieval time, system update, and robustness. We also present an algorithm to improve the speed of CBP2PMIR and a simple but effective method to filter out replicas in ...
This paper investigates constraints for matching records from unreliable data sources. (a) We introduce a class of matching dependencies (MDs) for specifying the semantics of unreliable data. As opposed to static constraints for schema design, MDs are developed for record matching, and are defined in terms of similarity predicates and a dynamic semantics. ...