Graph pattern matching is typically defined in terms of subgraph isomorphism, which makes it an NP-complete problem. Moreover, it requires bijective functions, which are often too restrictive to characterize patterns in emerging applications. We propose a class of graph patterns, in which an edge denotes the connectivity in a data graph within a predefined…
Central to a data cleaning system are record matching and data repairing. Matching aims to identify tuples that refer to the same real-world object, and repairing is to make a database consistent by fixing errors in the data by using constraints. These are treated as separate processes in current data cleaning systems, based on heuristic solutions. This…
In a previous study, a quorum-sensing signaling system essential for genetic competence in Streptococcus mutans was identified, characterized, and found to function optimally in biofilms (Li et al., J. Bacteriol. 183:897-908, 2001). Here, we demonstrate that this system also plays a role in the ability of S. mutans to initiate biofilm formation. To test…
We deleted the hypoxia-responsive transcription factor HIF-1alpha in endothelial cells (EC) to determine its role during neovascularization. We found that loss of HIF-1alpha inhibits a number of important parameters of EC behavior during angiogenesis: these include proliferation, chemotaxis, extracellular matrix penetration, and wound healing. Most…
Despite the increasing importance of data quality and the rich theoretical and practical contributions in all aspects of data cleaning, there is no single end-to-end off-the-shelf solution to (semi-)automate the detection and the repairing of violations w.r.t. a set of heterogeneous and ad-hoc quality constraints. In short, there is no commodity platform…
A variety of integrity constraints have been studied for data cleaning. While these constraints can detect the presence of errors, they fall short of guiding us to correct the errors. Indeed, data repairing based on these constraints may not find certain fixes that are guaranteed correct, and worse still, may even introduce new errors when attempting to…
It is increasingly common to find graphs in which edges are of different types, indicating a variety of relationships. For such graphs we propose a class of reachability queries and a class of graph patterns, in which an edge is specified with a regular expression of a certain form, expressing the connectivity of a data graph via edges of various types. In…
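To make the idea of a reachability query over typed edges concrete, here is a minimal Python sketch. The abstract does not say which form of regular expression is allowed, so the form used here is purely an assumption for illustration: a concatenation of blocks, each naming one edge type and an optional hop bound, where every block must match at least one edge. The function name `reachable` and the tuple encoding of edges and patterns are hypothetical, not taken from the paper.

```python
from collections import deque


def reachable(edges, source, target, pattern):
    """Check whether target is reachable from source along a typed path.

    edges:   iterable of (u, edge_type, v) triples (directed).
    pattern: list of (edge_type, max_hops) blocks; max_hops=None means
             "one or more edges". This block form is an assumed
             simplification, not the paper's exact syntax.
    """
    if not pattern:
        return source == target

    adjacency = {}
    for u, edge_type, v in edges:
        adjacency.setdefault(u, []).append((edge_type, v))

    # A search state is (node, index of current block, hops used in it).
    start = (source, 0, 0)
    seen = {start}
    queue = deque([start])
    while queue:
        node, i, hops = queue.popleft()
        # Accept once the last block has matched at least one edge.
        if node == target and i == len(pattern) - 1 and hops >= 1:
            return True
        block_type, bound = pattern[i]
        for edge_type, nxt in adjacency.get(node, []):
            candidates = []
            # Stay in the current block if the bound allows another hop.
            if edge_type == block_type and (bound is None or hops < bound):
                # For unbounded blocks only "at least one hop" matters,
                # so the counter is capped at 1 to keep states finite.
                candidates.append((nxt, i, 1 if bound is None else hops + 1))
            # Advance to the next block once this one has matched an edge.
            if hops >= 1 and i + 1 < len(pattern) and edge_type == pattern[i + 1][0]:
                candidates.append((nxt, i + 1, 1))
            for state in candidates:
                if state not in seen:
                    seen.add(state)
                    queue.append(state)
    return False


example_edges = [
    ("a", "friend", "b"),
    ("b", "friend", "c"),
    ("c", "colleague", "d"),
]
print(reachable(example_edges, "a", "d", [("friend", 2), ("colleague", 1)]))
```

The search is a breadth-first traversal over (node, block, hops) states, so each state is visited at most once. A real system would likely precompute indexes rather than search per query; this sketch only shows what such a query asks.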
We study the problem of answering XPath queries using multiple materialized views. Despite the efforts on answering queries using a single materialized view, answering queries using multiple views remains relatively new. We address two important aspects of this problem: multiple-view selection and equivalent multiple-view rewriting. With regard to the first…
This paper investigates the problem of incremental detection of errors in distributed data. Given a distributed database D, a set Σ of conditional functional dependencies (CFDs), the set V of violations of the CFDs in D, and updates ΔD to D, the goal is to find, with minimum data shipment, changes ΔV to V in response to ΔD. The need for the…
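The incremental setting described above can be illustrated with a small Python sketch. It is not the paper's algorithm: it runs on a single site, handles insertions only, covers one CFD, and ignores data shipment entirely. The class name `CFDIndex`, the use of `"_"` as a wildcard, and the rule that a constant on the right-hand side flags only tuples whose own value differs from it are all assumptions made for illustration.

```python
from collections import defaultdict


class CFDIndex:
    """Tracks violations of one CFD (lhs -> rhs, pattern) under insertions.

    pattern maps attribute names to constants; attributes that are missing
    or mapped to "_" are wildcards. Hypothetical encoding for illustration.
    """

    def __init__(self, lhs, rhs, pattern):
        self.lhs = lhs
        self.rhs = rhs
        self.pattern = pattern
        # Maps each combination of lhs values to {tuple id: rhs value}.
        self.groups = defaultdict(dict)
        self.violations = set()  # the current violation set V

    def _applies(self, row):
        # The CFD constrains a row only if it matches the lhs constants.
        return all(self.pattern.get(a, "_") in ("_", row[a]) for a in self.lhs)

    def insert(self, tid, row):
        """Insert one tuple and return the newly violating tuple ids."""
        if not self._applies(row):
            return set()
        key = tuple(row[a] for a in self.lhs)
        group = self.groups[key]
        group[tid] = row[self.rhs]

        constant = self.pattern.get(self.rhs, "_")
        if constant != "_":
            # Constant rhs: a tuple violates if its value differs from it.
            bad = {t for t, value in group.items() if value != constant}
        elif len(set(group.values())) > 1:
            # Wildcard rhs: tuples agreeing on lhs must agree on rhs.
            bad = set(group)
        else:
            bad = set()

        delta = bad - self.violations  # only the change to V is reported
        self.violations |= delta
        return delta


cfd = CFDIndex(["country", "zip"], "city", {"country": "UK"})
cfd.insert(1, {"country": "UK", "zip": "EH8", "city": "Edinburgh"})
print(cfd.insert(2, {"country": "UK", "zip": "EH8", "city": "London"}))
```

Only the group that shares the inserted tuple's left-hand-side values is inspected, which is the point of working incrementally rather than rescanning D. Deletions, multiple CFDs, and partitioned data would all need additional bookkeeping.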
The basic idea behind parallel database systems is to perform operations in parallel to reduce the response time and improve the system throughput. Data placement is a key factor in the overall performance of parallel systems. Because XML is semistructured data, traditional data placement strategies cannot serve it well. In this paper, we present the concept of…