Omar Benjelloun

Learn More
This paper introduces ULDBs, an extension of relational databases with simple yet expressive constructs for representing and manipulating both lineage and uncertainty. Uncertain data and data lineage are two important areas of data management that have been considered extensively in isolation, however many applications require the features in tandem.(More)
This chapter covers the Trio database management system. Trio is a robust prototype that supports uncertain data and data lineage, along with the standard features of a relational DBMS. Trio’s new ULDB data model is an extension to the relational model capturing various types of uncertainty along with data lineage, and its TriQL query language extends SQL(More)
We consider the entity resolution (ER) problem (also known as deduplication, or merge–purge), in which records determined to represent the same real-world entity are successively located and merged. We formalize the generic ER problem, treating the functions for comparing and merging records as black-boxes, which permits expressive and extensible ER(More)
This paper explores an inherent tension in modeling and querying uncertain data: simple, intuitive representations of uncertain data capture many application requirements, but these representations are generally incomplete―standard operations over the data may result in unrepresentable types of uncertainty. Complete models are theoretically(More)
This paper introduces uldbs, an extension of relational databases with simple yet expressive constructs for representing and manipulating both lineage and uncertainty. Uncertain data and data lineage are two important areas of data management that have been considered extensively in isolation, however many applications require the features in tandem.(More)
In this paper, we study query evaluation on Active XML documents (AXML for short), a new generation of XML documents that has recently gained popularity. AXML documents are XML documents whose content is given partly extensionally, by explicit data elements, and partly intensionally, by embedded calls to Web services, which can be invoked to generate data.A(More)
This paper provides an overview of the Active XML project developed at INRIA over the past five years. Active XML (AXML, for short), is a declarative framework that harnesses Web services for distributed data management, and is put to work in a peer-to-peer architecture. The model is based on AXML documents, which are XML documents that may contain embedded(More)
We propose a peer based architecture that al lows for the integration of distributed data and web services It relies on a language Ac tive XML where documents embed calls to web services that are used to enrich them and new web services may be de ned by XQuery queries on such active documents Embedding calls to functions or even to web services inside data(More)
XML is becoming the universal format for data exchange between applications. Recently, the emergence of Web services as standard means of publishing and accessing data on the Web introduced a new class of XML documents, which we call <i>intensional</i> documents. These are XML documents where some of the data is given explicitly while other parts are(More)