Benny Kimelfeld

Various known models of probabilistic XML can be represented as instantiations of abstract p-documents. Such documents have, in addition to ordinary nodes, distributional nodes that specify the probabilistic process of generating a random document. Within this abstraction, families of p-documents, which are natural extensions and combinations …
In keyword search over data graphs, an answer is a nonredundant subtree that includes the given keywords. An algorithm for enumerating answers is presented within an architecture that has two main components: an engine that generates a set of candidate answers and a ranker that evaluates their score. To be effective, the engine must have three …
Query evaluation over probabilistic XML is explored. The queries are twig patterns with projection, and the data is represented in terms of three models of probabilistic XML (that extend existing ones in the literature). The first model makes an assumption of independence among the probabilistic junctions, whereas the second model can encode probabilistic …
Evaluation of twig queries over probabilistic XML is investigated. Projection is allowed and, in particular, a query may be Boolean. It is shown that for a well-known model of probabilistic XML, the evaluation of twigs with projection is tractable under data complexity (whereas in other probabilistic data models, projection is intractable). Under …
Various approaches for keyword proximity search have been implemented in relational databases, XML and the Web. Yet, in all of them, an answer is a Q-fragment, namely, a subtree T of the given data graph G, such that T contains all the keywords of the query Q and has no proper subtree with this property. The rank of an …
Various known models of probabilistic XML can be represented as instantiations of the abstract notion of p-documents. In addition to ordinary nodes, p-documents have distributional nodes that specify the possible worlds and their probabilistic distribution. Particular families of p-documents are determined by the types of distributional nodes that can be …
Constraints are important not just for maintaining data integrity, but also because they capture natural probabilistic dependencies among data items. A probabilistic XML database (PXDB) is the probability sub-space comprising the instances of a p-document that satisfy a set of constraints. In contrast to existing models that can express …
A framework for describing semantic relationships among nodes in XML documents is presented. In contrast to earlier work, the XML documents may have ID references (i.e., they correspond to graphs and not just trees). A specific interconnection semantics in this framework can be defined explicitly or derived automatically. The main advantage of …