The Web has been rapidly "deepened" by the prevalence of databases online. With the potentially unlimited information hidden behind their query interfaces, this "deep Web" of searchable databases is clearly an important frontier for data access. This paper surveys this relatively unexplored frontier, measuring characteristics pertinent to both exploring and ...
This paper introduces RankSQL, a system that provides a systematic and principled framework to support efficient evaluation of ranking (top-k) queries in relational database systems (RDBMS), by extending relational algebra and query optimization. Previously, top-k query processing has been studied in the middleware scenario or in RDBMS in a ...
We formulate and investigate the novel problem of finding the skyline k-tuple groups from an n-tuple dataset, that is, groups of k tuples which are not dominated by any other group of equal size under an aggregate-based group dominance relationship. The major technical challenge is to identify effective anti-monotonic properties for ...
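The definition above can be made concrete with a brute-force sketch. This is not the paper's algorithm: it simply enumerates every group of k tuples and keeps those whose aggregate vector is not dominated. The SUM aggregate, the "larger is better" convention, and the names `aggregate`, `dominates`, and `skyline_groups` are assumptions made for illustration.

```python
from itertools import combinations


def aggregate(group):
    """Sum each attribute over the tuples of a group (SUM is an assumed aggregate)."""
    return tuple(sum(values) for values in zip(*group))


def dominates(a, b):
    """a dominates b if it is at least as large everywhere and strictly larger somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline_groups(tuples, k):
    """Brute force: enumerate all k-tuple groups and keep the non-dominated ones."""
    groups = list(combinations(tuples, k))
    vectors = [aggregate(g) for g in groups]
    return [
        g for g, v in zip(groups, vectors)
        if not any(dominates(w, v) for w in vectors)
    ]


# Hypothetical two-attribute dataset with n = 4 tuples.
data = [(3, 1), (1, 3), (2, 2), (0, 0)]
result = skyline_groups(data, 2)
```

The enumeration visits every one of the C(n, k) groups, which is why pruning properties matter for anything beyond toy inputs.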
This paper presents a principled framework for efficient processing of ad-hoc top-k (ranking) aggregate queries, which provide the k groups with the highest aggregates as results. Essential support for such queries is lacking in current systems, which process the queries in a naïve materialize-group-sort scheme that can be prohibitively ...
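For reference, the naïve materialize-group-sort scheme mentioned above can be sketched as follows. This illustrates only the baseline, not the framework the paper proposes. The SUM aggregate, the `(group_key, value)` row layout, and the function name `topk_groups_naive` are assumptions.

```python
from collections import defaultdict


def topk_groups_naive(rows, k):
    """Materialize every group's aggregate, sort all groups, then keep the first k."""
    totals = defaultdict(int)
    for key, value in rows:
        # Group: accumulate the (assumed) SUM aggregate per group key.
        totals[key] += value
    # Sort: order every group by its aggregate, highest first.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical input rows of the form (group_key, value).
rows = [("a", 5), ("b", 3), ("a", 2), ("c", 9), ("b", 1)]
top2 = topk_groups_naive(rows, 2)
```

Every row is read and every group is fully aggregated before any result is returned, even when k is much smaller than the number of groups.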
This paper proposes Facetedpedia, a faceted retrieval system for information discovery and exploration in Wikipedia. Given the set of Wikipedia articles resulting from a keyword query, Facetedpedia generates a faceted interface for navigating the result articles. Compared with other faceted retrieval systems, Facetedpedia is fully automatic and dynamic in ...
The Boolean semantics of SQL queries cannot adequately capture the "fuzzy" preferences and "soft" criteria required in non-traditional data retrieval applications. One way to solve this problem is to add a flavor of "information retrieval" into database queries by allowing fuzzy query conditions and flexibly supporting grouping and ...
Wikipedia is the largest user-generated knowledge base. We propose a structured query mechanism, entity-relationship query, for searching entities in the Wikipedia corpus by their properties and inter-relationships. An entity-relationship query consists of an arbitrary number of predicates on desired entities. The semantics of each predicate is specified ...
We introduce EntityEngine, a system for answering entity-relationship queries over text. Such queries combine SQL-like structures with IR-style keyword constraints and can therefore be expressive and flexible in querying entities and their relationships. EntityEngine consists of various offline and online components, including a position-based ...