• Corpus ID: 14179183

Building and Maintaining Halls of Fame over a Database

Foteini Alvanaki, Sebastian Michel, Aleksandar Stupar
Halls of Fame are fascinating constructs. They represent the elite of an often very large number of entities: persons, companies, products, countries, etc. Beyond their practical use as static rankings, changes to them are particularly interesting, whether for decision-making processes, as input to common media or novel narrative science applications, or simply for consumption by users. In this work, we aim at detecting events that can be characterized by changes to a Hall of Fame ranking in an automated way… 
Interesting event detection through hall of fame rankings
This work studies the characteristics of entity rankings based on a set of rankings obtained from a popular Web portal and presents an approach, coined Pantheon, that maintains sets of top-k rankings and reports identified changes in a way that appeals to users.
Mining Entity Rankings
This paper proposes models, algorithms, and implementation details of an approach that extracts the most relevant entity rankings from large datasets in a fully automated way, and presents an overall scoring model to assess the meaningfulness of a ranking.


Automatic discovery of attributes in relational databases
This work designs algorithms for clustering relational columns into attributes based on the common properties and characteristics of the values they contain, and introduces data oriented solutions that use statistical measures to identify strong relationships between the values of a set of columns.
An efficient strategy for mining exceptions in multi-databases
Continuous monitoring of top-k queries over sliding windows
This paper presents two processing techniques: the first one computes the new answer of a query whenever some of the current top-k points expire; the second one partially pre-computes the future changes in the result, achieving better running time at the expense of slightly higher space requirements.
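The first technique in the summary above (recompute the answer when a current top-k point expires) can be illustrated with a minimal sketch. This is not the paper's algorithm or data structure: the class name `SlidingTopK`, the time-based window, and the assumption that timestamps arrive in non-decreasing order are all hypothetical choices made for illustration.

```python
from collections import deque
import heapq


class SlidingTopK:
    """Top-k scores over a time-based sliding window (illustrative sketch).

    Assumes timestamps passed to add() are non-decreasing.
    """

    def __init__(self, k, window):
        self.k = k
        self.window = window
        self.points = deque()      # (arrival_time, score), oldest first
        self.result = []           # current top-k as (score, arrival_time), best first
        self.recomputations = 0    # how often the whole window was rescanned

    def _recompute(self):
        # Full rescan of the window; only needed after a top-k point expired.
        self.recomputations += 1
        self.result = heapq.nlargest(self.k, ((s, t) for t, s in self.points))

    def add(self, t, score):
        # Drop points that fell out of the window and note whether any of
        # them belonged to the current result.
        expired_hit = False
        while self.points and self.points[0][0] <= t - self.window:
            old_t, old_s = self.points.popleft()
            if (old_s, old_t) in self.result:
                expired_hit = True

        self.points.append((t, score))

        if expired_hit:
            self._recompute()
        elif len(self.result) < self.k or (score, t) > self.result[-1]:
            # Cheap incremental path: the new point enters the result.
            self.result.append((score, t))
            self.result.sort(reverse=True)
            del self.result[self.k:]
        return [s for s, _ in self.result]
```

An arrival that does not displace a result point costs no rescan; the rescan counter grows only when an expiring point was part of the reported top-k, which is the cost the summary's second, pre-computing technique is said to reduce.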
Top-k query processing in probabilistic databases with non-materialized views
This work is the first to address integrated data and confidence computations for intensional query evaluations in the context of probabilistic databases by considering confidence bounds over first-order lineage formulas and extends query processing techniques by a tool-suite of scheduling strategies based on selectivity estimation.
Maintenance of top-k materialized views
A principled method is provided that complements the inefficiency of the state of the art independently of the statistical properties of the data and the characteristics of the update streams and provides theoretical guarantees for the nucleation of a view with respect to another view and the reflection of this property to the management of updates.
Efficient Evaluation of Continuous Text Search Queries
This paper proposes the first solution for processing continuous text queries efficiently, indexes the streamed documents in main memory with a structure based on the principles of the inverted file, and processes document arrival and expiration events with an incremental threshold-based method.
Query Relaxation for Entity-Relationship Search
This paper describes comprehensive methods to relax SPARQL-like triple-pattern queries in a fully automated manner and produces a set of relaxations by means of statistical language models for structured RDF data and queries.
Efficient maintenance of materialized top-k views
This work proposes an algorithm that reduces the frequency of refills by maintaining a top-k' view instead of a top-k view, where k' changes at runtime between k and some k_max ≥ k, and shows that in most practical cases, the algorithm can reduce the expected amortized cost of refill queries to O(1) while still keeping the view small.
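The idea of keeping a slack of extra entries so that deletions rarely force a refill can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: the class `TopKView`, the in-memory dictionary standing in for the base table, the counter `base_scans`, and the policy of always refilling up to `k_max` are hypothetical, and keys are assumed to be unique with `insert` only called for new keys.

```python
import heapq


class TopKView:
    """Materialized top-k' view with k <= k' <= k_max (illustrative sketch)."""

    def __init__(self, base, k, k_max):
        self.base = dict(base)   # key -> score; stands in for the base table
        self.k = k
        self.k_max = k_max
        self.base_scans = 0      # counts expensive queries against the base table
        self._refill()

    def _refill(self):
        # Expensive step: scan the base table and keep the k_max best tuples.
        self.base_scans += 1
        best = heapq.nlargest(self.k_max, self.base.items(), key=lambda kv: kv[1])
        self.view = dict(best)

    def insert(self, key, score):
        # If the view already holds every base tuple, any new tuple may join it.
        covers_all = len(self.view) == len(self.base)
        self.base[key] = score
        lowest = min(self.view.values(), default=None)
        if lowest is None or score > lowest or (covers_all and len(self.view) < self.k_max):
            self.view[key] = score
            if len(self.view) > self.k_max:
                # Evict the lowest-scored entry to keep the view small.
                worst = min(self.view, key=self.view.get)
                del self.view[worst]

    def delete(self, key):
        self.base.pop(key, None)
        self.view.pop(key, None)
        # Refill only when the slack is used up and the base has more tuples.
        if len(self.view) < self.k and len(self.view) < len(self.base):
            self._refill()

    def top_k(self):
        ranked = sorted(self.view, key=self.view.get, reverse=True)
        return ranked[: self.k]
```

With `k=2` and `k_max=4`, two deletions from the top of the ranking are absorbed by the slack, and only the third triggers a new scan of the base table.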
The gist of everything new: personalized top-k processing over web 2.0 streams
This work presents POL-filter, a system which continuously keeps the user updated with only the top-k relevant new information, and shows by comprehensive performance evaluations using real-world data, obtained from a weblog crawl, that the approach brings performance gains compared to the state of the art.
Active Database Systems
The event-condition-action (ECA) paradigm to specify reactive behavior and its enabling tools and triggers which are available in commercial DBMS are discussed and the need for formal reasoning about active database behavior is motivated.