DisC diversity: result diversification based on dissimilarity and coverage

@article{Drosou2012DisCDR,
  title={DisC diversity: result diversification based on dissimilarity and coverage},
  author={Marina Drosou and Evaggelia Pitoura},
  journal={Proc. VLDB Endow.},
  year={2012},
  volume={6},
  pages={13--24}
}
Recently, result diversification has attracted a lot of attention as a means to improve the quality of results retrieved by user queries. In this paper, we propose a new, intuitive definition of diversity called DisC diversity. A DisC diverse subset of a query result contains objects such that each object in the result is represented by a similar object in the diverse subset and the objects in the diverse subset are dissimilar to each other. We show that locating a minimum DisC diverse subset… 
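The two conditions in the definition above (every result object is represented by a similar object in the subset, and the subset's objects are mutually dissimilar) can be illustrated with a short sketch. It rests on assumptions the abstract does not state: "similar" is taken to mean "within a hypothetical radius `radius` under some distance function", objects are points, and the distance is Euclidean. The function names are illustrative only. The greedy pass below yields a valid subset under those assumptions, not necessarily a minimum one, and it is not presented as the authors' algorithm.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def greedy_disc_subset(objects, radius, distance=dist):
    """Greedily pick a subset that covers every object and is pairwise dissimilar.

    Assumption: two objects are "similar" when distance(a, b) <= radius.
    The result is valid under that assumption but not guaranteed to be minimum.
    """
    selected = []
    covered = [False] * len(objects)
    for i, candidate in enumerate(objects):
        if covered[i]:
            continue  # already represented by a selected object
        # An uncovered candidate is farther than `radius` from every
        # previously selected object, so adding it keeps the subset dissimilar.
        selected.append(candidate)
        for j, other in enumerate(objects):
            if distance(candidate, other) <= radius:
                covered[j] = True
    return selected


def is_disc_diverse(objects, subset, radius, distance=dist):
    """Check both conditions: coverage of all objects and pairwise dissimilarity."""
    covers_all = all(
        any(distance(obj, rep) <= radius for rep in subset) for obj in objects
    )
    dissimilar = all(
        distance(a, b) > radius
        for i, a in enumerate(subset)
        for b in subset[i + 1:]
    )
    return covers_all and dissimilar


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0), (10.0, 0.0)]
    subset = greedy_disc_subset(points, radius=2.0)
    print(subset, is_disc_diverse(points, subset, radius=2.0))
```

The quadratic scan keeps the sketch short; a spatial or metric index would be the natural replacement for the inner loop on large results.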
Multiple Radii DisC Diversity: Result Diversification Based on Dissimilarity and Coverage
TLDR
This article introduces a novel definition of diversity called DisC diversity, and extends its definition to the multiple radii case, where each item is associated with a different radius based on its importance, relevance, or other factors.
Towards both Local and Global Query Result Diversification
TLDR
This paper formally defines the metrics of global diversity and global-and-local diversity, and proposes two heuristic algorithms, greedy search and vertex substitution, and sophisticated optimization techniques to solve the problems efficiently.
RC-Index: Diversifying Answers to Range Queries
TLDR
The RC-Index is proposed, a novel index structure that achieves efficiency by reducing the number of items that must be retrieved by the database to form a diverse set of the desired size (about 1 second for a dataset of 1 million items).
Parameter-free and domain-independent similarity search with diversity
TLDR
The "Better Results with Influence Diversification" (BRID) technique is the basis for the k-Diverse Nearest Neighbor and the Range Diverse algorithms, which execute k-nearest neighbor and range queries with diversification, showing that the technique can be applied to diversify any type of similarity query.
Diversity in Similarity Joins
TLDR
The concept of diverse similarity joins is introduced: a similarity join operator that ensures smaller, more diversified, and more useful answers, and that allows exploiting diversity in similarity joins without diminishing their performance while providing elements that cover the same data space distribution as the non-diverse answers.
Similarity Search Combining Query Relaxation and Diversification
TLDR
The similarity search problem, which aims to find similar query results for a given data set and query string, is studied, and a novel goal function combining similarity and diversity is defined.
Coverage-Oriented Diversification of Keyword Search Results on Graphs
TLDR
This paper formalizes the problem of coverage-oriented diversified keyword search on graphs and presents a search algorithm that is guaranteed to return the optimal diverse result set and can eliminate unnecessary and redundant diversity computation.
A survey of query result diversification
TLDR
This survey aims to provide a thorough review of a wide range of result diversification techniques, including various definitions of diversification, the corresponding algorithms, diversification techniques tailored to specific applications (databases, search engines, recommendation systems, graphs, time series, and data streams), and result diversification systems.
Preferential Diversity
TLDR
This paper proposes a novel framework called Preferential Diversity (PrefDiv) that aims to support both relevancy and diversity of user query results and describes an implementation of PrefDiv on top of the HYPRE preference model, which allows users to specify both qualitative and quantitative preferences and unifies them using the concept of preference intensities.

References

Dynamic diversification of continuous data
TLDR
This paper addresses the dynamic case in which the result set changes over time, as happens, for example, in notification services, and defines the Continuous k-Diversity Problem along with appropriate constraints that enforce continuity requirements on the diversified results.
Incremental diversification for very large sets: a streaming-based approach
TLDR
This work presents a novel diversification approach which treats the input as a stream and processes each element in an incremental fashion, maintaining a near-optimal diverse set at any point in the stream, without significant loss of diversification quality.
Top-k bounded diversification
TLDR
This paper introduces Space Partitioning and Probing (SPP), an algorithm that minimizes the number of accessed objects while finding exactly the same result as MMR, the most popular diversification algorithm.
On query result diversification
TLDR
A general framework for evaluation and optimization of methods for diversifying query results is described, and the first thorough experimental evaluation of the various diversification techniques implemented in a common framework is presented.
Efficient diversity-aware search
TLDR
This work proposes DIVGEN, an efficient algorithm for diversity-aware search that achieves significant performance improvements via novel data access primitives, and devises the first low-overhead data access prioritization scheme with theoretical quality guarantees and good performance in practice.
Providing Diversity in K-Nearest Neighbor Query Results
TLDR
This paper proposes a user-tunable definition of diversity, and presents an algorithm, called MOTLEY, for producing a diverse result set as per this definition, and shows that MOTLEY can produce diverse result sets by reading only a small fraction of the tuples in the database.
Diversifying search results
TLDR
This work proposes an algorithm that well approximates this objective in general, and is provably optimal for a natural special case, and generalizes several classical IR metrics, including NDCG, MRR, and MAP, to explicitly account for the value of diversification.
Efficient Computation of Diverse Query Results
TLDR
A key contribution of this paper is to formally define the notion of diversity, and to show that existing score based techniques commonly used in web applications are not sufficient to guarantee diversity.
Max-Sum diversification, monotone submodular functions and dynamic updates
TLDR
This paper considers the setting where the authors are given a set of elements in a metric space and a set valuation function f defined on every subset and shows that a natural single swap local search algorithm provides a 2-approximation in this more general setting.
Similarity Search - The Metric Space Approach
TLDR
Similarity Search focuses on the state of the art in developing index structures for searching the metric space, and provides an extensive survey of specific techniques for a large range of applications.