A similarity measure for indefinite rankings

William Webber, Alistair Moffat, and Justin Zobel. ACM Trans. Inf. Syst.
Ranked lists are encountered in research and daily life and it is often of interest to compare these lists even when they are incomplete or have only some members in common. An example is document rankings returned for the same query by different search engines. A measure of the similarity between incomplete rankings should handle nonconjointness, weight high ranks more heavily than low, and be monotonic with increasing depth of evaluation; but no measure satisfying all these criteria currently… 
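The measure this paper introduces is the rank-biased overlap (RBO) that several of the citing works listed on this page refer to by name. As a rough illustration of the three criteria in the abstract, the sketch below computes a depth-truncated RBO sum, (1 - p) * sum over d of p^(d-1) * A_d, where A_d is the fraction of items shared by the two prefixes of length d. The function name `rbo_truncated`, the default `p=0.9`, and the assumption that neither list contains duplicates are illustrative choices, not taken from the text above. The value is a partial sum, so it is only a lower bound on the infinite-depth score and not the extrapolated estimate defined in the paper.

```python
def rbo_truncated(s, t, p=0.9, depth=None):
    """Partial RBO sum over the first `depth` ranks of lists s and t.

    Hypothetical helper for illustration; assumes no duplicate items
    within either list and 0 < p < 1.
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    max_depth = min(len(s), len(t))
    if depth is None:
        depth = max_depth
    if depth > max_depth:
        raise ValueError("depth exceeds the shorter list")

    seen_s, seen_t = set(), set()
    overlap = 0      # size of the intersection of the two prefixes
    total = 0.0
    weight = 1.0     # p ** (d - 1), so high ranks weigh more than low ranks
    for d in range(1, depth + 1):
        x, y = s[d - 1], t[d - 1]
        if x == y:
            overlap += 1
        else:
            # Each new item may match something already seen in the other list.
            overlap += (x in seen_t) + (y in seen_s)
        seen_s.add(x)
        seen_t.add(y)
        total += weight * overlap / d   # agreement at depth d
        weight *= p
    return (1 - p) * total


if __name__ == "__main__":
    a = ["a", "b", "c"]
    b = ["b", "a", "d"]   # shares only some members with a
    print(rbo_truncated(a, b, p=0.5))
```

Because every term of the sum is non-negative, evaluating to a greater depth can only raise this partial score, and items that appear in just one list simply contribute nothing to the overlap.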
Offline Evaluation by Maximum Similarity to an Ideal Ranking
This work proposes replacing NDCG with a radical simplification, and proposes rank-biased overlap (RBO) to compute the rank similarity, since RBO was specifically created to address the requirements of rank similarity between search results.
A Family of Rank Similarity Measures Based on Maximized Effectiveness Difference
Luchen Tan, C. Clarke. IEEE Transactions on Knowledge and Data Engineering, 2015.
A family of rank similarity measures, each derived from an associated effectiveness measure, based on the maximization of effectiveness difference under this associated measure, is proposed and validated.
Offline Evaluation without Gain
It is demonstrated that compatibility can replace and extend current offline evaluation measures that depend on fixed relevance grades that must be mapped to gain values, such as NDCG.
Consensus measure of rankings
This paper introduces a novel approach for consensus measure of rankings by using graph representation, in which the vertices or nodes are the items and the edges are the relationship of items in the rankings.
Dissimilarity Based Query Selection for Efficient Preference Based IR Evaluation
A way to measure the dissimilarity between two sides in side-by-side evaluation experiments is proposed and it is shown how this measure can be used to prioritize queries to be judged in an offline setting.
Evaluation Measures Based on Preference Graphs
This work proposes an evaluation measure that computes the similarity between a directed multigraph of preferences and an actual ranking generated by a ranker, and employs Rank Biased Overlap, which was explicitly created to match the requirements of search and related applications.
A Weighted Correlation Index for Rankings with Ties
This work proposes to extend Kendall's definition of correlation in a natural way to take into account weights in the presence of ties and proves the usefulness of the weighted measure of correlation using experimental data on social networks and web graphs.
Google, bing and a new perspective on ranking similarity
This paper proposes a framework to assess the information presented in the first page by measuring the information entropy and the correlations between two ranks; it also extends the recently proposed Rank-Biased Overlap measure and proposes a measure for comparing the information entropy present in two ranks.
Term-frequency surrogates in text similarity computations
This paper shows that the term frequency component of each posting can be completely replaced by a surrogate that allows skipping of positional information interleaved in inverted lists, and obtains significant speedups in ranked query execution without increasing the index size and without harming retrieval effectiveness.
Estimating Measurement Uncertainty for Information Retrieval Effectiveness Metrics
RBP-derived residuals are used to re-examine the reliability of a typical way of building test collections for offline measurement of information retrieval systems. It is recommended that all such experimental results report the residual measurements generated by a suitably matched weighted-precision metric, to give a clear indication of the measurement uncertainty that arises from unjudged documents in test collections with finite pooled judgments.


Rank-biased precision for measurement of retrieval effectiveness
A new effectiveness metric, rank-biased precision, is introduced that is derived from a simple model of user behavior, is robust if answer rankings are extended to greater depths, and allows accurate quantification of experimental uncertainty, even when only partial relevance judgments are available.
Cumulated gain-based evaluation of IR techniques
This article proposes several novel measures that compute the cumulative gain the user obtains by examining the retrieval result up to a given ranked position, and test results indicate that the proposed measures credit IR methods for their ability to retrieve highly relevant documents and allow testing of statistical significance of effectiveness differences.
A new rank correlation coefficient for information retrieval
A new rank correlation coefficient, AP correlation (τ_ap), is proposed that is based on average precision and has a probabilistic interpretation; it is shown to give more weight to errors at high rankings and has nice mathematical properties that make it easy to interpret.
A measure of top-down correlation
Many situations exist in which n objects are ranked by two or more independent sources, where interest centers primarily on agreement in the top rankings and disagreements on items at the bottom of
Weighted Rank Correlation in Information Retrieval Evaluation
A family of rank correlation coefficients for IR is introduced that weights the rank correlation according to the rank of the items, using the notion of gain previously utilized in retrieval effectiveness measurement.
Topic prediction based on comparative retrieval rankings
It is shown that AnchorMap scores, when run on a set of initial ranked document lists from 8 different systems, are very highly correlated with categorization of topics as easy or hard and, separately, are highly correlated with those topics on which blind feedback works.
On rank correlation and the distance between rankings
This work introduces an alternative measure of distance between rankings that corrects this by explicitly accounting for correlations between systems over a sample of topics, and moreover has a probabilistic interpretation for use in a test of statistical significance.
Comparing rankings of search results on the Web
On rank correlation in information retrieval evaluation
The paper then focuses on rank correlation between webpage lists ordered by PageRank for applying the general reflections on these test statistics and an interpretation of PageRank behaviour is provided.
Comparing top k lists
Besides the applications to the task of identifying good notions of (dis-)similarity between two top k lists, the results imply polynomial-time constant-factor approximation algorithms for the rank aggregation problem with respect to a large class of distance measures.