Corpus ID: 218516683

Interpretable Learning-to-Rank with Generalized Additive Models

Authors: Honglei Zhuang, Xuanhui Wang, Michael Bendersky, Alexander Grushetsky, Yonghui Wu, Petr Mitrichev, Ethan Sterling, Nathan Bell, Walker Ravina, Hai Qian
Interpretability of learning-to-rank models is a crucial yet relatively under-examined research area. Recent progress on interpretable ranking models largely focuses on generating post-hoc explanations for existing black-box ranking models, whereas the alternative option of building an intrinsically interpretable ranking model with a transparent and self-explainable structure remains unexplored. Developing fully understandable ranking models is necessary in some scenarios (e.g., due to legal or… 


Learning Representations for Axis-Aligned Decision Forests through Input Perturbation

A novel but intuitive proposal to achieve representation learning for decision forests without imposing new restrictions or necessitating structural changes; the approach is applicable to any arbitrary decision forest and allows the use of arbitrary deep neural networks for representation learning.

Learning to rank from relevance judgments distributions

Overall, it is observed that relying on relevance judgment distributions to train different LETOR models can boost their performance and even let them outperform strong baselines such as LambdaMART on several test collections.

An Alternative Cross Entropy Loss for Learning-to-Rank

This work proposes a cross entropy-based learning-to-rank loss function that is theoretically sound, is a convex bound on NDCG—a popular ranking metric—and is consistent with NDCG under learning scenarios common in information retrieval.
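For context, a listwise softmax cross-entropy loss of the broad family this work builds on can be sketched as follows. This is a generic NumPy illustration of the standard listwise loss, not necessarily the exact variant the paper proposes; the exponential gain transform is one common choice.

```python
import numpy as np

def softmax_cross_entropy_listwise(scores, relevances):
    """Generic listwise softmax cross-entropy loss for one query.

    `scores` are model outputs per document; `relevances` are graded
    relevance labels (at least one must be nonzero). This is a standard
    listwise loss, not necessarily the paper's specific variant.
    """
    scores = np.asarray(scores, dtype=float)
    relevances = np.asarray(relevances, dtype=float)
    # Target distribution: normalized, gain-transformed relevance labels.
    gains = 2.0 ** relevances - 1.0
    target = gains / gains.sum()
    # Predicted distribution: numerically stable log-softmax of the scores.
    m = scores.max()
    log_probs = scores - (m + np.log(np.exp(scores - m).sum()))
    return float(-(target * log_probs).sum())
```

With a perfectly ranked score vector the loss approaches the entropy of the target distribution; misranked scores increase it.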

Lightweight Composite Re-Ranking for Efficient Keyword Search with BERT

BECR (BERT-based Composite Re-Ranking), a lightweight composite re-ranking scheme that combines deep contextual token interactions and traditional lexical term-matching features, is presented, and an evaluation of the relevance and efficiency of BECR on several TREC datasets is described.

Rankings for Two-Sided Market Platforms

Rankings have become the standard interface for presenting results to customers in online systems. Traditional online systems connect customers with items (e.g. books, music, news), where only the…

(Discussion Paper)

It is shown how data descriptions—a set of compact, readable and insightful formulas of boolean predicates—can be used to guide domain experts in understanding and evaluating the results of entity matching processes.

Exposing Query Identification for Search Transparency

This work explores the feasibility of approximate exposing query identification (EQI) as a retrieval task by reversing the role of queries and documents in two classes of search systems: dense dual-encoder models and traditional BM25.

The accuracy versus interpretability trade-off in fraud detection model

A state-of-the-art review sheds light on a technology race between black-box machine learning models improved by post-hoc interpretation and intrinsically interpretable models boosted to gain accuracy.

Posthoc Interpretability of Learning to Rank Models using Secondary Training Data

This paper operates on a notion of interpretability based on the explainability of rankings over an interpretable feature space, and studies how well a subset of potentially interpretable features explains the full model under different training sizes and algorithms.

Axiomatic Interpretability for Multiclass Additive Models

A state-of-the-art GAM learning algorithm based on boosted trees is generalized to the multiclass setting, showing that this multiclass algorithm outperforms existing GAM learning algorithms and sometimes matches the performance of full-complexity models such as gradient boosted trees.

A Unified Approach to Interpreting Model Predictions

A unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations), which unifies six existing methods and presents new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.
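The additive-attribution idea underlying SHAP can be illustrated with a brute-force exact Shapley computation. This toy sketch enumerates all feature coalitions (feasible only for a handful of features) and is a stand-in illustration, not the SHAP library or its efficient approximations.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction by enumerating coalitions.

    `predict` maps a feature vector (list) to a number; features absent
    from a coalition are filled in from `baseline`. Brute-force: O(2^n).
    """
    n = len(x)

    def value(subset):
        # Evaluate the model with only the features in `subset` "present".
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for k in range(n):
            for s in combinations(others, k):
                # Classic Shapley weight for a coalition of size k.
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi += w * (value(set(s) | {i}) - value(set(s)))
        phis.append(phi)
    return phis
```

A key property of the additive formulation is efficiency: the attributions sum to the difference between the prediction at `x` and at the baseline.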

LIRME: Locally Interpretable Ranking Model Explanation

This work explores three sampling methods to train an explanation model and proposes two metrics to evaluate explanations generated for an IR model. The results reveal that diversity in samples is important for training local explanation models, and that the stability of a model is inversely proportional to the number of parameters used to explain it.

TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank

This work introduces TensorFlow Ranking, the first open source library for solving large-scale ranking problems in a deep learning framework, which is highly configurable and provides easy-to-use APIs to support different scoring mechanisms, loss functions and evaluation metrics in the learning-to-rank setting.

EXS: Explainable Search Using Local Model Agnostic Interpretability

EXS is a search system designed specifically to provide its users with insight into the following questions: "What is the intent of the query according to the ranker?", "Why is this document ranked higher than another?", and "Why was this document relevant to the query?"

Revisiting Approximate Metric Optimization in the Age of Deep Neural Networks

This study revisits the approximation framework originally proposed by Qin et al. in light of recent advances in neural networks and hopes to show that the ideas from that work are more relevant than ever and can lay the foundation of learning-to-rank research in the age of deep neural networks.

Manipulating and Measuring Model Interpretability

A sequence of pre-registered experiments showed participants functionally identical models that varied only in two factors commonly thought to make machine learning models more or less interpretable: the number of features and the transparency of the model (i.e., whether the model internals are clear or black box).

“Why Should I Trust You?”: Explaining the Predictions of Any Classifier

LIME is proposed, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction.
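The core recipe of learning an interpretable model locally around a prediction can be sketched as follows. This is a minimal NumPy illustration of the idea with hypothetical sampling and kernel choices, not the LIME library itself.

```python
import numpy as np

def local_linear_explanation(black_box, x, n_samples=500, scale=0.1, seed=0):
    """LIME-style sketch: perturb the instance, weight samples by
    proximity, and fit a weighted linear surrogate whose coefficients
    serve as the local explanation. Parameter choices are illustrative.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # Sample perturbations around the instance being explained.
    Z = x + rng.normal(scale=scale, size=(n_samples, x.size))
    y = np.array([black_box(z) for z in Z])
    # Proximity kernel: closer samples get more weight.
    w = np.exp(-np.sum((Z - x) ** 2, axis=1) / (2 * scale ** 2))
    # Weighted least squares with an intercept column.
    A = np.hstack([np.ones((n_samples, 1)), Z])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[1:]  # per-feature local slopes
```

For a black box that is itself linear, the surrogate recovers its slopes exactly; for a nonlinear model it yields a local linear approximation around `x`.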

Intelligible models for classification and regression

This paper studies the performance of generalized additive models (GAMs), which combine single-feature models called shape functions through a linear function, and presents the first large-scale empirical comparison of existing methods for learning GAMs.
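The additive structure described here, a sum of per-feature shape functions, can be sketched in a few lines. The shape functions below are hypothetical stand-ins for the tree- or spline-based functions that real GAM learners fit from data.

```python
import numpy as np

def gam_score(x, shape_functions, intercept=0.0):
    """Generalized additive model: apply one shape function per feature
    and sum the results. Each feature's contribution can be inspected
    (or plotted) independently, which is the source of interpretability.
    """
    return intercept + sum(f(xi) for f, xi in zip(shape_functions, x))

# Hypothetical shape functions for a two-feature model.
shapes = [
    lambda price: -0.5 * np.log1p(price),  # diminishing penalty for price
    lambda rating: 0.8 * rating,           # linear boost for rating
]
```

Because the model is a sum of univariate terms, plotting each shape function over its feature's range fully describes the model's behavior, unlike a full-complexity ensemble.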