SoDeep: A Sorting Deep Net to Learn Ranking Loss Surrogates

@inproceedings{Engilberge2019SoDeep,
  title={SoDeep: A Sorting Deep Net to Learn Ranking Loss Surrogates},
  author={Martin Engilberge and Louis Chevallier and Patrick P{\'e}rez and Matthieu Cord},
  booktitle={2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2019}
}
Several tasks in machine learning are evaluated using non-differentiable metrics such as mean average precision or Spearman correlation. However, their non-differentiability prevents their use as objective functions in a learning framework. Surrogate and relaxation methods exist but tend to be specific to a given metric. In the present work, we introduce a new method to learn approximations of such non-differentiable objective functions. Our approach is based on a deep architecture that…
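To make concrete why metrics like Spearman correlation resist gradient-based training: they depend on scores only through their ranks, so they are piecewise constant in the scores and carry zero gradient almost everywhere. A minimal sketch in plain Python (the helper names are illustrative, not from the paper, and assume no ties):

```python
def ranks(values):
    """Rank of each element (0 = smallest), assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order):
        r[idx] = rank
    return r

def spearman(pred, target):
    """Spearman correlation = Pearson correlation of the rank vectors.

    Without ties both rank vectors are permutations of 0..n-1, so they
    share the same variance and the formula simplifies to cov / var.
    """
    rp, rt = ranks(pred), ranks(target)
    n = len(pred)
    mean = (n - 1) / 2.0
    cov = sum((a - mean) * (b - mean) for a, b in zip(rp, rt))
    var = sum((a - mean) ** 2 for a in rp)
    return cov / var

scores = [0.1, 0.4, 0.35, 0.8]
labels = [0.0, 1.0, 2.0, 3.0]
nudged = [s + 0.01 for s in scores]  # small perturbation, same ranks
print(spearman(scores, labels))      # depends only on the ordering
print(spearman(nudged, labels))      # identical value: zero gradient
```

The perturbed scores yield exactly the same correlation, which is the flat region a surrogate loss has to smooth over.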

Recall@k Surrogate Loss with Large Batches and Similarity Mixup

This work learns deep visual representation models for retrieval by exploring the interplay between a new loss function (a differentiable surrogate for recall), large batch sizes, and a new regularization approach (similarity mixup).

RankSim: Ranking Similarity Regularization for Deep Imbalanced Regression

RankSim is complementary to conventional imbalanced learning techniques, including re-weighting, two-stage training, and distribution smoothing, and lifts state-of-the-art performance on three imbalanced regression benchmarks: IMDB-WIKI-DIR, AgeDB-DIR, and STS-B-DIR.

NeuralNDCG: Direct Optimisation of a Ranking Metric via Differentiable Relaxation of Sorting

This paper proposes NeuralNDCG, a novel differentiable ranking loss function based on a relaxation of sorting that approximates NDCG to arbitrary accuracy, thus closing the gap between the training and the evaluation of LTR models.
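For reference, the target metric itself is simple to compute but, like Spearman correlation, depends on scores only through the induced ordering. A plain-Python sketch of NDCG (function names illustrative, not from the paper):

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance grades."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))

def ndcg(relevances):
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

print(ndcg([3, 2, 1, 0]))  # ideal ordering
print(ndcg([0, 1, 2, 3]))  # reversed ordering scores strictly lower
```

NeuralNDCG replaces the hard sort implicit in this computation with a differentiable relaxation, so gradients can flow through the ordering.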

Learning to Rank for Active Learning: A Listwise Approach

This work casts the acquisition function as a learning-to-rank problem, rethinks the structure of the loss-prediction module using a simple but effective listwise approach, and demonstrates that this method outperforms recent state-of-the-art active learning approaches on both image classification and regression tasks.

Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval

Smooth-AP is a plug-and-play objective function that allows for end-to-end training of deep networks with a simple and elegant implementation and improves the performance over the state-of-the-art, especially for larger-scale datasets, thus demonstrating the effectiveness and scalability of Smooth-AP to real-world scenarios.
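The central idea in Smooth-AP is that the rank of an item is a sum of Heaviside step functions of score differences; replacing each step with a temperature-controlled sigmoid makes the rank, and hence AP, differentiable. A scalar sketch of that substitution (names and the toy scores are illustrative, not the paper's implementation):

```python
import math

def sigmoid(x, tau):
    """Temperature-controlled sigmoid; approaches a step as tau -> 0."""
    return 1.0 / (1.0 + math.exp(-x / tau))

def hard_rank(scores, i):
    """1-based rank of item i: 1 + number of items scoring above it."""
    return 1 + sum(1 for j, s in enumerate(scores) if j != i and s > scores[i])

def smooth_rank(scores, i, tau=0.01):
    """Differentiable surrogate: each step function becomes a sigmoid."""
    return 1 + sum(sigmoid(s - scores[i], tau)
                   for j, s in enumerate(scores) if j != i)

scores = [0.9, 0.2, 0.7, 0.1]
print(hard_rank(scores, 2))                 # exact rank of item 2
print(round(smooth_rank(scores, 2), 3))     # close to it at low temperature
```

At low temperature the surrogate tracks the true rank closely while remaining smooth enough to backpropagate through.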

MetricOpt: Learning to Optimize Black-Box Evaluation Metrics

This work achieves state-of-the-art performance on a variety of metrics for (image) classification, image retrieval and object detection by learning a differentiable value function, which maps compact task-specific model parameters to metric observations.

Optimizing Rank-Based Metrics With Blackbox Differentiation

This work presents an efficient, theoretically sound, and general method for differentiating rank-based metrics with mini-batch gradient descent, and addresses optimization instability and sparsity of the supervision signal that both arise from using rank-based metrics as optimization targets.

Learning Surrogates via Deep Embedding

This work demonstrates the effectiveness of the proposed technique for training a neural network by minimizing a surrogate loss that approximates the target evaluation metric, which may be non-differentiable.

Robust and Decomposable Average Precision for Image Retrieval

This paper proposes a new differentiable approximation of the rank function that provides an upper bound on the AP loss and ensures robust training, and designs a simple yet effective loss function to reduce the decomposability gap between the AP over the whole training set and its averaged batch approximation.

A Unified Framework of Surrogate Loss by Refactoring and Interpolation

UniLoss, a unified framework for generating surrogate losses for training deep networks with gradient descent, reduces the amount of manual design of task-specific surrogate losses while achieving performance comparable to task-specific losses.

Structured learning for non-smooth ranking losses

This paper proposes new, almost-linear-time algorithms to optimize for two other criteria widely used to evaluate search systems: MRR (mean reciprocal rank) and NDCG (normalized discounted cumulative gain) in the max-margin structured learning framework.

Training Deep Neural Networks via Direct Loss Minimization

This paper proposes a direct loss minimization approach to train deep neural networks, which provably minimizes the application-specific loss function, and develops a novel dynamic programming algorithm that can efficiently compute the weight updates.

Learning to rank: from pairwise approach to listwise approach

This paper proposes that learning to rank adopt the listwise approach, in which lists of objects are used as 'instances' in learning, and introduces two probability models, referred to as permutation probability and top-k probability, to define a listwise loss function for learning.
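Under the top-1 probability model described here, the probability that an item ranks first reduces to a softmax over scores, and the listwise loss is a cross-entropy between the target and predicted top-1 distributions (the ListNet loss). A plain-Python sketch (function names illustrative):

```python
import math

def top_one_probs(scores):
    """Top-1 probability of each item: a softmax over the score list."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def listnet_loss(pred_scores, true_scores):
    """Cross-entropy between target and predicted top-1 distributions."""
    p = top_one_probs(true_scores)
    q = top_one_probs(pred_scores)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

perfect = listnet_loss([3.0, 2.0, 1.0], [3.0, 2.0, 1.0])
worse = listnet_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
print(perfect < worse)  # matching orderings give lower loss
```

Because the loss is a smooth function of the predicted scores, it can be minimized directly by gradient descent, unlike the ranking metrics it stands in for.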

Learning to Rank Based on Subsequences

By exploiting sub-sequences, the proposed MidRank improves ranking accuracy considerably on an extensive array of image ranking applications and datasets.

VSE++: Improved Visual-Semantic Embeddings

This paper introduces a very simple change to the loss function used in the original formulation by Kiros et al. (2014), which leads to drastic improvements in retrieval performance, and shows that similar improvements also apply to the Order-embeddings of Vendrov et al.

VSE++: Improving Visual-Semantic Embeddings with Hard Negatives

This paper introduces a simple change to common loss functions used for multi-modal embeddings, inspired by hard-negative mining, the use of hard negatives in structured prediction, and ranking loss functions, which yields significant gains in retrieval performance.
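The change VSE++ proposes to the hinge-based triplet ranking loss is to replace the sum over all negatives with a max, so only the hardest negative contributes. A sketch with toy similarity scores (a scalar simplification of the batched loss, with a margin value chosen for illustration):

```python
def sum_hinge_loss(pos_sim, neg_sims, margin=0.2):
    """Classic triplet ranking loss: every violating negative contributes."""
    return sum(max(0.0, margin - pos_sim + n) for n in neg_sims)

def max_hinge_loss(pos_sim, neg_sims, margin=0.2):
    """VSE++-style loss: only the hardest negative contributes."""
    return max(max(0.0, margin - pos_sim + n) for n in neg_sims)

pos, negs = 0.6, [0.5, 0.55, 0.1]
print(sum_hinge_loss(pos, negs))  # accumulates all margin violations
print(max_hinge_loss(pos, negs))  # driven by the hardest negative alone
```

Focusing the gradient on the hardest negative in the batch is what drives the reported retrieval gains.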

Adam: A Method for Stochastic Optimization

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
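Adam's update maintains exponential moving averages of the gradient and its elementwise square, with bias correction for their zero initialization. A minimal scalar sketch using the paper's default hyperparameters (the function name and toy objective are illustrative):

```python
def adam_minimize(grad, x, steps=200, lr=0.1,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimise a 1-D function given its gradient, using Adam updates."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g          # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
        m_hat = m / (1 - beta1 ** t)             # bias-corrected moments
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (v_hat ** 0.5 + eps)
    return x

# Minimise f(x) = (x - 3)^2 via its gradient 2(x - 3); x approaches 3.
print(adam_minimize(lambda x: 2 * (x - 3), x=0.0))
```

Dividing the first moment by the root of the second makes the effective step size roughly invariant to the gradient's scale, which is why the same learning rate works across very different objectives.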

Learning Two-Branch Neural Networks for Image-Text Matching Tasks

This paper investigates two-branch neural networks for learning similarity on two tasks, image-sentence matching and region-phrase matching, and proposes two network structures that produce different output representations.

Deep Residual Learning for Image Recognition

This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.

DeViSE: A Deep Visual-Semantic Embedding Model

This paper presents a new deep visual-semantic embedding model trained to identify visual objects using both labeled image data as well as semantic information gleaned from unannotated text and shows that the semantic information can be exploited to make predictions about tens of thousands of image labels not observed during training.