AM-LFS: AutoML for Loss Function Search

  • Chuming Li, Chen Lin, Minghao Guo, Wei Wu, Wanli Ouyang, Junjie Yan
  • Published 17 May 2019
  • Computer Science
  • 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
Designing an effective loss function plays an important role in visual analysis. [...] We also propose an efficient optimization framework that dynamically optimizes the parameters of the loss function's distribution during training. Extensive experimental results on four benchmark datasets show that, without any tricks, our method outperforms existing hand-crafted loss functions in various computer vision tasks.
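The idea of optimizing the parameters of a distribution over loss functions can be illustrated with a small sketch. It assumes a single scalar loss hyperparameter drawn from a Gaussian whose mean is updated with a score-function (REINFORCE-style) gradient. The estimator, the toy reward, and every name here (`search_loss_parameter`, `toy_reward`) are assumptions made for illustration, not details taken from the abstract above.

```python
import random

def search_loss_parameter(reward_fn, mu=0.0, sigma=0.5, lr=0.05,
                          samples_per_step=16, steps=200, seed=0):
    """Move the mean of a Gaussian over a loss hyperparameter toward high reward.

    reward_fn is a hypothetical stand-in for "train briefly with the sampled
    loss, then measure the validation metric".
    """
    rng = random.Random(seed)
    for _ in range(steps):
        # Sample candidate hyperparameters from the current distribution.
        thetas = [rng.gauss(mu, sigma) for _ in range(samples_per_step)]
        rewards = [reward_fn(t) for t in thetas]
        # The batch-mean reward serves as a variance-reducing baseline.
        baseline = sum(rewards) / len(rewards)
        # Score-function gradient of the expected reward with respect to mu.
        grad = sum((r - baseline) * (t - mu) / sigma ** 2
                   for r, t in zip(rewards, thetas)) / len(thetas)
        mu += lr * grad
    return mu

def toy_reward(theta):
    # Assumed toy objective whose best hyperparameter is 3.0.
    return -(theta - 3.0) ** 2

best_mu = search_loss_parameter(toy_reward)
```

In this toy setting the mean drifts from 0 toward the reward's optimum. In a real search the reward would come from a validation metric and would be far noisier.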


Loss Function Search for Face Recognition

This paper defines a novel search space and develops a reward-guided search method to automatically obtain the best candidate for face recognition losses, and demonstrates the effectiveness of the method over the state-of-the-art alternatives.

AutoLoss-Zero: Searching Loss Functions from Scratch for Generic Tasks

This paper proposes AutoLoss-Zero, a general framework for searching loss functions from scratch for generic tasks, and designs an elementary search space composed only of primitive mathematical operators to accommodate heterogeneous tasks and evaluation metrics.


This work makes the first attempt to discover new loss functions for the challenging task of object detection at the level of primitive operations, and finds that the searched losses are insightful.

Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation

This paper proposes to automate the design of metric-specific loss functions by searching for a differentiable surrogate loss for each metric: the non-differentiable operations in the metric are substituted with parameterized functions, and a parameter search optimizes the shape of the loss surface.

PolyLoss: A Polynomial Expansion Perspective of Classification Loss Functions

The key insight is to decompose commonly used classification loss functions, such as cross-entropy loss and focal loss, into a series of weighted polynomial bases, which enables the importance of different bases to be easily adjusted depending on the target tasks and datasets.
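As a rough illustration of the polynomial view, cross-entropy can be written as the series -log(p_t) = Σ_{j≥1} (1 - p_t)^j / j, and the simplest variant (often called Poly-1) perturbs only the first coefficient. The sketch below assumes `p_t` is the predicted probability of the true class with 0 < p_t ≤ 1; the function names are illustrative.

```python
import math

def ce_polynomial(p_t, n_terms):
    """Truncated polynomial expansion of cross-entropy around p_t = 1."""
    return sum((1.0 - p_t) ** j / j for j in range(1, n_terms + 1))

def poly1_loss(p_t, epsilon=1.0):
    """Cross-entropy with the weight of the first polynomial term shifted by epsilon."""
    return -math.log(p_t) + epsilon * (1.0 - p_t)
```

Setting `epsilon=0.0` recovers plain cross-entropy, so the extra coefficient can be tuned per task without changing the rest of the training pipeline.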

Stochastic Loss Function

Stochastic Loss Function (SLF) is developed to dynamically and automatically generate appropriate gradients to train deep networks in the same round of back-propagation, while maintaining the completeness and differentiability of the training pipeline.

Towards Robust Face Recognition with Comprehensive Search

This paper finds that the optimal model architecture or loss function is closely coupled with the data cleaning and points out that strong models tend to optimize with more difficult training datasets and loss functions.

Joint Search of Data Augmentation Policies and Network Architectures

The proposed method combines differentiable methods for augmentation policy search and network architecture search to jointly optimize them in an end-to-end manner, and achieves performance competitive with or superior to the independently searched results.

Auto-MVCNN: Neural Architecture Search for Multi-view 3D Shape Recognition

This paper proposes a neural architecture search method named Auto-MVCNN which is particularly designed for optimizing architecture in multi-view 3D shape recognition and develops an end-to-end scheme to enhance retrieval performance through the trade-off parameter search.



Practical Block-Wise Neural Network Architecture Generation

A block-wise network generation pipeline called BlockQNN, which automatically builds high-performance networks using the Q-learning paradigm with an epsilon-greedy exploration strategy and greatly reduces the search space in designing networks, requiring only 3 days with 32 GPUs.
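Epsilon-greedy exploration, as used in Q-learning generally, can be sketched in a few lines. This is a generic illustration and not BlockQNN's actual implementation: `q_values` is an assumed list of estimated values, one per candidate action (for example, one per layer type).

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        # Explore: choose uniformly among all actions.
        return rng.randrange(len(q_values))
    # Exploit: choose the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

A typical schedule starts with `epsilon` near 1.0 and decays it, so early episodes explore broadly and later ones exploit the learned values.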

Regularized Evolution for Image Classifier Architecture Search

This work evolves an image classifier, AmoebaNet-A, that surpasses hand-designs for the first time and gives evidence that evolution can obtain results faster with the same hardware, especially at the earlier stages of the search.

DARTS: Differentiable Architecture Search

The proposed algorithm excels in discovering high-performance convolutional architectures for image classification and recurrent architectures for language modeling, while being orders of magnitude faster than state-of-the-art non-differentiable techniques.

Learning Transferable Architectures for Scalable Image Recognition

This paper proposes to search for an architectural building block on a small dataset and then transfer the block to a larger dataset and introduces a new regularization technique called ScheduledDropPath that significantly improves generalization in the NASNet models.

Harmonious Attention Network for Person Re-identification

  • Wei Li, Xiatian Zhu, S. Gong
  • Computer Science
  • 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
  • 2018
A novel Harmonious Attention CNN (HA-CNN) model is formulated for joint learning of soft pixel attention and hard regional attention along with simultaneous optimisation of feature representations, dedicated to optimise person re-id in uncontrolled (misaligned) images.

Very Deep Convolutional Networks for Large-Scale Image Recognition

This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

IRLAS: Inverse Reinforcement Learning for Architecture Search

An inverse reinforcement learning method for architecture search (IRLAS), which trains an agent to search for network structures that are topologically inspired by human-designed networks, extracting the abstract topological knowledge of an expert human-designed network (ResNeXt).

SphereFace: Deep Hypersphere Embedding for Face Recognition

This paper proposes the angular softmax (A-Softmax) loss, which enables convolutional neural networks (CNNs) to learn angularly discriminative features for the deep face recognition (FR) problem under the open-set protocol.

Focal Loss for Dense Object Detection

This paper proposes to address the extreme foreground-background class imbalance encountered during training of dense detectors by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples, and develops a novel Focal Loss, which focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.
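The reshaping described above can be sketched for a single binary prediction as FL(p_t) = -α_t (1 - p_t)^γ log(p_t). The defaults `gamma=2.0` and `alpha=0.25` are commonly used values, and the sketch assumes the predicted probability `p` lies strictly between 0 and 1.

```python
import math

def cross_entropy(p, y):
    """Binary cross-entropy for one prediction p of the positive class, label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: cross-entropy scaled by alpha_t * (1 - p_t) ** gamma."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The modulating factor shrinks toward 0 as the example becomes well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With `gamma=0.0` and `alpha=1.0` a positive example gets exactly its cross-entropy loss. Larger `gamma` suppresses easy examples more strongly relative to hard ones.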

MnasNet: Platform-Aware Neural Architecture Search for Mobile

An automated mobile neural architecture search (MNAS) approach, which explicitly incorporates model latency into the main objective so that the search can identify a model that achieves a good trade-off between accuracy and latency.
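One way to fold latency into a single objective is a weighted product of accuracy and a latency ratio, reward = ACC × (LAT / T)^w. The sketch below assumes that form with a small negative exponent; the function name and the default `w=-0.07` are illustrative assumptions and are not stated in the summary above.

```python
def latency_aware_reward(accuracy, latency_ms, target_ms, w=-0.07):
    """Scale accuracy by (latency / target) ** w; a negative w penalizes slow models."""
    return accuracy * (latency_ms / target_ms) ** w
```

A model that exactly meets the latency target keeps its accuracy as the reward, while one twice as slow loses roughly 5% of it under the default exponent.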