Joint Training of Generic CNN-CRF Models with Stochastic Optimization

Alexander Kirillov, Dmitrij Schlesinger, Shuai Zheng, Bogdan Savchynskyy, Philip H. S. Torr, Carsten Rother
We propose a new CNN-CRF end-to-end learning framework based on joint stochastic optimization with respect to both Convolutional Neural Network (CNN) and Conditional Random Field (CRF) parameters. While stochastic gradient descent is a standard technique for CNN training, it has not so far been used for joint models. We show that our learning method is (i) general, i.e. it applies to arbitrary CNN and CRF architectures and potential functions; (ii) scalable, i.e. it has a low memory…

Learning Arbitrary Potentials in CRFs with Gradient Descent

A new inference and learning framework is introduced that can learn arbitrary pairwise CRF potentials and can easily be integrated into deep neural networks to allow end-to-end training.

A Projected Gradient Descent Method for CRF Inference Allowing End-to-End Training of Arbitrary Pairwise Potentials

This paper develops a new inference and learning framework that can learn pairwise CRF potentials restricted only by their dependence on the image pixel values and the size of the support, and empirically demonstrates that such learned potentials can improve segmentation accuracy.

Learning Arbitrary Pairwise Potentials in CRFs for Semantic Segmentation

A new inference and learning framework is developed that can learn arbitrary pairwise CRF potentials; the learned potentials can improve segmentation accuracy, and certain label class interactions are shown to be better modelled by a non-Gaussian potential.

SGM-Nets: Semi-Global Matching with Neural Networks

  • A. Seki, M. Pollefeys
  • Computer Science
  • 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2017
A novel SGM parameterization, which deploys different penalties depending on either positive or negative disparity changes in order to represent the object structures more discriminatively, is proposed.

End-to-End Learning of Deep Structured Models for Semantic Segmentation

This thesis summarizes the content of three papers, all of them presenting solutions to semantic segmentation problems, and focuses on creating robust and accurate models that are possible to train end-to-end, as well as developing corresponding optimization methods needed to enable efficient training.

Revisiting Deep Structured Models for Pixel-Level Labeling with Gradient-Based Inference

A new inference and learning framework is presented that can learn arbitrary pairwise conditional random field (CRF) potentials; the learned potentials can improve segmentation accuracy, and certain label-class interactions are shown to be better modeled by a non-Gaussian potential.

Conditional Random Fields Meet Deep Neural Networks for Semantic Segmentation: Combining Probabilistic Graphical Models with Deep Learning for Structured Prediction

The literature on combining the modeling power of CRFs with the representation-learning ability of DNNs is reviewed, ranging from early work that combines these two techniques as independent stages of a common pipeline to recent approaches that embed inference of probabilistic models directly in the neural network itself.

Mean-Field methods for Structured Deep-Learning in Computer Vision

This thesis presents a novel proximal gradient-based approach to optimizing the variational objective of mean-field inference, and derives a novel and efficient structured learning method for multi-modal posterior distributions based on the Multi-Modal Mean-Field approximation, which can be seamlessly combined with modern gradient-based learning methods such as CNNs.

Conditional Random Field and Deep Feature Learning for Hyperspectral Image Segmentation

This paper proposes a method to segment hyperspectral images by considering both spectral and spatial information via a combined framework consisting of CNN and CRF, and introduces a deep deconvolution network that improves the segmentation masks.

Toward Robustness against Label Noise in Training Deep Discriminative Neural Networks

The proposed novel framework for training deep convolutional neural networks from noisy labeled datasets that can be obtained cheaply is applied to the image labeling problem and is shown to be effective in labeling unseen images as well as reducing label noise in training on CIFAR-10 and MS COCO datasets.



Conditional Random Fields as Recurrent Neural Networks

A new form of convolutional neural network that combines the strengths of Convolutional Neural Networks (CNNs) and Conditional Random Fields (CRFs)-based probabilistic graphical modelling is introduced, and top results are obtained on the challenging Pascal VOC 2012 segmentation benchmark.

Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation

This work shows how to improve semantic segmentation through the use of contextual information, specifically 'patch-patch' context between image regions and 'patch-background' context, and formulates Conditional Random Fields with CNN-based pairwise potential functions to capture semantic correlations between neighboring patches.

Semantic Image Segmentation via Deep Parsing Network

This paper addresses semantic image segmentation by incorporating rich information into a Markov Random Field (MRF), including high-order relations and a mixture of label contexts, and proposes a Convolutional Neural Network (CNN), namely the Deep Parsing Network (DPN), which enables deterministic end-to-end computation in a single forward pass.

Fully convolutional networks for semantic segmentation

The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.

Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs

This work brings together methods from DCNNs and probabilistic graphical models for addressing the task of pixel-level classification by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF).

Caffe: Convolutional Architecture for Fast Feature Embedding

Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.

ImageNet classification with deep convolutional neural networks

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes, employing a recently developed regularization method called "dropout" that proved to be very effective.

Very Deep Convolutional Networks for Large-Scale Image Recognition

This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

Decision tree fields

This paper introduces a new formulation for discrete image labeling tasks, the Decision Tree Field (DTF), that combines and generalizes random forests and conditional random fields (CRF) which have

Learning Deep Structured Models

This paper proposes a training algorithm that is able to learn structured models jointly with deep features that form the MRF potentials and demonstrates the effectiveness of this algorithm in the tasks of predicting words from noisy images, as well as tagging of Flickr photographs.