Local Critic Training for Model-Parallel Learning of Deep Neural Networks

Hojung Lee, Cho-Jui Hsieh, and Jong-Seok Lee. IEEE Transactions on Neural Networks and Learning Systems.

In this article, we propose a novel model-parallel learning method, called local critic training, which trains neural networks using additional modules called local critic networks. The main network is divided into several layer groups, and each layer group is updated through error gradients estimated by the corresponding local critic network. We show that the proposed approach successfully decouples the update process of the layer groups for both convolutional neural networks (CNNs) and… 
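A toy numpy sketch of the decoupling idea, under illustrative assumptions (a two-group linear network, a linear critic head, and an MSE objective; none of this is the paper's actual setup): the first layer group updates from the critic's cheap error estimate instead of waiting for the true backward pass through the second group.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 8))               # inputs
y = rng.normal(size=(32, 1))               # regression targets

W1 = rng.normal(scale=0.1, size=(8, 16))   # layer group 1
W2 = rng.normal(scale=0.1, size=(16, 1))   # layer group 2 (main head)
Wc = rng.normal(scale=0.1, size=(16, 1))   # local critic head

mse = lambda a, b: float(np.mean((a - b) ** 2))
loss_before = mse(x @ W1 @ W2, y)

lr = 0.1
for _ in range(300):
    h = x @ W1                             # forward through group 1
    critic_pred = h @ Wc                   # critic's estimate of the output
    # group 1 trains on the critic's error signal, not the true backward pass
    g_h = 2 * (critic_pred - y) @ Wc.T / len(x)
    W1 = W1 - lr * (x.T @ g_h)
    # group 2 trains on the true loss; the critic learns to mimic the main head
    out = h @ W2
    W2 = W2 - lr * (h.T @ (2 * (out - y) / len(x)))
    Wc = Wc - lr * (h.T @ (2 * (critic_pred - out) / len(x)))

loss_after = mse(x @ W1 @ W2, y)
```

Because the gradient for `W1` depends only on the critic, the two layer groups could in principle run their updates on different devices.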

Trinity: Neural Network Adaptive Distributed Parallel Training Method Based on Reinforcement Learning

Trinity, an adaptive distributed parallel training method based on reinforcement learning, is presented to automate the search and tuning of parallel strategies and achieves up to 5% reductions in runtime, communication, and memory overhead, and up to a 40% increase in parallel strategy search speeds.

Penalty and Augmented Lagrangian Methods for Layer-parallel Training of Residual Networks

A layer-parallel training algorithm is proposed to overcome the scalability barrier caused by the serial nature of forward-backward propagation in deep residual learning and can provide speedup over the traditional layer-serial training methods.

Mapping DCNN to a Three Layer Modular Architecture: A Systematic Way for Obtaining Wider and More Effective Network

We propose a modular Deep Convolutional Neural Network (DCNN) architecture which has the property of block-like design and re-use of parameters by certain blocks. We leverage networks from the…

Efficient Neuromorphic Hardware Through Spiking Temporal Online Local Learning

This work introduces an effective hardware-friendly local training algorithm compatible with sparse temporal input coding and binary random classification weights, and explores spike sparsity in communication, parallelism in vector–matrix operations and process-level dataflow, and locality of training errors, which leads to low cost and fast training speed.

MS-NET: modular selective network

The modular nature and low parameter requirement of the network make it very suitable for distributed and low-computation environments and play a vital role in its performance.

BackLink: Supervised Local Training with Backward Links

…in simulation runtime on ResNet-110 compared with standard backpropagation (BP). Therefore, our method could create new opportunities for improving training algorithms towards better efficiency and biological…

Local Critic Training of Deep Neural Networks

A novel approach to train deep neural networks by unlocking the layer-wise dependency of backpropagation training, which is also useful from multi-model perspectives, including structural optimization of neural networks, computationally efficient progressive inference, and ensemble classification for performance improvement.

Training Neural Networks Using Features Replay

This work proposes a novel parallel-objective formulation for the objective function of the neural network, and introduces features replay algorithm and proves that it is guaranteed to converge to critical points for the non-convex problem under certain conditions.

Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks

The experiments show that layer-wise parallelism outperforms current parallelization approaches by increasing training speed, reducing communication costs, and achieving better scalability to multiple GPUs, all while maintaining the same network accuracy.

Neural Architecture Search with Reinforcement Learning

This paper uses a recurrent network to generate the model descriptions of neural networks and trains this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set.

PruneTrain: fast neural network training by dynamic sparse model reconfiguration

This work proposes PruneTrain, a cost-efficient mechanism that gradually reduces the training cost during training by using a structured group-lasso regularization approach that drives the training optimization toward both high accuracy and small weight values.
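The structured group-lasso idea can be sketched with its proximal operator, which shrinks each group's L2 norm and zeroes groups that fall below the threshold; a zeroed group (e.g. an output channel's row) can then be dropped to reconfigure a smaller dense model. This is a minimal illustration of the regularizer, not PruneTrain's actual training loop or reconfiguration schedule.

```python
import numpy as np

def group_lasso_prox(W, lam):
    """Proximal step for group lasso, treating each row of W as a group:
    shrink every row's L2 norm by lam; rows with norm below lam become
    exactly zero and can be pruned from the architecture."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(1.0 - lam / np.maximum(norms, 1e-12), 0.0)
    return W * scale

W = np.array([[0.05, 0.02],    # small row -> pruned
              [1.00, -2.00],   # large row -> kept, slightly shrunk
              [0.01, 0.00]])   # small row -> pruned
W_pruned = group_lasso_prox(W, lam=0.1)
```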

Integrated Model, Batch, and Domain Parallelism in Training Neural Networks

We propose a new integrated method of exploiting model, batch, and domain parallelism for the training of deep neural networks (DNNs) on large distributed-memory computers using minibatch stochastic…

Training Neural Networks Without Gradients: A Scalable ADMM Approach

This paper explores an unconventional training method that uses alternating direction methods and Bregman iteration to train networks without gradient descent steps, and exhibits strong scaling in the distributed setting, yielding linear speedups even when split over thousands of cores.
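The alternating-direction machinery the paper builds on can be illustrated on a toy lasso problem (this is standard ADMM for lasso, not the paper's network formulation; all names here are illustrative):

```python
import numpy as np

def soft(v, k):                            # soft-thresholding (prox of the L1 norm)
    return np.sign(v) * np.maximum(np.abs(v) - k, 0.0)

def admm_lasso(A, b, lam=0.01, rho=1.0, iters=100):
    """ADMM for: min 0.5*||Ax - b||^2 + lam*||z||_1  s.t.  x = z."""
    n = A.shape[1]
    x = z = u = np.zeros(n)
    M = np.linalg.inv(A.T @ A + rho * np.eye(n))
    Atb = A.T @ b
    for _ in range(iters):
        x = M @ (Atb + rho * (z - u))      # x-update: ridge-like linear solve
        z = soft(x + u, lam / rho)         # z-update: cheap proximal step
        u = u + x - z                      # dual update on the constraint x = z
    return z

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 5))
x_true = np.array([1.0, 0.0, -2.0, 0.0, 0.0])
b = A @ x_true
z = admm_lasso(A, b)                       # recovers the sparse x_true
```

The appeal mirrored in the paper: each sub-update is simple and closed-form, and the updates decompose naturally across workers.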

BranchyNet: Fast inference via early exiting from deep neural networks

The BranchyNet architecture is presented, a novel deep network architecture that is augmented with additional side branch classifiers that can both improve accuracy and significantly reduce the inference time of the network.
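The early-exit rule can be sketched as follows, assuming (as in BranchyNet) that softmax entropy measures a branch's confidence; the branch functions and threshold below are purely illustrative stand-ins for trained side classifiers.

```python
import numpy as np

def entropy(p):
    p = np.clip(p, 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

def branchy_infer(x, branches, threshold=0.5):
    """Evaluate branch classifiers from shallow to deep; return the first
    prediction whose softmax entropy is below the threshold (early exit),
    falling back to the final branch otherwise."""
    for depth, branch in enumerate(branches):
        logits = branch(x)
        p = np.exp(logits - logits.max())
        p /= p.sum()
        if entropy(p) < threshold or depth == len(branches) - 1:
            return int(np.argmax(p)), depth

confident = lambda x: np.array([5.0, 0.0, 0.0])   # peaked logits -> low entropy
uncertain = lambda x: np.array([0.1, 0.0, 0.0])   # flat logits -> high entropy
pred, depth = branchy_infer(None, [confident, uncertain])    # exits early
pred2, depth2 = branchy_infer(None, [uncertain, confident])  # goes deeper
```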

Sobolev Training for Neural Networks

Sobolev Training for neural networks is introduced, which is a method for incorporating target derivatives in addition to the target values while training, and results in models with higher accuracy and stronger generalisation on three distinct domains.
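A minimal sketch of the idea on a linear toy model (the target function, model, and learning rate are illustrative): the loss penalises both the value error and the input-derivative error, so the fit is pulled towards the target's slope as well as its values.

```python
import numpy as np

f  = lambda x: 3.0 * x + 2.0   # target values
df = lambda x: 3.0             # target input-derivative (Sobolev supervision)

rng = np.random.default_rng(0)
xs = rng.uniform(-1.0, 1.0, size=64)

# Linear model y = w*x + b; its derivative w.r.t. the input is just w
w, b, lr = 0.0, 0.0, 0.1
for _ in range(200):
    e_val = (w * xs + b) - f(xs)   # value error
    e_der = w - df(xs)             # derivative error (the Sobolev term)
    w -= lr * np.mean(e_val * xs + e_der)
    b -= lr * np.mean(e_val)
```

After training, `w` and `b` approach the target's slope and intercept; with noisy value targets, the derivative term acts as an extra supervision signal that the paper shows improves accuracy and generalisation.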

Going deeper with convolutions

We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC).