Chainer: A Deep Learning Framework for Accelerating the Research Cycle

@article{Tokui2019ChainerAD,
  title={Chainer: A Deep Learning Framework for Accelerating the Research Cycle},
  author={Seiya Tokui and Ryosuke Okuta and Takuya Akiba and Yusuke Niitani and Toru Ogawa and Shunta Saito and Shuji Suzuki and Kota Uenishi and Brian K. Vogel and Hiroyuki Yamazaki Vincent},
  journal={Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery \& Data Mining},
  year={2019}
}
Software frameworks for neural networks play a key role in the development and application of deep learning methods. In this paper, we introduce the Chainer framework, which intends to provide a flexible, intuitive, and high performance means of implementing the full range of deep learning models needed by researchers and practitioners. Chainer provides acceleration using Graphics Processing Units with a familiar NumPy-like API through CuPy, supports general and dynamic models in Python through…
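The abstract above points to Chainer's define-by-run design: the computational graph is built as ordinary Python code runs, over a NumPy-compatible array API (CuPy on GPU). A minimal sketch of that style, with illustrative layer sizes and random data rather than anything taken from the paper:

    import numpy as np
    import chainer
    import chainer.functions as F
    import chainer.links as L

    class MLP(chainer.Chain):
        def __init__(self, n_units, n_out):
            super().__init__()
            with self.init_scope():
                self.l1 = L.Linear(None, n_units)  # input size inferred at first call
                self.l2 = L.Linear(None, n_out)

        def __call__(self, x):
            h = F.relu(self.l1(x))
            return self.l2(h)

    model = MLP(100, 10)
    # model.to_gpu()  # would move parameters to the GPU via CuPy, same API as on CPU

    x = np.random.rand(8, 784).astype(np.float32)
    t = np.random.randint(0, 10, size=8).astype(np.int32)
    loss = F.softmax_cross_entropy(model(x), t)  # the graph is built as this line executes
    loss.backward()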

Citations

Pixyz: a library for developing deep generative models
  • Masahiro Suzuki, Takaaki Kaneko, Y. Matsuo
  • Computer Science
  • ArXiv
  • 2021
TLDR
A new DGM library called Pixyz is proposed that is faster than existing probabilistic modeling languages in learning simple DGMs and can be used to implement complex DGMs in a simple and concise manner, which is difficult to do with existing libraries.
NNBlocks: a Blockly framework for AI computing
TLDR
A visual approach that lets users express artificial intelligence (AI) computations with block-based tools encapsulating AI knowledge is proposed, and a web-based NNBlocks framework that uses this approach to integrate with TVM is developed.
TensorX: Extensible API for Neural Network Model Design and Deployment
TLDR
TensorX is a Python library for prototyping, design, and deployment of complex neural network models in TensorFlow, aiming to make available high-level components like neural network layers that are, in effect, stateful functions, easy to compose and reuse.
Swift for TensorFlow: A portable, flexible platform for deep learning
TLDR
Deep learning platform Swift for TensorFlow combines a language-integrated automatic differentiation system and multiple Tensor implementations within a modern ahead-of-time compiled language oriented around mutable value semantics.
PrototypeML: A Neural Network Integrated Design and Development Environment
TLDR
This paper details the deep learning development deficiencies that drove the implementation of PrototypeML, and proposes a hybrid approach to resolve these issues without limiting network expressiveness or reducing code quality.
Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead
TLDR
This work summarizes and compares solutions for four leading execution platforms: CPU, GPU, FPGA, and ASIC. It describes the main state-of-the-art approaches, giving prominence to the last two since they offer greater design flexibility and the potential for high energy efficiency, especially for the inference process.
Einconv: Exploring Unexplored Tensor Decompositions for Convolutional Neural Networks
TLDR
A decomposition class specific to CNNs is characterized by adopting a flexible graphical notation; the class includes not only well-known CNN modules such as depthwise separable convolution layers and bottleneck layers, but also previously unknown modules with nonlinear activations.
Einconv: Exploring Unexplored Tensor Network Decompositions for Convolutional Neural Networks
Tensor decomposition methods are widely used for model compression and fast inference in convolutional neural networks (CNNs). Although many decompositions are conceivable, only CP decomposition and…
Scalable and Practical Natural Gradient for Large-Scale Deep Learning
TLDR
Scalable and Practical Natural Gradient Descent (SP-NGD) is proposed, a principled approach for training models that attains generalization performance similar to models trained with first-order optimization methods, but with accelerated convergence.
Kornia: an Open Source Differentiable Computer Vision Library for PyTorch
TLDR
Kornia is composed of a set of modules containing operators that can be inserted inside neural networks to train models to perform image transformations, camera calibration, epipolar geometry, and low-level image processing such as filtering and edge detection, all operating directly on high-dimensional tensor representations.
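As a rough illustration of the differentiable, tensor-level operators this summary refers to (the image size and filter parameters below are arbitrary, not drawn from the paper):

    import torch
    import kornia

    # a batch of images as a (B, C, H, W) float tensor
    img = torch.rand(1, 3, 64, 64, requires_grad=True)

    blurred = kornia.filters.gaussian_blur2d(img, (5, 5), (1.5, 1.5))
    edges = kornia.filters.sobel(blurred)  # edge magnitude, same shape as the input

    # the operators are differentiable, so gradients flow back to the input image
    edges.mean().backward()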

References

Showing 1-10 of 51 references
Caffe: Convolutional Architecture for Fast Feature Embedding
TLDR
Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.
MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems
TLDR
The API design and the system implementation of MXNet are described, and it is explained how embedding of both symbolic expression and tensor operation is handled in a unified fashion.
Large Scale Distributed Deep Networks
TLDR
This paper considers the problem of training a deep network with billions of parameters using tens of thousands of CPU cores and develops two algorithms for large-scale distributed training, Downpour SGD and Sandblaster L-BFGS, which increase the scale and speed of deep network training.
Scale out for large minibatch SGD: Residual network training on ImageNet-1K with improved accuracy and reduced time to train
TLDR
The challenges and novel solutions needed in order to train ResNet-50 in this large-scale environment are described, and the novel Collapsed Ensemble (CE) technique is introduced that allows for a 77.5% top-1 accuracy, similar to that of a ResNet-152, while training an unmodified ResNet-50 topology for the same fixed training budget.
An introduction to computational networks and the computational network toolkit (invited talk)
TLDR
The computational network toolkit (CNTK), an implementation of CN that supports both GPU and CPU, is introduced; the architecture and key components of CNTK, the command-line options for using CNTK, and the network definition and model editing language are described.
Mesh-TensorFlow: Deep Learning for Supercomputers
TLDR
Mesh-TensorFlow is introduced, a language for specifying a general class of distributed tensor computations, and used to implement an efficient data-parallel, model-parallel version of the Transformer sequence-to-sequence model, surpassing state-of-the-art results on the WMT'14 English-to-French translation task and the one-billion-word language modeling benchmark.
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
TLDR
The TensorFlow interface and an implementation of that interface that is built at Google are described, which has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields.
Sequence to Sequence Learning with Neural Networks
TLDR
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short-term dependencies between the source and the target sentence which made the optimization problem easier.
Horovod: fast and easy distributed deep learning in TensorFlow
TLDR
Horovod is an open source library that improves on both obstructions to scaling: it employs efficient inter-GPU communication via ring reduction and requires only a few lines of modification to user code, enabling faster, easier distributed training in TensorFlow.
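The "few lines of modification" typically amount to initializing Horovod, pinning one GPU per process, wrapping the optimizer, and broadcasting initial state. A sketch in the TensorFlow 1.x style the library was introduced with (model definition and training loop omitted):

    import tensorflow as tf
    import horovod.tensorflow as hvd

    hvd.init()  # one process per GPU, launched with horovodrun or mpirun

    # pin each process to a single GPU
    config = tf.ConfigProto()
    config.gpu_options.visible_device_list = str(hvd.local_rank())

    opt = tf.train.AdamOptimizer(0.001 * hvd.size())  # scale the learning rate with workers
    opt = hvd.DistributedOptimizer(opt)               # ring-allreduce of gradients

    # broadcast initial variables from rank 0 so all workers start identically
    hooks = [hvd.BroadcastGlobalVariablesHook(0)]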
Squeeze-and-Excitation Networks
TLDR
This work proposes a novel architectural unit, termed the “Squeeze-and-Excitation” (SE) block, that adaptively recalibrates channel-wise feature responses by explicitly modelling interdependencies between channels, and shows that these blocks can be stacked together to form SENet architectures that generalise extremely effectively across different datasets.
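For reference, the squeeze (global pooling), excitation (bottleneck MLP with a sigmoid gate), and channel-wise rescaling steps fit in a few lines; the Chainer-style sketch below is illustrative, not the authors' code:

    import chainer
    import chainer.functions as F
    import chainer.links as L

    class SEBlock(chainer.Chain):
        def __init__(self, channels, reduction=16):
            super().__init__()
            with self.init_scope():
                self.down = L.Linear(channels, channels // reduction)
                self.up = L.Linear(channels // reduction, channels)

        def __call__(self, x):
            n, c, h, w = x.shape
            s = F.average_pooling_2d(x, (h, w)).reshape(n, c)   # squeeze: global context per channel
            s = F.sigmoid(self.up(F.relu(self.down(s))))        # excitation: channel gates in [0, 1]
            scale = F.broadcast_to(s.reshape(n, c, 1, 1), x.shape)
            return x * scale                                    # recalibrate channel responses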