Corpus ID: 235435901

Gradient Forward-Propagation for Large-Scale Temporal Video Modelling

@inproceedings{Malinowski2021GradientFF,
  title={Gradient Forward-Propagation for Large-Scale Temporal Video Modelling},
  author={Mateusz Malinowski and Dimitrios Vytiniotis and Grzegorz Swirszcz and Viorica Patraucean and Jo{\~a}o F. M. Carreira},
  booktitle={CVPR},
  year={2021}
}
How can neural networks be trained on large-volume temporal data efficiently? To compute the gradients required to update parameters, backpropagation blocks computations until the forward and backward passes are completed. For temporal signals, this introduces high latency and hinders real-time learning. It also creates a coupling between consecutive layers, which limits model parallelism and increases memory consumption. In this paper, we build upon Sideways, which avoids blocking by…
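As a rough illustration of the blocking issue described in the abstract, the sketch below contrasts a standard backpropagation step, in which no parameter can be updated until the full forward and backward passes have completed, with a decoupled per-layer update that consumes a stale activation from an earlier frame, loosely in the spirit of Sideways. The two-layer numpy model, the squared-error loss, and all names are illustrative assumptions, not the authors' implementation.

# Toy sketch (not the authors' code): blocking backprop vs. a decoupled,
# per-layer update that reuses a stale activation from an earlier frame.
import numpy as np

rng = np.random.default_rng(0)
LR = 1e-2

def init_params():
    # Assumed sizes for a tiny two-layer model; purely illustrative.
    return {"W1": rng.normal(scale=0.1, size=(8, 8)),
            "W2": rng.normal(scale=0.1, size=(8, 1))}

def blocking_backprop_step(params, x, target):
    """Standard backprop: the forward pass of the next frame cannot start
    until this full forward + backward pass has finished."""
    h = np.tanh(x @ params["W1"])
    y = h @ params["W2"]
    dy = y - target                              # squared-error gradient
    dW2 = h.T @ dy
    dh = (dy @ params["W2"].T) * (1.0 - h**2)    # gradient flows back through tanh
    dW1 = x.T @ dh
    params["W2"] -= LR * dW2
    params["W1"] -= LR * dW1

def decoupled_step(params, x, target, stale_h):
    """Decoupled sketch: layer 2 updates from a stale activation computed on
    an earlier frame, so layers need not wait on each other."""
    y = stale_h @ params["W2"]
    dy = y - target
    params["W2"] -= LR * (stale_h.T @ dy)
    h = np.tanh(x @ params["W1"])                # fresh forward computation
    dh = (dy @ params["W2"].T) * (1.0 - h**2)    # local signal for layer 1
    params["W1"] -= LR * (x.T @ dh)
    return h                                     # stale activation for the next frame

p_block, p_async = init_params(), init_params()
stale_h = np.zeros((1, 8))
for t in range(5):                               # a short stream of "frames"
    x = rng.normal(size=(1, 8))
    target = np.ones((1, 1))
    blocking_backprop_step(p_block, x, target)
    stale_h = decoupled_step(p_async, x, target, stale_h)

In the decoupled variant each layer only needs locally available quantities at every step, which is the property that removes the synchronisation point between forward and backward passes; the price is that updates are computed from slightly outdated activations.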

References

Showing 1-10 of 75 references
GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
TLDR: GPipe is introduced, a pipeline parallelism library that allows scaling any network that can be expressed as a sequence of layers by pipelining different sub-sequences of layers on separate accelerators, resulting in almost linear speedup when a model is partitioned across multiple accelerators.
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
TLDR: I3D models considerably improve upon the state of the art in action classification, reaching 80.2% on HMDB-51 and 97.9% on UCF-101 after pre-training on Kinetics; a new Two-Stream Inflated 3D ConvNet based on 2D ConvNet inflation is introduced.
Identity Mappings in Deep Residual Networks
TLDR: The propagation formulations behind the residual building blocks suggest that the forward and backward signals can be directly propagated from one block to any other block when identity mappings are used as the skip connections and after-addition activation.
WaveNet: A Generative Model for Raw Audio
TLDR: WaveNet, a deep neural network for generating raw audio waveforms, is introduced; it is shown that it can be efficiently trained on data with tens of thousands of samples per second of audio, and that it can be employed as a discriminative model, returning promising results for phoneme recognition.
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild
TLDR: This work introduces UCF101, currently the largest dataset of human actions, and provides baseline action recognition results on this new dataset using a standard bag-of-words approach, with an overall performance of 44.5%.
HMDB: A large video database for human motion recognition
TLDR: This paper uses the largest action video database to date, with 51 action categories containing around 7,000 manually annotated clips extracted from sources ranging from digitized movies to YouTube, to evaluate two representative computer vision systems for action recognition and to explore the robustness of these methods under various conditions.
A Practical Sparse Approximation for Real Time Recurrent Learning
TLDR: The Sparse n-step Approximation (SnAp) to the RTRL influence matrix is introduced, which keeps only entries that are nonzero within n steps of the recurrent core and substantially outperforms other RTRL approximations with comparable costs, such as Unbiased Online Recurrent Optimization.
Approximating Real-Time Recurrent Learning with Random Kronecker Factors
TLDR: KF-RTRL is shown to be an unbiased and memory-efficient online learning algorithm that captures long-term dependencies and almost matches the performance of TBPTT on real-world tasks, demonstrated by training Recurrent Highway Networks on a synthetic string memorization task and on the Penn TreeBank task, respectively.
Temporal Reasoning in Videos Using Convolutional Gated Recurrent Units
TLDR: It is found that temporal order matters more for the recently introduced 20BN Something-Something dataset, where the task of fine-grained action recognition requires the model to perform temporal reasoning.
Kickback Cuts Backprop's Red-Tape: Biologically Plausible Credit Assignment in Neural Networks
TLDR: This paper derives Kickback, a new credit assignment algorithm for nonparametric regression that is significantly simpler than Backprop, provides a sufficient condition for Kickback to follow error gradients, and shows that Kickback matches Backprop's performance on real-world regression benchmarks.