• Corpus ID: 2097418

Dynamic Filter Networks

@article{Jia2016DynamicFN,
  title={Dynamic Filter Networks},
  author={Xu Jia and Bert De Brabandere and Tinne Tuytelaars and Luc Van Gool},
  journal={ArXiv},
  year={2016},
  volume={abs/1605.09673}
}
In a traditional convolutional layer, the learned filters stay fixed after training. […] Moreover, multiple such layers can be combined, e.g. in a recurrent architecture. We demonstrate the effectiveness of the dynamic filter network on the tasks of video and stereo prediction, and reach state-of-the-art performance on the moving MNIST dataset with a much smaller model. By visualizing the learned filters, we illustrate that the network has picked up flow information by only looking at unlabelled…
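The abstract describes filters that are generated on the fly, conditioned on the input, rather than fixed after training. A minimal NumPy sketch of the single-filter ("dynamic convolutional layer") variant follows; the toy filter-generating function, edge padding, and shapes are illustrative assumptions, not the paper's learned network:

```python
import numpy as np

def dynamic_filter_layer(x, filter_gen, k=3):
    """Apply a k x k filter produced by `filter_gen` *from the input itself*
    to x of shape (H, W). Unlike a standard conv layer, the filter is not
    fixed after training but conditioned on the input."""
    f = filter_gen(x)                  # sample-specific filter, shape (k, k)
    f = f / f.sum()                    # normalize so weights sum to 1
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")   # edge padding is an assumption here
    H, W = x.shape
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + k, j:j + k] * f)
    return out

# Toy stand-in for the learned filter-generating network: it always emits a
# one-hot "shift left" filter, the kind of translation filter the paper
# reports emerging on moving MNIST.
def toy_filter_gen(x):
    f = np.zeros((3, 3))
    f[1, 2] = 1.0
    return f

x = np.arange(16, dtype=float).reshape(4, 4)
y = dynamic_filter_layer(x, toy_filter_gen)  # shifts the image one pixel left
```

In a real dynamic filter network, `toy_filter_gen` would be a small CNN trained jointly with the rest of the model, and in the dynamic *local* filtering variant it would emit a different filter per pixel.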

Figures and Tables from this paper: [not captured in this extract]

Citations

Dynamic Steerable Blocks in Deep Residual Networks
TLDR
This work investigates the generalized notion of frames, designed with image properties in mind, as an alternative parametrization, and shows that frame-based ResNets and DenseNets consistently improve performance on CIFAR-10+ while having additional pleasant properties such as steerability.
Decoupled Dynamic Filter Networks
TLDR
Inspired by recent advances in attention, the Decoupled Dynamic Filter (DDF) is proposed: it decouples a depth-wise dynamic filter into spatial and channel dynamic filters, simultaneously addressing the content-agnostic nature of standard convolution and the heavy computation of existing dynamic filters.
Pixel-Adaptive Convolutional Neural Networks
TLDR
A pixel-adaptive convolution (PAC) operation is proposed: a simple yet effective modification of standard convolution in which the filter weights are multiplied by a spatially varying kernel that depends on learnable, local pixel features.
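The PAC summary above describes reweighting a shared filter by a spatially varying kernel computed from local features. A minimal NumPy sketch under that description; the Gaussian-of-feature-differences kernel, edge padding, and all names are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def pixel_adaptive_conv(x, w, feats, k=3):
    """PAC-style operation: the shared k x k filter w is reweighted at each
    pixel by a kernel computed from local guidance features `feats`.
    x and feats both have shape (H, W)."""
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    fp = np.pad(feats, pad, mode="edge")
    H, W = x.shape
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            fwin = fp[i:i + k, j:j + k]
            # spatially varying kernel: Gaussian of feature differences,
            # so neighbours with similar features get more weight
            kadapt = np.exp(-0.5 * (fwin - feats[i, j]) ** 2)
            out[i, j] = np.sum(xp[i:i + k, j:j + k] * w * kadapt)
    return out

# With constant guidance features the adaptive kernel is all ones,
# so the operation reduces to a standard convolution with filter w.
x = np.ones((4, 4))
w = np.full((3, 3), 1.0 / 9.0)
feats = np.zeros((4, 4))
out = pixel_adaptive_conv(x, w, feats)
```

The reduction to standard convolution under constant features is the key design property: the op stays a strict generalization of the layer it replaces.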
Dynamic Steerable Frame Networks
TLDR
This paper lays out the foundations of Frame-based convolutional networks and Dynamic Steerable Frame Networks while illustrating their advantages for continuously transforming features and data-efficient learning.
Learning to generate filters for convolutional neural networks
TLDR
A method is proposed to generate sample-specific filters for convolutional layers in the forward pass; evaluations show that it improves the classification accuracy of the baseline model.
Adaptive Convolutions with Per-pixel Dynamic Filter Atom
TLDR
This paper proposes a method for decomposing filters, adapted to each spatial position, over dynamic filter atoms generated by a lightweight network from local features, and preserves the appealing properties of conventional convolutions, namely translation equivariance and parameter efficiency.
Deformable Kernel Networks for Joint Image Filtering
TLDR
This paper proposes a CNN architecture and its efficient implementation, called the deformable kernel network (DKN), that outputs sets of neighbors and the corresponding weights adaptively for each pixel, and shows that the weighted averaging process with sparsely sampled 3 × 3 kernels outperforms the state of the art by a significant margin.
A Simple and Light-Weight Attention Module for Convolutional Neural Networks
TLDR
This work studies the effect of attention in convolutional neural networks and presents the idea in a simple self-contained module, called Bottleneck Attention Module (BAM), which efficiently produces the attention map along two factorized axes, channel and spatial with negligible overheads.
Dynamic Sampling Convolutional Neural Networks
TLDR
The Dynamic Sampling Convolutional Neural Network (DSCNN) is proposed, in which position-specific kernels learn not only from the current position but also from multiple sampled neighbour regions; this efficiently alleviates the overfitting caused by having far more parameters than normal CNNs.
Integrating Multiple Receptive Fields Through Grouped Active Convolution
  • Yunho Jeon, Junmo Kim
  • Computer Science
    IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2021
TLDR
This paper provides a detailed analysis of the previously proposed convolution unit and shows that it is an efficient representation of a sparse weight convolution and suggests a depthwise ACU, which can replace the existing convolutions.

References

Showing 1–10 of 30 references
Spatial Transformer Networks
TLDR
This work introduces a new learnable module, the Spatial Transformer, which explicitly allows the spatial manipulation of data within the network, and can be inserted into existing convolutional architectures, giving neural networks the ability to actively spatially transform feature maps.
DeepStereo: Learning to Predict New Views from the World's Imagery
TLDR
This work presents a novel deep architecture that performs new view synthesis directly from pixels, trained on a large number of posed image sets, and is the first to apply deep learning to the problem of new view synthesis from sets of real-world, natural imagery.
Conditioned Regression Models for Non-blind Single Image Super-Resolution
TLDR
This paper proposes conditioned regression models (including convolutional neural networks and random forests) that can effectively exploit the additional kernel information during both training and inference, and empirically shows that they can handle scenarios where the blur kernel differs for each image.
Deep multi-scale video prediction beyond mean square error
TLDR
This work trains a convolutional network to generate future frames given an input sequence and proposes three different and complementary feature learning strategies: a multi-scale architecture, an adversarial training method, and an image gradient difference loss function.
Deep Residual Learning for Image Recognition
TLDR
This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
Spatio-temporal video autoencoder with differentiable memory
TLDR
One direct application of the proposed framework in weakly-supervised semantic segmentation of videos through label propagation using optical flow is presented, using as temporal decoder a robust optical flow prediction module together with an image sampler serving as built-in feedback loop.
Unsupervised Learning of Video Representations using LSTMs
TLDR
This work uses Long Short Term Memory networks to learn representations of video sequences and evaluates the representations by finetuning them for a supervised learning problem - human action recognition on the UCF-101 and HMDB-51 datasets.
Identity Mappings in Deep Residual Networks
TLDR
The propagation formulations behind the residual building blocks suggest that the forward and backward signals can be directly propagated from one block to any other block, when using identity mappings as the skip connections and after-addition activation.
FlowNet: Learning Optical Flow with Convolutional Networks
TLDR
This paper constructs CNNs which are capable of solving the optical flow estimation problem as a supervised learning task, and proposes and compares two architectures: a generic architecture and another one including a layer that correlates feature vectors at different image locations.
Generative Adversarial Nets
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G.