Inception Convolution with Efficient Dilation Search

  • Jie Liu, Chuming Li, Feng Liang, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang, Dong Xu
  • Published 25 December 2020
  • Computer Science
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
As a variant of standard convolution, a dilated convolution can control effective receptive fields and handle large scale variance of objects without introducing additional computational costs. To fully explore the potential of dilated convolution, we proposed a new type of dilated convolution (referred to as inception convolution), where the convolution operations have independent dilation patterns among different axes, channels and layers. To develop a practical method for learning complex… 
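
To make the idea in the abstract concrete, here is a minimal pure-Python sketch of a convolution whose dilation is chosen independently per axis and per output channel. It is an illustration under stated assumptions, not the authors' implementation, and it does not cover the dilation search itself. The names `effective_kernel_size`, `dilated_conv2d`, and `channel_dilations` are hypothetical, and the dilation values are picked by hand.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered along one axis by a dilated kernel: d * (k - 1) + 1."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv2d(image, kernel, dilation_h, dilation_w):
    """Unpadded 2D cross-correlation with an independent dilation per axis.

    image and kernel are lists of rows (H x W and kh x kw).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - effective_kernel_size(kh, dilation_h) + 1
    out_w = len(image[0]) - effective_kernel_size(kw, dilation_w) + 1
    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    # Kernel taps are spread out by the per-axis dilation.
                    acc += kernel[i][j] * image[y + i * dilation_h][x + j * dilation_w]
            row.append(acc)
        output.append(row)
    return output


# Hypothetical pattern: one (dilation_h, dilation_w) pair per output channel.
channel_dilations = [(1, 1), (1, 2), (2, 1), (2, 2)]

image = [[r * 7 + c for c in range(7)] for r in range(7)]  # 7x7 toy input
kernel = [[1] * 3 for _ in range(3)]  # 3x3 kernel: 9 weights for every dilation

outputs = [dilated_conv2d(image, kernel, dh, dw) for dh, dw in channel_dilations]
for (dh, dw), out in zip(channel_dilations, outputs):
    field = (effective_kernel_size(3, dh), effective_kernel_size(3, dw))
    print((dh, dw), "receptive field:", field, "output size:", (len(out), len(out[0])))
```

Every channel uses the same nine weights, so the receptive field grows with the dilation while the parameter count and the multiply-accumulate count per output position stay the same. Padding is omitted here for brevity, which is why the output sizes differ between channels. A practical layer would pad each channel so that all outputs share one spatial size.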

Lightened Context Extraction Network for Object Detection

  • Huang Jiaxuan, Yu Lei, Yin Junping
  • Computer Science
  • 2022 19th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP)
  • 2022
This work follows part of AC-FPN (Attention-guided Context Feature Pyramid Network), stacking the feature maps after every dilated convolution and lightening the model with Channel Group Average, without introducing new parameters.

DPFNet: A Dual-branch Dilated Network with Phase-aware Fourier Convolution for Low-light Image Enhancement

This work proposes a novel module using the Fourier coefficients, which can recover high-quality texture details under the constraint of semantics in the frequency phase and supplement the spatial domain to alleviate the loss of detail caused by frequent downsampling.

Neural Architecture Adaptation for Object Detection by Searching Channel Dimensions and Mapping Pre-trained Parameters

This paper proposes to adapt both the micro- and macro-architecture by searching for specific operations and the number of layers, in addition to the output channel dimensions of each block, to optimize the given backbone for detection purposes.

Ship Detection via Dilated Rate Search and Attention-Guided Feature Representation

A dilated convolution parameter search strategy is presented to adaptively select the dilated rate for the multi-branch extraction architecture, adaptively obtaining context information of different receptive fields without sacrificing the image resolution.

GCCN: Global Context Convolutional Network

GCCN has significantly improved on the accuracy of state-of-the-art prototypical and matching networks by up to 30% in different few-shot learning scenarios.

SDWNet: A Straight Dilated Network with Wavelet Transformation for Image Deblurring

Qualitative and quantitative evaluations of real and synthetic datasets show that the deblurring method is comparable to existing algorithms in terms of performance with much lower training requirements.

BN-NAS: Neural Architecture Search with Batch Normalization

BN-NAS can significantly reduce the time required for model training and evaluation in NAS; a BN-based indicator that predicts subnet performance at a very early training stage is proposed for fast evaluation.

A Real-Time Bridge Crack Detection Method Based on an Improved Inception-Resnet-v2 Structure

This work proposes an end-to-end bridge crack detection model based on a convolutional neural network that combines the advantages of Inception convolution and residual networks, broadening the network width and alleviating the training problem of deep networks.

GLiT: Neural Architecture Search for Global and Local Image Transformer

This work introduces the first Neural Architecture Search (NAS) method to find a better transformer architecture for image recognition and introduces a locality module that models the local correlations in images explicitly at lower computational cost.

ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search

A novel neural architecture search (NAS) method, termed ViP-NAS, that searches networks at both the spatial and temporal levels for fast online video pose estimation; it is the first to search for temporal feature fusion and automatic computation allocation in videos.



Cascade R-CNN: High Quality Object Detection and Instance Segmentation

A multi-stage object detection architecture, the Cascade R-CNN, composed of a sequence of detectors trained with increasing IoU thresholds, which significantly improves high-quality detection on generic and specific object datasets, including VOC, KITTI, CityPerson, and WiderFace.

Mask R-CNN

This work presents a conceptually simple, flexible, and general framework for object instance segmentation that outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners.

Feature Pyramid Networks for Object Detection

This paper exploits the inherent multi-scale, pyramidal hierarchy of deep convolutional networks to construct feature pyramids with marginal extra cost and achieves state-of-the-art single-model results on the COCO detection benchmark without bells and whistles.

Aggregated Residual Transformations for Deep Neural Networks

On the ImageNet-1K dataset, it is empirically shown that even under the restricted condition of maintaining complexity, increasing cardinality is able to improve classification accuracy and is more effective than going deeper or wider when the authors increase the capacity.

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.

Computation Reallocation for Object Detection

This work presents CR-NAS (Computation Reallocation Neural Architecture Search), a novel hierarchical search procedure that can learn computation reallocation strategies across different feature resolutions and spatial positions directly on the target detection dataset.

Efficient Neural Architecture Transformation Search in Channel-Level for Object Detection

This paper proposes a novel neural architecture search strategy at the channel level instead of the path level and devises a search space specifically targeting object detection, from which an architecture can be discovered that adapts a network designed for image classification to the task of object detection.

End-to-End Object Detection with Transformers

This work presents a new method that views object detection as a direct set prediction problem, and demonstrates accuracy and run-time performance on par with the well-established and highly optimized Faster R-CNN baseline on the challenging COCO object detection dataset.

DARTS: Differentiable Architecture Search

The proposed algorithm excels in discovering high-performance convolutional architectures for image classification and recurrent architectures for language modeling, while being orders of magnitude faster than state-of-the-art non-differentiable techniques.

Deep Residual Learning for Image Recognition

This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.