Building Efficient Deep Neural Networks With Unitary Group Convolutions

  • Ritchie Zhao, Yuwei Hu, Jordan Dotzel, Christopher De Sa, Zhiru Zhang
  • Published 19 November 2018
  • Computer Science
  • 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
We propose unitary group convolutions (UGConvs), a building block for CNNs that composes a group convolution with unitary transforms in feature space to learn a richer set of representations than group convolution alone. Key result: HadaNets achieve similar accuracy to circulant networks with lower computational complexity, and better accuracy than ShuffleNets with the same number of parameters and floating-point multiplies.
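As an illustration of the idea (a sketch, not the authors' code; shapes and the power-of-two channel count are assumptions), the unitary feature-space transform in a HadaNet-style UGConv can be realized as a Sylvester Hadamard transform applied across channels between group convolutions:

```python
import numpy as np

def hadamard(n):
    # Sylvester construction; n must be a power of two.
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)  # orthonormal rows -> unitary transform

def hadamard_channel_mix(x):
    """Mix the channel dimension of x (C, H, W) with a Hadamard
    transform -- the unitary feature-space transform a HadaNet places
    between group convolutions, in place of a channel shuffle."""
    C = x.shape[0]
    return np.einsum('ij,jhw->ihw', hadamard(C), x)

x = np.random.randn(8, 4, 4)
y = hadamard_channel_mix(x)
```

Because the transform is unitary it preserves feature norms while spreading information across all channel groups, which is the representational advantage over a pure permutation (shuffle).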


Butterfly Transform: An Efficient FFT Based Neural Architecture Design

It is shown that extending the butterfly operations from the FFT algorithm to a general Butterfly Transform (BFT) can be beneficial in building an efficient block structure for CNN designs, and ShuffleNet-V2+BFT outperforms state-of-the-art architecture search methods MNasNet, FBNet and MobilenetV3 in the low FLOP regime.
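As a hedged sketch of the underlying structure (not the paper's implementation; block values and shapes are illustrative), a radix-2 butterfly network applies log2(n) stages of 2×2 blocks at doubling strides; fixing every block to the FFT's butterfly recovers the unnormalized Hadamard/DFT pattern at O(n log n) cost instead of O(n²):

```python
import numpy as np

def butterfly_apply(x, stages):
    """Apply a radix-2 butterfly network to a length-n vector.
    stages[s] is a list of n//2 (learnable) 2x2 blocks for stage s."""
    y = np.asarray(x, dtype=float).copy()
    n = y.shape[0]
    for s, blocks in enumerate(stages):
        stride = 1 << s
        out = y.copy()
        k = 0
        for i in range(n):
            if (i // stride) % 2 == 0:      # top wire of a butterfly pair
                j = i + stride
                w = blocks[k]; k += 1
                out[i] = w[0][0] * y[i] + w[0][1] * y[j]
                out[j] = w[1][0] * y[i] + w[1][1] * y[j]
        y = out
    return y
```

With all blocks set to [[1, 1], [1, -1]], the network computes the Sylvester Hadamard transform; letting the blocks be learned gives the generalization the paper calls BFT.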

AutoShuffleNet: Learning Permutation Matrices via an Exact Lipschitz Continuous Penalty in Deep Convolutional Neural Networks

This paper introduces an exact Lipschitz continuous non-convex penalty so that it can be incorporated in the stochastic gradient descent to approximate permutation at high precision and proves theoretically the exactness (error bounds) in recovering permutation matrices when the penalty function is zero.
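A minimal sketch of the penalty's shape (my reading of the paper's idea; the exact scaling and the handling of the doubly stochastic constraint are simplified here): an ℓ1-minus-ℓ2 term over rows and columns is Lipschitz continuous and vanishes exactly at permutation matrices:

```python
import numpy as np

def permutation_penalty(M):
    """Sum of (l1 norm minus l2 norm) over all rows and columns.
    For a nonnegative doubly stochastic M this is zero iff each row
    and column has a single nonzero entry, i.e. M is a permutation."""
    rows = np.abs(M).sum(axis=1) - np.linalg.norm(M, axis=1)
    cols = np.abs(M).sum(axis=0) - np.linalg.norm(M, axis=0)
    return float(rows.sum() + cols.sum())
```

Because the penalty stays Lipschitz continuous (unlike a hard rounding step), it can be added directly to the SGD objective, which is what lets the channel permutations be learned end to end.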

Substituting Convolutions for Neural Network Compression

This paper proposes a simple compression technique that is general, easy to apply, and requires minimal tuning, and is able to leverage a number of methods that have been developed as efficient alternatives to fully-connected layers for pointwise substitution for Pareto-optimal benefits in efficiency/accuracy.

Specializing CGRAs for Light-Weight Convolutional Neural Networks

  • Jungi Lee, Jongeun Lee
  • Computer Science
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
  • 2022
This article identifies a small set of architectural features on a baseline CGRA that enable high-performance mapping of depthwise convolution (DWC) and pointwise convolution (PWC) kernels, which are the most important building blocks in recent light-weight DNN models.



SPEC2: SPECtral SParsE CNN Accelerator on FPGAs

  • Yue Niu, Hanqing Zeng, V. Prasanna
  • Computer Science
    2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC)
  • 2019
A systematic pruning algorithm based on the Alternating Direction Method of Multipliers (ADMM), together with an optimized pipeline architecture on FPGA that has efficient random access into the sparse kernels and exploits multiple dimensions of parallelism in convolutional layers, achieves high inference throughput with extremely low computational complexity and negligible accuracy degradation.

RingCNN: Exploiting Algebraically-Sparse Ring Tensors for Energy-Efficient CNN-Based Computational Imaging

  • Chao-Tsung Huang
  • Computer Science
    2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA)
  • 2021
This paper proposes building CNN models on ring algebra, which properly defines multiplication, addition, and non-linearity for n-tuples; it defines and unifies several variants of ring algebras into a modeling framework, RingCNN, and compares them in terms of image quality and hardware complexity.

Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps

A family of matrices called kaleidoscope matrices (K-matrices) are introduced that provably capture any structured matrix with near-optimal space (parameter) and time (arithmetic operation) complexity that can be automatically learned within end-to-end pipelines to replace hand-crafted procedures.

Toward Hardware-Efficient Optical Neural Networks: Beyond FFT Architecture via Joint Learnability

  • Jiaqi Gu, Zheng Zhao, D. Pan
  • Computer Science
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
  • 2021
A novel optical microdisk-based convolutional neural network architecture with joint learnability is proposed as an extension beyond the Fourier transform and multi-layer perceptron, enabling hardware-aware ONN design space exploration with lower area cost, higher power efficiency, and better noise robustness.

XCAT - Lightweight Quantized Single Image Super-Resolution using Heterogeneous Group Convolutions and Cross Concatenation

Comparative experiments on slightly modified XCAT models show that the design choices proposed in this study allow the model to be deployed efficiently on mobile devices; the effectiveness of the proposed method is evaluated on standardized datasets.



IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks

It is empirically demonstrated that the combination of low-rank and sparse kernels boosts performance, and that the proposed approach is superior to the state of the art (IGCV2 and MobileNetV2) on image classification on CIFAR and ImageNet and object detection on COCO.

Aggregated Residual Transformations for Deep Neural Networks

On the ImageNet-1K dataset, it is empirically shown that, even under the restriction of maintained complexity, increasing cardinality improves classification accuracy and is more effective than going deeper or wider when capacity is increased.

Xception: Deep Learning with Depthwise Separable Convolutions

  • François Chollet
  • Computer Science
    2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2017
This work proposes a novel deep convolutional neural network architecture inspired by Inception, where Inception modules have been replaced with depthwise separable convolutions, and shows that this architecture, dubbed Xception, slightly outperforms Inception V3 on the ImageNet dataset, and significantly outperforms it on a larger image classification dataset.
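To make the factorization concrete (an illustrative sketch, not Xception itself; shapes and counts below are assumptions), a depthwise separable convolution replaces one k×k convolution over all channel pairs with a per-channel k×k depthwise step and a 1×1 pointwise step, cutting parameters from C_in·C_out·k² to C_in·k² + C_in·C_out:

```python
import numpy as np

def depthwise_separable_conv(x, dw, pw):
    """x: (C, H, W); dw: (C, k, k), one spatial filter per channel;
    pw: (C_out, C), the 1x1 pointwise channel-mixing weights."""
    C, H, W = x.shape
    k = dw.shape[1]
    Ho, Wo = H - k + 1, W - k + 1
    d = np.zeros((C, Ho, Wo))
    for c in range(C):                      # depthwise: no channel mixing
        for i in range(Ho):
            for j in range(Wo):
                d[c, i, j] = np.sum(x[c, i:i + k, j:j + k] * dw[c])
    return np.einsum('oc,chw->ohw', pw, d)  # pointwise: mix channels

# Parameter comparison at C_in = C_out = 256, k = 3:
standard = 256 * 256 * 3 * 3          # 589,824
separable = 256 * 3 * 3 + 256 * 256   # 67,840 -- roughly 8.7x fewer
```

The two steps together still map C channels to C_out channels with a k×k receptive field; what is given up is the ability to learn spatial and cross-channel correlations jointly, which the Xception results suggest costs little accuracy.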

IGCV2: Interleaved Structured Sparse Convolutional Neural Networks

Experimental results demonstrate the advantage on the balance among these three aspects compared to interleaved group convolutions and Xception, and competitive performance compared to other state-of-the-art architecture design methods.

Interleaved Structured Sparse Convolutional Neural Networks

This paper presents a modularized building block, IGC-V2 (interleaved structured sparse convolutions), which generalizes interleaved group convolutions to the product of more structured sparse kernels, further eliminating redundancy.

An Exploration of Parameter Redundancy in Deep Networks with Circulant Projections

We explore the redundancy of parameters in deep neural networks by replacing the conventional linear projection in fully-connected layers with a circulant projection.
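The efficiency of the circulant projection comes from the convolution theorem: a circulant matrix-vector product is a circular convolution, computable with FFTs in O(n log n) time while storing only one length-n vector instead of n² weights. A minimal sketch (illustrative, not the paper's code):

```python
import numpy as np

def circulant_matvec(c, x):
    """y = C @ x, where C is the circulant matrix whose first column
    is c (i.e. C[i, j] = c[(i - j) mod n]), via the convolution
    theorem: diagonalize C with the DFT and multiply pointwise."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))
```

A fully-connected layer using this projection thus trades a dense weight matrix for a single learned vector `c`, which is the parameter redundancy the paper exploits.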

Network In Network

With enhanced local modeling via the micro network, the proposed deep network structure NIN is able to utilize global average pooling over feature maps in the classification layer, which is easier to interpret and less prone to overfitting than traditional fully connected layers.
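Global average pooling as used here is simple enough to show directly (a sketch; the shapes are illustrative): each class-specific feature map is collapsed to a single score, so the classification layer needs no fully-connected parameters at all:

```python
import numpy as np

def global_average_pool(feature_maps):
    """Collapse each (H, W) map of a (C, H, W) tensor to its mean,
    yielding one score per channel/class -- no FC weights needed."""
    return feature_maps.mean(axis=(1, 2))
```

With one feature map per class, the pooled vector feeds softmax directly, which is why the layer is both parameter-free and less prone to overfitting than a dense classifier.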

Very Deep Convolutional Networks for Large-Scale Image Recognition

This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

Energy Efficient Hadamard Neural Networks

A novel energy-efficient model, the Binary Weight and Hadamard-transformed Image Network (BWHIN), is proposed as a combination of the Binary Weight Network and the Hadamard-transformed Image Network; it is observed that energy efficiency is achieved at a slight sacrifice in classification accuracy.

CirCNN: Accelerating and Compressing Deep Neural Networks Using Block-Circulant Weight Matrices

  • Caiwen Ding, Siyu Liao, Bo Yuan
  • Computer Science
    2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO)
  • 2017
The CirCNN architecture is proposed: a universal DNN inference engine that can be implemented on various hardware/software platforms with a configurable network architecture (layer type, size, scales, etc.), in which the FFT serves as the key computing kernel, ensuring universal and small-footprint implementations.
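A hedged sketch of the block-circulant idea (my illustration; the block size and layout are assumptions, not CirCNN's hardware mapping): the weight matrix is partitioned into k×k circulant blocks, each represented by a single length-k vector and applied through FFTs:

```python
import numpy as np

def block_circulant_matvec(blocks, x, k):
    """y = W @ x, where W consists of p x q circulant blocks of size
    k x k and blocks[i][j] is the first column of block (i, j).
    Storage drops from p*q*k*k weights to p*q*k, and each block
    costs O(k log k) via the FFT -- the key computing kernel."""
    p, q = len(blocks), len(blocks[0])
    y = np.zeros(p * k)
    for i in range(p):
        acc = np.zeros(k)
        for j in range(q):
            seg = x[j * k:(j + 1) * k]
            acc += np.real(np.fft.ifft(np.fft.fft(blocks[i][j]) * np.fft.fft(seg)))
        y[i * k:(i + 1) * k] = acc
    return y
```

Choosing k trades compression against accuracy: larger blocks store fewer parameters per weight but constrain the matrix more, which is the knob CirCNN exposes across layer types and sizes.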