TanhExp: A Smooth Activation Function with High Convergence Speed for Lightweight Neural Networks

@article{Liu2021TanhExpAS,
  title={TanhExp: A Smooth Activation Function with High Convergence Speed for Lightweight Neural Networks},
  author={Xinyu Liu and Xiaoguang Di},
  journal={ArXiv},
  year={2020},
  volume={abs/2003.09855}
}
Lightweight or mobile neural networks used for real-time computer vision tasks contain fewer parameters than normal networks, which leads to constrained performance. In this work, we propose a novel activation function named Tanh Exponential Activation Function (TanhExp), which can significantly improve the performance of these networks on image classification tasks. TanhExp is defined as f(x) = x · tanh(e^x). We demonstrate the simplicity, efficiency, and robustness of TanhExp on various…
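As a quick illustration of the definition above, here is a minimal NumPy sketch of TanhExp (the function name and sample inputs are illustrative, not taken from the paper):

import numpy as np

def tanhexp(x):
    """TanhExp activation: f(x) = x * tanh(exp(x))."""
    return x * np.tanh(np.exp(x))

# For positive inputs TanhExp is nearly the identity; for negative inputs it stays bounded.
x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(tanhexp(x))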
TanhSoft - a family of activation functions combining Tanh and Softplus
TLDR
This work proposes a family of novel activation functions, namely TanhSoft, with four undetermined hyper-parameters, of the form tanh(αx + βe^{γx}) · ln(δ + e^x), and tunes these hyper-parameters to obtain activation functions that are shown to outperform several well-known activation functions.
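Taking the quoted formula at face value, a hedged NumPy sketch of the TanhSoft family follows; the default hyper-parameter values are placeholders, not the tuned values reported in that paper:

import numpy as np

def tanhsoft(x, alpha=0.0, beta=1.0, gamma=1.0, delta=1.0):
    """TanhSoft family: f(x) = tanh(alpha*x + beta*exp(gamma*x)) * ln(delta + exp(x))."""
    return np.tanh(alpha * x + beta * np.exp(gamma * x)) * np.log(delta + np.exp(x))

print(tanhsoft(np.array([-2.0, 0.0, 2.0])))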
TeLU: A New Activation Function for Deep Learning
TLDR
It is shown that the activation functions TeLU and learnable TeLU give better results than other popular activation functions, including ReLU, Mish, and TanhExp, using current architectures tested on computer-vision datasets.
Smooth activations and reproducibility in deep networks
TLDR
A new family of activations, Smooth ReLU (SmeLU), is proposed, designed to give better trade-offs while keeping the mathematical expression simple, and thus training fast and implementation cheap; superior accuracy-reproducibility trade-offs are demonstrated with smooth activations, SmeLU in particular.
Activation ensemble generative adversarial network transfer learning for image classification
TLDR
Experimental results on five benchmark datasets show that when only a few samples are available for training a target task, leveraging data from other related datasets via AE-GAN can significantly improve image-classification performance with a small set of samples.
Compacting Deep Neural Networks for Internet of Things: Methods and Applications
TLDR
This article categorizes compacting-DNN technologies into three major types: 1) network model compression; 2) knowledge distillation (KD); and 3) modification of network structures.
Vector encoded bounding box regression for detecting remote-sensing objects with anchor-free methods
TLDR
A novel convolutional neural network architecture for detecting objects in high-resolution remote-sensing images is proposed; it is entirely anchor-free and achieves the most favourable detection accuracy with no extra trainable parameters added.
Image and Graphics Technologies and Applications: 15th Chinese Conference, IGTA 2020, Beijing, China, September 19, 2020, Revised Selected Papers
TLDR
In the proposed algorithm, a densely connected dilated-convolution network module with different dilation rates is added to enhance the network's ability to generate image features at different scale levels; a channel attention mechanism is introduced to adaptively select the generated image features, which improves the quality of the generated image.
Neural Density-Distance Fields
TLDR
This paper proposes the Neural Density-Distance Field (NeDDF), a novel 3D representation that reciprocally constrains the distance and density fields and extends the distance-field formulation to shapes with no explicit boundary surface, enabling explicit conversion from the distance field to the density field.
An Integrated Change Detection Method Based on Spectral Unmixing and the CNN for Hyperspectral Imagery
Hyperspectral remote-sensing images (HSIs) include rich spectral information that can be very beneficial for change detection (CD) technology. Due to the existence of many mixed pixels, pixel-wise…
Lightweight and efficient neural network with SPSA attention for wheat ear detection
TLDR
A lightweight and efficient wheat ear detector with Shuffle Polarized Self-Attention (SPSA) is proposed in this paper, which achieves superior detection performance compared with other state-of-the-art approaches.
…

References

Showing 1-10 of 41 references
Mish: A Self Regularized Non-Monotonic Neural Activation Function
TLDR
A novel neural activation function called Mish is introduced, which is similar to Swish while providing a boost in performance, and its simplicity of implementation makes it easy for researchers and developers to use Mish in their neural network models.
Empirical Evaluation of Rectified Activations in Convolutional Network
TLDR
The experiments suggest that incorporating a non-zero slope for the negative part in rectified activation units can consistently improve results, and they cast doubt on the common belief that sparsity is the key to ReLU's good performance.
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
TLDR
This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified linear unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.
ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
TLDR
An extremely computation-efficient CNN architecture named ShuffleNet is introduced, which is designed specially for mobile devices with very limited computing power (e.g., 10-150 MFLOPs), to greatly reduce computation cost while maintaining accuracy.
Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
TLDR
The "exponential linear unit" (ELU) which speeds up learning in deep neural networks and leads to higher classification accuracies and significantly better generalization performance than ReLUs and LReLUs on networks with more than 5 layers.
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
TLDR
This work introduces two simple global hyper-parameters that efficiently trade off between latency and accuracy and demonstrates the effectiveness of MobileNets across a wide range of applications and use cases, including object detection, fine-grained classification, face attributes, and large-scale geo-localization.
Deep Learning with S-Shaped Rectified Linear Activation Units
TLDR
A novel S-shaped rectified linear activation unit (SReLU) is proposed to learn both convex and non-convex functions, imitating the multiple function forms given by two fundamental laws in psychophysics and neuroscience, namely the Weber-Fechner law and the Stevens law.
Mish: A Self Regularized Non-Monotonic Activation Function
TLDR
Mish, a novel self-regularized non-monotonic activation function that can be mathematically defined as f(x) = x · tanh(softplus(x)), is proposed and validated experimentally on several well-known benchmarks against the best combinations of architectures and activation functions.
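For comparison with TanhExp, a minimal NumPy sketch of the Mish definition quoted above (function name and test inputs are illustrative only):

import numpy as np

def mish(x):
    """Mish activation: f(x) = x * tanh(softplus(x)), where softplus(x) = ln(1 + exp(x))."""
    return x * np.tanh(np.log1p(np.exp(x)))

print(mish(np.array([-3.0, 0.0, 3.0])))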
Searching for Activation Functions
TLDR
The experiments show that the best discovered activation function, f(x) = x · sigmoid(βx), which is named Swish, tends to work better than ReLU on deeper models across a number of challenging datasets.
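A minimal NumPy sketch of Swish as defined above; the default β here is only a placeholder (β = 1 gives the SiLU special case):

import numpy as np

def swish(x, beta=1.0):
    """Swish activation: f(x) = x * sigmoid(beta * x)."""
    return x / (1.0 + np.exp(-beta * x))

print(swish(np.array([-3.0, 0.0, 3.0])))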
Self-Normalizing Neural Networks
TLDR
Self-normalizing neural networks (SNNs) are introduced to enable high-level abstract representations, and it is proved that activations close to zero mean and unit variance that are propagated through many network layers will converge towards zero mean and unit variance, even under the presence of noise and perturbations.
…