Corpus ID: 146121430

Effectiveness of Self Normalizing Neural Networks for Text Classification

Authors: Avinash Madasu, Vijjini Anvesh Rao
Self Normalizing Neural Networks (SNN), proposed for Feed Forward Neural Networks (FNN), outperform regular FNN architectures in various machine learning tasks. [...] Our experiments demonstrate that SCNN achieves results comparable to a standard CNN model with significantly fewer parameters. Furthermore, it also outperforms a CNN with an equal number of parameters.
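The self-normalizing property rests on the SELU activation. As a minimal sketch (not the authors' code; the constants are the published SELU values), the activation can be written as:

```python
import math

# SELU constants from Klambauer et al. (2017); with these exact values,
# zero mean and unit variance is a stable fixed point of the activations.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805

def selu(x: float) -> float:
    """Scaled exponential linear unit, applied element-wise."""
    if x > 0:
        return SELU_SCALE * x
    return SELU_SCALE * SELU_ALPHA * (math.exp(x) - 1.0)
```

In a self-normalizing convolutional model, an activation like this would replace ReLU after each convolution, typically paired with alpha dropout rather than standard dropout so the normalization is preserved.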
3 Citations
Sequential Learning of Convolutional Features for Effective Text Classification
This paper presents an experimental study on the fundamental blocks of CNNs in text categorization and proposes the Sequential Convolutional Attentive Recurrent Network (SCARN), a model that achieves better performance than various equally large deep CNN and LSTM architectures.
Prediction of Soybean Yield using Self-normalizing Neural Networks
The results show that, given sufficiently large training data, SNN can achieve a lower prediction error than traditional machine learning methods and other deep learning techniques such as Batch Normalization.
A Sentiwordnet Strategy for Curriculum Learning in Sentiment Analysis
The ideas of curriculum learning, driven by SentiWordNet in a sentiment analysis setting, are applied, and the effectiveness of the proposed strategy is presented.

References

Self-Normalizing Neural Networks
Self-normalizing neural networks (SNNs) are introduced to enable high-level abstract representations, and it is proved that activations close to zero mean and unit variance that are propagated through many network layers will converge towards zero mean and unit variance, even in the presence of noise and perturbations.
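This fixed-point claim is easy to check numerically. The following sketch (pure Python; the layer width and depth are arbitrary choices, and weights are drawn from N(0, 1/n) as the SNN initialization prescribes) propagates a random input through several SELU layers and observes that the activation mean stays near 0 and the variance near 1:

```python
import math
import random

random.seed(0)

# Published SELU constants
ALPHA, SCALE = 1.6732632423543772, 1.0507009873554805

def selu(x):
    return SCALE * x if x > 0 else SCALE * ALPHA * (math.exp(x) - 1.0)

def selu_layer(inputs, n_out):
    # Fully connected layer with weights ~ N(0, 1/n_in), the
    # initialization the SNN paper pairs with SELU.
    std = 1.0 / math.sqrt(len(inputs))
    return [selu(sum(random.gauss(0.0, std) * v for v in inputs))
            for _ in range(n_out)]

x = [random.gauss(0.0, 1.0) for _ in range(256)]  # zero-mean, unit-variance input
for _ in range(8):                                # propagate through 8 layers
    x = selu_layer(x, 256)

mean = sum(x) / len(x)
var = sum((v - mean) ** 2 for v in x) / len(x)
# mean stays close to 0 and var close to 1 even at depth 8
```

With a non-self-normalizing activation (e.g. plain ReLU under the same initialization), the mean drifts positive and the variance shrinks with depth, which is precisely what the convergence proof rules out for SELU.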
Recurrent Convolutional Neural Networks for Text Classification
A recurrent convolutional neural network is introduced for text classification without human-designed features; it captures contextual information as far as possible when learning word representations, which may introduce considerably less noise than traditional window-based neural networks.
Convolutional Neural Networks for Sentence Classification
The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification, and are proposed to allow for the use of both task-specific and static vectors.
Very Deep Convolutional Networks for Text Classification
This work presents a new architecture (VDCNN) for text processing which operates directly at the character level and uses only small convolutions and pooling operations, and shows that the performance of this model increases with depth.
ImageNet classification with deep convolutional neural networks
A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Gradient-based learning applied to document recognition
This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task; convolutional neural networks are shown to outperform all other techniques.
Dropout: a simple way to prevent neural networks from overfitting
It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
Understanding the difficulty of training deep feedforward neural networks
The objective is to understand why standard gradient descent from random initialization does so poorly with deep neural networks, in order to explain recent relative successes and help design better algorithms in the future.
Sentiment Analysis of Tweets in Malayalam Using Long Short-Term Memory Units and Convolutional Neural Nets
The current paper is the first attempt to perform sentiment analysis of tweets in the Malayalam language using LSTM and CNN, and presents 10-fold cross-validation results.