Corpus ID: 207847555

An Algorithm for Routing Capsules in All Domains

@article{Heinsen2019AnAF,
  title={An Algorithm for Routing Capsules in All Domains},
  author={Franz A. Heinsen},
  journal={ArXiv},
  year={2019},
  volume={abs/1911.00792}
}
Building on recent work on capsule networks, we propose a new, general-purpose form of "routing by agreement" that activates output capsules in a layer as a function of their net benefit to use and net cost to ignore input capsules from earlier layers. To illustrate the usefulness of our routing algorithm, we present two capsule networks that apply it in different domains: vision and language. The first network achieves new state-of-the-art accuracy of 99.1% on the smallNORB visual recognition…
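The central idea in the abstract (activate an output capsule according to the net benefit of using, minus the net cost of ignoring, the input capsules that vote for it) can be pictured with a toy sketch. This is not the paper's algorithm: the vote shapes, the agreement score, and the single scoring pass below are illustrative assumptions only.

```python
# Toy illustration of "net benefit to use vs. net cost to ignore" routing.
# NOT the paper's algorithm: shapes, scores, and the single pass are assumptions.
import torch

def toy_routing(votes, input_activations):
    """votes: [n_in, n_out, d] predictions from input capsules for output capsules;
    input_activations: [n_in] how active each input capsule is."""
    w = input_activations[:, None, None]                    # [n_in, 1, 1]
    pose = (w * votes).sum(0) / w.sum(0)                    # [n_out, d] weighted mean vote
    # Agreement of each vote with the consensus pose (higher = closer).
    agreement = -((votes - pose[None]) ** 2).sum(-1)        # [n_in, n_out]
    # Benefit of using an input here vs. cost of leaving its activation unused.
    benefit = (input_activations[:, None] * torch.sigmoid(agreement)).sum(0)        # [n_out]
    cost = (input_activations[:, None] * (1 - torch.sigmoid(agreement))).sum(0)     # [n_out]
    return pose, torch.sigmoid(benefit - cost)              # output poses, activations

poses, acts = toy_routing(torch.randn(8, 4, 16), torch.rand(8))
```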

Citations

BERT Multilingual and Capsule Network for Arabic Sentiment Analysis
Sentiment Analysis (SA) is one of the fast-growing research tasks in Natural Language Processing (NLP), which aims to identify the attitude of online users or communities regarding a specific…
Comparison of the Classification Performance of Capsule Networks and Convolutional Neural Networks under Overlap and Deformation Conditions
Convolutional neural networks (CNN) and capsule networks (CapsNet) are important deep learning architectures. In this article, the classification performance of the CNN and CapsNet architectures on the MNIST and Fashion-MNIST datasets under overlap and deformation conditions…

References

SHOWING 1-10 OF 24 REFERENCES
Dynamic Routing Between Capsules
TLDR
It is shown that a discriminatively trained, multi-layer capsule system achieves state-of-the-art performance on MNIST and is considerably better than a convolutional net at recognizing highly overlapping digits.
Matrix capsules with EM routing
Attention is All you Need
TLDR
A new simple network architecture, the Transformer, based solely on attention mechanisms and dispensing with recurrence and convolutions entirely, is proposed; it generalizes well to other tasks, as shown by applying it successfully to English constituency parsing with both large and limited training data.
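The core operation of the Transformer summarized above is scaled dot-product attention, softmax(QK^T / sqrt(d_k))V. A minimal single-head sketch, with masking and multi-head projections omitted:

```python
# Minimal scaled dot-product attention, the core operation of the Transformer.
import math
import torch

def scaled_dot_product_attention(q, k, v):
    d_k = q.shape[-1]
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)   # [..., n_q, n_k]
    return torch.softmax(scores, dim=-1) @ v            # [..., n_q, d_v]

out = scaled_dot_product_attention(torch.randn(2, 5, 64),
                                   torch.randn(2, 5, 64),
                                   torch.randn(2, 5, 64))   # -> [2, 5, 64]
```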
Searching for Activation Functions
TLDR
The experiments show that the best discovered activation function, $f(x) = x \cdot \text{sigmoid}(\beta x)$, which is named Swish, tends to work better than ReLU on deeper models across a number of challenging datasets.
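A minimal implementation of the formula quoted in the summary above; beta = 1 recovers the commonly used default (also known as SiLU), while the paper also considers a trainable beta:

```python
# Swish activation: f(x) = x * sigmoid(beta * x).
import torch

def swish(x: torch.Tensor, beta: float = 1.0) -> torch.Tensor:
    return x * torch.sigmoid(beta * x)

print(swish(torch.tensor([-2.0, 0.0, 2.0])))   # smooth, non-monotonic for x < 0
```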
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TLDR
A new language representation model, BERT, is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; it can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
TLDR
This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.
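The PReLU described above behaves like ReLU but learns the slope applied to negative inputs; a one-line sketch (PyTorch also ships this as torch.nn.PReLU):

```python
# PReLU: identity for x >= 0, learned slope a for x < 0.
import torch

def prelu(x: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
    return torch.where(x >= 0, x, a * x)

print(prelu(torch.tensor([-1.5, 0.0, 2.0]), torch.tensor(0.25)))  # [-0.375, 0.0, 2.0]
```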
Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
TLDR
A Sentiment Treebank that includes fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences and presents new challenges for sentiment compositionality is introduced, along with the Recursive Neural Tensor Network.
Language Models are Unsupervised Multitask Learners
TLDR
It is demonstrated that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
Visualizing and Measuring the Geometry of BERT
TLDR
This paper describes qualitative and quantitative investigations of one particularly effective model, BERT, and finds evidence of a fine-grained geometric representation of word senses in both attention matrices and individual word embeddings.
XLNet: Generalized Autoregressive Pretraining for Language Understanding
TLDR
XLNet is proposed, a generalized autoregressive pretraining method that enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and overcomes the limitations of BERT thanks to its autoregressive formulation.
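The permutation-based objective summarized above can be written (paraphrasing the XLNet paper's formulation) as maximizing the expected log-likelihood over factorization orders:

$$\max_\theta \; \mathbb{E}_{\mathbf{z} \sim \mathcal{Z}_T}\left[\sum_{t=1}^{T} \log p_\theta\!\left(x_{z_t} \mid \mathbf{x}_{\mathbf{z}_{<t}}\right)\right],$$

where $\mathcal{Z}_T$ denotes the set of all permutations of the index sequence $[1, \dots, T]$.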