• Corpus ID: 143423650

On Expected Accuracy

@article{Irsoy2019OnEA,
  title={On Expected Accuracy},
  author={Ozan Irsoy},
  journal={ArXiv},
  year={2019},
  volume={abs/1905.00448}
}
  • Ozan Irsoy
  • Published 1 May 2019
  • Computer Science
  • ArXiv
We empirically investigate the (negative) expected accuracy as an alternative loss function to cross entropy (negative log likelihood) for classification tasks. Coupled with softmax activation, it has small derivatives over most of its domain, and is therefore hard to optimize. A modified, leaky version is evaluated on a variety of classification tasks, including digit recognition, image classification, sequence tagging and tree tagging, using a variety of neural architectures such as logistic… 
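Per the abstract, the expected accuracy of a softmax classifier on one example is simply the probability it assigns to the true class, so the loss is the negative of that probability (cross entropy is the negative log of the same quantity). A minimal PyTorch sketch of this plain, non-leaky variant follows; the exact form of the paper's leaky modification is not given in the excerpt above, so only the base loss is shown, and the name neg_expected_accuracy is illustrative rather than taken from the paper.

import torch
import torch.nn.functional as F

def neg_expected_accuracy(logits, targets):
    # Expected accuracy per example is the softmax probability of the true
    # class; the loss is its negative. (Cross entropy is the negative *log*
    # of the same probability, which keeps gradients large when it is small.)
    probs = F.softmax(logits, dim=-1)                          # (batch, classes)
    p_true = probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # prob. of true class
    return -p_true.mean()

# Drop-in replacement for F.cross_entropy(logits, targets):
logits = torch.randn(8, 10, requires_grad=True)
targets = torch.randint(0, 10, (8,))
neg_expected_accuracy(logits, targets).backward()

Because -p_true flattens out for confidently wrong predictions, its derivatives are small over most of the domain, which is exactly the optimization difficulty the abstract points to and the motivation for the leaky version.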

References

Showing 1-10 of 35 references

A Note on Margin-based Loss Functions in Classification

It is shown that the hinge loss is the tightest convex upper bound of the misclassification loss, and that Fisher consistency of margin-based loss functions often leads to consistency and rate-of-convergence (to the Bayes optimal risk) results under general conditions.
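As a small numeric check of the bound mentioned in this summary (not taken from the cited paper), the hinge loss max(0, 1 - m) upper-bounds the 0-1 misclassification loss 1[m <= 0] pointwise over margins m = y * f(x):

import numpy as np

margins = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])   # m = y * f(x)
hinge = np.maximum(0.0, 1.0 - margins)            # 3.0, 1.5, 1.0, 0.5, 0.0
zero_one = (margins <= 0).astype(float)           # 1.0, 1.0, 1.0, 0.0, 0.0
assert np.all(hinge >= zero_one)                  # convex upper bound holds pointwise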

Robust Truncated Hinge Loss Support Vector Machines

The robust truncated hinge loss SVM (RSVM) is proposed and shown to be more robust to outliers and to deliver more accurate classifiers using a smaller set of support vectors (SVs) than the standard SVM.

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

A Sentiment Treebank with fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences is introduced, presenting new challenges for sentiment compositionality, and the Recursive Neural Tensor Network is proposed to address them.

Deep Residual Learning for Image Recognition

This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.

Dropout: a simple way to prevent neural networks from overfitting

It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.

Bias, Variance, and Arcing Classifiers

This work explores two arcing algorithms, compares them to each other and to bagging, and tries to understand how arcing works; arcing is found to be more successful than bagging at variance reduction.

Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks

The Tree-LSTM is introduced, a generalization of LSTMs to tree-structured network topologies, which outperforms all existing systems and strong LSTM baselines on two tasks: predicting the semantic relatedness of two sentences and sentiment classification.

Bidirectional recurrent neural networks

It is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution.

Learning Multiple Layers of Features from Tiny Images

It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.

Loss Functions for Binary Class Probability Estimation and Classification: Structure and Applications

The use of proper scoring rules with novel criteria is illustrated for 1) Hand and Vinciotti's (2003) localized logistic regression and 2) interpretable classification trees.