SimLoss: Class Similarities in Cross Entropy

@article{Kobs2020SimLossCS,
  title={SimLoss: Class Similarities in Cross Entropy},
  author={Konstantin Kobs and Michael Steininger and Albin Zehe and Florian Lautenschlager and Andreas Hotho},
  journal={ArXiv},
  year={2020},
  volume={abs/2003.03182}
}
One common loss function in neural network classification tasks is Categorical Cross Entropy (CCE), which punishes all misclassifications equally. However, classes often have an inherent structure. For instance, classifying an image of a rose as "violet" is better than as "truck". We introduce SimLoss, a drop-in replacement for CCE that incorporates class similarities along with two techniques to construct such matrices from task-specific knowledge. We test SimLoss on Age Estimation and Image… 
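As a rough illustration of the idea in the abstract above, the sketch below weights the predicted probabilities by a class-similarity matrix before taking the logarithm, so that probability mass placed on classes similar to the target is penalized less. The names `simloss` and `ordinal_similarity`, the `reduction` parameter, and the particular matrix construction for ordered classes are assumptions made for this example; the text above does not spell out these details.

```python
import math


def simloss(probs, targets, sim):
    """Similarity-weighted cross entropy (illustrative sketch).

    probs   -- list of per-sample probability lists (each sums to 1)
    targets -- list of integer class labels
    sim     -- square matrix, sim[i][j] in [0, 1], with sim[i][i] == 1
    """
    total = 0.0
    for p, y in zip(probs, targets):
        # Credit the model for mass on the target class and, partially,
        # for mass on classes the matrix marks as similar to it.
        weighted = sum(s * q for s, q in zip(sim[y], p))
        total -= math.log(weighted)
    return total / len(targets)


def ordinal_similarity(num_classes, reduction=0.5):
    """Hypothetical similarity matrix for ordered classes (assumption):
    similarity decays geometrically with the distance between class indices."""
    return [
        [reduction ** abs(i - j) for j in range(num_classes)]
        for i in range(num_classes)
    ]


# Usage: target class 0, prediction concentrated on a near vs. a far class.
S = ordinal_similarity(3, reduction=0.5)
near = simloss([[0.1, 0.8, 0.1]], [0], S)
far = simloss([[0.1, 0.1, 0.8]], [0], S)
```

With an identity matrix as `sim`, the weighted sum collapses to the probability of the target class and the function reduces to ordinary categorical cross entropy, which is what makes a loss of this shape a drop-in replacement.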

The Tree Loss: Improving Generalization with Many Classes

This work introduces the tree loss as a drop-in replacement for the cross entropy loss, which re-parameterizes the parameter matrix in order to guarantee that semantically similar classes will have similar parameter vectors.

Fine-grained TLS Services Classification with Reject Option

Malicious Software Detection Based on Improved Convolution Neural Network

  • Tianyue Liu, Hongqi Zhang, Haixia Long
  • Computer Science
    2022 2nd International Conference on Frontiers of Electronics, Information and Computation Technologies (ICFEICT)
  • 2022
BIR-CNN, an improved CNN-based deep learning method that combines Batch Normalization with an Inception-Residual network, is proposed for the detection of malicious software.

References


A hierarchical loss and its problems when classifying non-hierarchically

This work defines a metric that, inter alia, can penalize failure to distinguish between a sheepdog and a skyscraper more than failure to distinguish between a sheepdog and a poodle.

Hierarchical loss for classification

A metric is defined that is based on an ultrametric tree associated with any given organization of a classifier's classes into a semantically meaningful hierarchy; inter alia, it can penalize failure to distinguish between a sheepdog and a skyscraper more than failure to distinguish between a sheepdog and a poodle.

Age Progression/Regression by Conditional Adversarial Autoencoder

A conditional adversarial autoencoder is proposed that learns a face manifold on which traversal realizes smooth age progression and regression simultaneously; the appealing performance and flexibility of the proposed framework are demonstrated by comparison with the state of the art and ground truth.

Zero-Shot Learning by Convex Combination of Semantic Embeddings

A simple method is proposed for constructing an image embedding system from any existing image classifier and a semantic word embedding model that contains the $n$ class labels in its vocabulary; it outperforms state-of-the-art methods on the ImageNet zero-shot learning task.
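One way to read "convex combination of semantic embeddings" is sketched below: the classifier's most probable training labels are looked up in the word embedding model, their vectors are averaged with renormalized probabilities as weights, and an unseen label is chosen by cosine similarity. The function names, the `top_t` cutoff, and the toy vectors are assumptions for illustration only.

```python
import math


def conse_embedding(probs, label_vecs, top_t):
    """Embed an image as a convex combination of label word vectors.

    probs      -- classifier probabilities over the training labels
    label_vecs -- one word vector per training label (same order as probs)
    top_t      -- how many of the most probable labels to combine (assumption)
    """
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_t]
    z = sum(probs[i] for i in top)  # renormalize so the weights sum to 1
    dim = len(label_vecs[0])
    return [sum(probs[i] / z * label_vecs[i][d] for i in top) for d in range(dim)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_label(embedding, candidates):
    """Return the candidate label whose vector is most cosine-similar."""
    return max(candidates, key=lambda name: cosine(embedding, candidates[name]))


# Usage with toy 2-d vectors (made-up values, not real word embeddings).
emb = conse_embedding([0.6, 0.3, 0.1], [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]], top_t=2)
guess = nearest_label(emb, {"unseen_a": [1.0, 0.4], "unseen_b": [0.0, 1.0]})
```

Because the weights are non-negative and sum to one, the image embedding always lies inside the convex hull of the selected label vectors, so no extra training is needed beyond the existing classifier and embedding model.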

Training Convolutional Networks with Noisy Labels

An extra noise layer is introduced into the network that adapts the network outputs to match the noisy label distribution; its parameters can be estimated as part of the training process and involve simple modifications to current training infrastructures for deep networks.
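A noise layer of this kind can be pictured as a linear map applied to the network's clean class probabilities. The sketch below assumes a column-stochastic matrix `q` whose entry `q[j][i]` is the probability that true class `i` is observed as noisy label `j`; the function name and the example values are hypothetical, and the matrix is fixed here rather than learned.

```python
def apply_noise_layer(clean_probs, q):
    """Map clean class probabilities to noisy-label probabilities.

    clean_probs -- probabilities over the true classes (sums to 1)
    q           -- assumed noise matrix, q[j][i] = P(noisy = j | true = i),
                   so every column of q sums to 1
    """
    num_true = len(clean_probs)
    return [
        sum(q[j][i] * clean_probs[i] for i in range(num_true))
        for j in range(len(q))
    ]


# Usage: class 0 is mislabeled as 1 with probability 0.1,
# class 1 is mislabeled as 0 with probability 0.2 (made-up values).
q = [[0.9, 0.2],
     [0.1, 0.8]]
noisy = apply_noise_layer([0.5, 0.5], q)
```

Training would compare `noisy` against the observed labels, while the clean probabilities remain available for prediction; with an identity matrix the layer leaves the outputs unchanged.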

Learning Multiple Layers of Features from Tiny Images

It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.

Generalization and Parameter Estimation in Feedforward Nets: Some Experiments

An empirical study is presented of the relation between the number of parameters (weights) in a feedforward net and its generalization performance, and of the application of cross-validation techniques to prevent overfitting.

Adam: A Method for Stochastic Optimization

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
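The moment-based update summarized above can be sketched in a few lines. The function name `adam_step` and the list-based parameter representation are choices made for this example; the default hyperparameters are the commonly quoted ones and should be treated as assumptions here.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on a list of parameters.

    theta, grad -- current parameters and their gradients
    m, v        -- running estimates of the first and second moments
    t           -- 1-based step counter, used for bias correction
    """
    # Exponential moving averages of the gradient and its square.
    m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, grad)]
    v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(v, grad)]
    # Correct the bias caused by initializing m and v at zero.
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    theta = [
        p - lr * mh / (math.sqrt(vh) + eps)
        for p, mh, vh in zip(theta, m_hat, v_hat)
    ]
    return theta, m, v


# Usage: a few steps on f(x) = x^2, whose gradient is 2x.
x, m, v = [1.0], [0.0], [0.0]
for step in range(1, 4):
    x, m, v = adam_step(x, [2 * x[0]], m, v, step, lr=0.1)
```

On the very first step the bias-corrected ratio is approximately the sign of the gradient, so each parameter moves by roughly `lr` regardless of the gradient's scale.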

Ordinal Regression with Multiple Output CNN for Age Estimation

This paper proposes an end-to-end learning approach to ordinal regression problems using a deep Convolutional Neural Network, which simultaneously conducts feature learning and regression modeling and achieves state-of-the-art performance on both the MORPH and AFAD datasets.

Incremental Algorithms for Hierarchical Classification

A new hierarchical loss function, the H-loss, is introduced, implementing the simple intuition that additional mistakes in the subtree of a mistaken class should not be charged for, based on a probabilistic data model introduced in earlier work.
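The intuition stated above, that mistakes below an already mistaken class are not charged again, can be sketched as follows. The function name `h_loss`, the dictionary-based tree encoding, and the uniform cost of 1 per charged node are assumptions for this example.

```python
def h_loss(pred, truth, parent):
    """Count mistaken nodes whose ancestors were all predicted correctly.

    pred, truth -- dicts mapping node name to a 0/1 membership label
    parent      -- dict mapping node name to its parent (None for top-level nodes)
    """
    loss = 0
    for node in pred:
        if pred[node] == truth[node]:
            continue
        # Walk up the tree; skip the charge if any ancestor is already wrong.
        ancestor = parent[node]
        charged = True
        while ancestor is not None:
            if pred[ancestor] != truth[ancestor]:
                charged = False
                break
            ancestor = parent[ancestor]
        if charged:
            loss += 1  # uniform cost per node (assumption)
    return loss


# Usage: a small hypothetical taxonomy with top-level nodes "a" and "b".
parent = {"a": None, "b": None, "a1": "a", "a2": "a"}
truth = {"a": 1, "a1": 1, "a2": 0, "b": 0}
missed_subtree = {"a": 0, "a1": 0, "a2": 0, "b": 0}
wrong_leaves = {"a": 1, "a1": 0, "a2": 1, "b": 0}
```

Here `missed_subtree` is wrong at both `a` and `a1` but is charged only once, at `a`, whereas `wrong_leaves` gets the parent right and is charged separately for each mistaken child.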