Maximum Class Separation as Inductive Bias in One Matrix

Authors: Tejaswi Kasarla, Gertjan J. Burghouts, Max van Spengler, Elise van der Pol, Rita Cucchiara, Pascal Mettes
Maximizing the separation between classes constitutes a well-known inductive bias in machine learning and a pillar of many traditional algorithms. By default, deep networks are not equipped with this inductive bias and therefore many alternative solutions have been proposed through differential optimization. Current approaches tend to optimize classification and separation jointly: aligning inputs with class vectors and separating class vectors angularly. This paper proposes a simple alternative… 

AI-based detection of DNS misuse for network security

This paper presents two AI-based techniques for detecting and classifying Domain Generation Algorithms: a feature-based one that leverages classic machine learning algorithms, and a featureless one based on deep learning. Both are specifically intended to aid in this task.



Dissecting Supervised Constrastive Learning

This work proves, under mild assumptions, that both the cross-entropy and the supervised contrastive loss attain their minimum once the representations of each class collapse to the vertices of a regular simplex inscribed in a hypersphere.
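To make the simplex geometry concrete, here is a minimal sketch that builds K unit vectors with equal pairwise angles by centering and rescaling one-hot vectors. The function name `simplex_vertices` and the choice to embed the vertices in R^K (rather than R^(K-1)) are illustrative assumptions, not taken from the paper.

```python
import math

def simplex_vertices(k):
    """Return k unit vectors in R^k forming a regular simplex.

    Hypothetical helper: each vertex is a one-hot vector with the
    centroid (1/k, ..., 1/k) subtracted, then rescaled to unit length.
    """
    if k < 2:
        raise ValueError("a simplex needs at least two vertices")
    # The centered one-hot vector has norm sqrt(1 - 1/k).
    norm = math.sqrt(1.0 - 1.0 / k)
    return [
        [((1.0 if i == j else 0.0) - 1.0 / k) / norm for j in range(k)]
        for i in range(k)
    ]

def cosine(u, v):
    """Cosine similarity of two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)
```

Every pair of distinct vertices then has cosine similarity -1/(K-1), the most negative value that K unit vectors can share pairwise, which is why this configuration is described as maximally separated.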

Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss

This paper proposes a theoretically principled label-distribution-aware margin (LDAM) loss, motivated by minimizing a margin-based generalization bound. It replaces the standard cross-entropy objective during training and can be combined with prior strategies for class-imbalanced training such as re-weighting or re-sampling.
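As a rough illustration of a label-distribution-aware margin, the sketch below subtracts a per-class margin from the true-class logit before a standard cross-entropy. The n^(-1/4) scaling of the margin is recalled from the LDAM paper and is not stated in the summary above; the names `ldam_margins`, `ldam_loss`, and the `max_margin` parameter are hypothetical.

```python
import math

def ldam_margins(class_counts, max_margin=0.5):
    """Per-class margins proportional to n_j ** -0.25 (assumed scaling).

    Margins are rescaled so the rarest class receives `max_margin`,
    a hypothetical tuning parameter.
    """
    raw = [n ** -0.25 for n in class_counts]
    scale = max_margin / max(raw)
    return [r * scale for r in raw]

def ldam_loss(logits, label, margins):
    """Cross-entropy after subtracting the margin from the true-class logit."""
    adjusted = [z - margins[j] if j == label else z
                for j, z in enumerate(logits)]
    # Numerically stable log-sum-exp.
    m = max(adjusted)
    log_sum = m + math.log(sum(math.exp(a - m) for a in adjusted))
    return log_sum - adjusted[label]
```

With all margins set to zero the function reduces to plain cross-entropy, so the margin can be read as an extra penalty that is larger for rarer classes.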

Ring Loss: Convex Feature Normalization for Face Recognition

This work motivates and presents Ring loss, a simple and elegant feature normalization approach for deep networks designed to augment standard loss functions such as Softmax. It applies soft normalization, gradually learning to constrain the feature norm to the scaled unit circle while preserving convexity, which leads to more robust features.
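The soft normalization described above can be sketched as a penalty on the gap between each feature's L2 norm and a target radius. This is a minimal sketch under assumptions: the function name `ring_loss` and the default `weight` are illustrative, and the radius is passed in as a fixed number here, whereas in the paper it is, to my understanding, a learned parameter.

```python
import math

def ring_loss(features, radius, weight=0.01):
    """Average squared gap between feature norms and a target radius.

    Computes weight / (2 * m) * sum_i (||f_i||_2 - radius) ** 2,
    where m is the number of feature vectors in the batch.
    """
    norms = [math.sqrt(sum(v * v for v in f)) for f in features]
    m = len(features)
    return weight / (2.0 * m) * sum((n - radius) ** 2 for n in norms)
```

The term is meant to be added to a primary loss such as softmax cross-entropy, so the features are pulled toward a common norm rather than projected onto it.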

ArcFace: Additive Angular Margin Loss for Deep Face Recognition

This paper presents arguably the most extensive experimental evaluation against all recent state-of-the-art face recognition methods on ten face recognition benchmarks, and shows that ArcFace consistently outperforms the state of the art and can be easily implemented with negligible computational overhead.
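For readers unfamiliar with the additive angular margin in the title, the following sketch computes ArcFace-style logits for a single example: embeddings and class weights are L2-normalized, the margin is added to the angle of the target class only, and all cosines are scaled. The function names and the default `margin` and `scale` values are illustrative assumptions. Unlike practical implementations, this sketch does not handle the case where the angle plus the margin exceeds pi.

```python
import math

def _unit(v):
    """Return v rescaled to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def arcface_logits(embedding, class_weights, label, margin=0.5, scale=64.0):
    """Scaled cosine logits with an angular margin on the target class."""
    e = _unit(embedding)
    logits = []
    for j, w in enumerate(class_weights):
        cos_theta = sum(a * b for a, b in zip(e, _unit(w)))
        # Clamp to guard acos against floating-point overshoot.
        cos_theta = max(-1.0, min(1.0, cos_theta))
        if j == label:
            # Additive angular margin: cos(theta + m) for the true class.
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    return logits
```

The returned logits would then be fed to an ordinary softmax cross-entropy. Because the margin lowers only the target-class logit, the network has to shrink the true-class angle further to reach the same loss.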

All You Need is Beyond a Good Init: Exploring Better Solution for Training Extremely Deep Convolutional Neural Networks with Orthonormality and Modulation

A regularizer variant that enforces orthonormality among different filter banks can alleviate vanishing or exploding gradients, and a backward error modulation mechanism is designed based on the quasi-isometry assumption between two consecutive parametric layers.

Improving Calibration for Long-Tailed Recognition

Motivated by the fact that predicted probability distributions of classes are highly related to the numbers of class instances, this work proposes label-aware smoothing to deal with different degrees of over-confidence for classes and improve classifier learning.

Large-Margin Softmax Loss for Convolutional Neural Networks

This paper proposes a generalized large-margin softmax (L-Softmax) loss that explicitly encourages intra-class compactness and inter-class separability between learned features. It can both adjust the desired margin and help avoid overfitting.

Deep Residual Learning for Image Recognition

This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.

ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias

Experiments on ImageNet as well as downstream tasks demonstrate the superiority of ViTAE over the baseline transformer and concurrent works. With its intrinsic locality inductive bias (IB), ViTAE is able to learn local features and global dependencies collaboratively.