# Unbiased Loss Functions for Extreme Classification With Missing Labels

```bibtex
@article{Schultheis2020UnbiasedLF,
  title   = {Unbiased Loss Functions for Extreme Classification With Missing Labels},
  author  = {Erik Schultheis and Mohammadreza Qaraei and Priyanshu Gupta and Rohit Babbar},
  journal = {ArXiv},
  year    = {2020},
  volume  = {abs/2007.00237}
}
```

The goal in extreme multi-label classification (XMC) is to tag an instance with a small subset of relevant labels from an extremely large set of possible labels. In addition to the computational burden arising from the large number of training instances, features, and labels, problems in XMC face two statistical challenges: (i) a large number of 'tail labels', those that occur very infrequently, and (ii) missing labels, as it is virtually impossible to manually assign every relevant label…
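For a per-label decomposable loss, the standard missing-labels model assumes a truly relevant label is observed with a known propensity `p` while irrelevant labels are never observed as positive; an unbiased estimator then reweights only the observed-positive branch so its expectation recovers the true loss. The sketch below illustrates this construction (the function names, and the choice of binary cross-entropy as the base loss, are illustrative assumptions, not the paper's code):

```python
import numpy as np

def bce(score, y):
    # Binary cross-entropy, where `score` is a predicted probability.
    return -(y * np.log(score) + (1.0 - y) * np.log(1.0 - score))

def unbiased_loss(score, y_obs, propensity, loss=bce):
    """Unbiased per-label loss under the missing-labels model.

    Assumes a relevant label is observed positive with probability
    `propensity`, and irrelevant labels are never observed positive.
    """
    l_pos = loss(score, 1.0)
    l_neg = loss(score, 0.0)
    # Observed positive: reweighted so that, averaged over the observation
    # process, the estimator equals the loss on the true labels.
    g_pos = (l_pos - (1.0 - propensity) * l_neg) / propensity
    # Observed negative: may be a missing positive, but the estimator only
    # needs the plain negative loss here for unbiasedness to hold.
    return np.where(y_obs == 1, g_pos, l_neg)
```

Unbiasedness can be checked directly: for a truly relevant label, the expectation `p * g_pos + (1 - p) * l_neg` collapses to the true positive loss, and a truly irrelevant label is always observed negative, so the estimator returns exactly the negative loss.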

## 2 Citations

Needle in a Haystack: Label-Efficient Evaluation under Extreme Class Imbalance

- Computer Science, KDD
- 2021

This paper develops a framework for online evaluation based on adaptive importance sampling, establishes strong consistency and a central limit theorem for the resulting performance estimates, and instantiates the framework with worked examples that leverage Dirichlet-tree models.

Prediction in the Presence of Response-Dependent Missing Labels

- Computer Science, Environmental Science, 2021 IEEE Statistical Signal Processing Workshop (SSP)
- 2021

This work develops a new methodology and non-convex algorithm, P(ositive) U(nlabeled) O(ccurrence) M(agnitude) M(ixture), which jointly estimates the occurrence and detection likelihood of positive samples, utilizing prior knowledge of the detection mechanism.

## References

Showing 1–10 of 41 references

Data scarcity, robustness and extreme multi-label classification

- Computer Science, Machine Learning
- 2019

It is shown that minimizing the Hamming loss with appropriate regularization surpasses many state-of-the-art methods for tail-label detection in XMC, and the spectral properties of label graphs are investigated to provide novel insights into the conditions governing the performance of the Hamming-loss-based one-vs-rest scheme.

Extreme Multi-label Loss Functions for Recommendation, Tagging, Ranking & Other Missing Label Applications

- Computer Science, KDD
- 2016

The choice of the loss function is critical in extreme multi-label learning where the objective is to annotate each data point with the most relevant subset of labels from an extremely large label…

Large-scale Multi-label Learning with Missing Labels

- Computer Science, ICML
- 2014

This paper studies the multi-label problem in a generic empirical risk minimization (ERM) framework and develops techniques that exploit the structure of specific loss functions - such as the squared loss function - to obtain efficient algorithms.

DiSMEC: Distributed Sparse Machines for Extreme Multi-label Classification

- Computer Science, WSDM
- 2017

This work presents DiSMEC, a large-scale distributed framework for learning one-versus-rest linear classifiers coupled with explicit capacity control to restrict model size, and conducts extensive empirical evaluation on publicly available real-world datasets with up to 670,000 labels.

Sparse Local Embeddings for Extreme Multi-label Classification

- Computer Science, NIPS
- 2015

The SLEEC classifier is developed for learning a small ensemble of local distance-preserving embeddings which can accurately predict infrequently occurring (tail) labels, and can make significantly more accurate predictions than state-of-the-art methods, including both embedding-based and tree-based methods.

Does Tail Label Help for Large-Scale Multi-Label Learning

- Computer Science, IJCAI
- 2018

A low-complexity large-scale multi-label learning algorithm is developed with the goal of enabling fast prediction and compact models by trimming tail labels adaptively, without sacrificing much of the predictive performance of state-of-the-art approaches.

A no-regret generalization of hierarchical softmax to extreme multi-label classification

- Computer Science, NeurIPS
- 2018

It is shown that PLTs are a no-regret multi-label generalization of HSM when precision@$k$ is used as a model evaluation metric, and it is proved that the pick-one-label heuristic, a reduction technique from multi-label to multi-class classification routinely used along with HSM, is not consistent in general.

PD-Sparse : A Primal and Dual Sparse Approach to Extreme Multiclass and Multilabel Classification

- Computer Science, ICML
- 2016

A Fully-Corrective Block-Coordinate Frank-Wolfe (FC-BCFW) algorithm is proposed that exploits both primal and dual sparsity to achieve a complexity sublinear in the number of primal and dual variables, and achieves significantly higher accuracy than existing approaches to extreme classification.

Bonsai: diverse and shallow trees for extreme multi-label classification

- Computer Science, Machine Learning
- 2020

A suite of algorithms, called Bonsai, is developed, which generalizes the notion of label representation in XMC and partitions the labels in the representation space to learn diverse, shallow trees, achieving the best of both worlds.

Stochastic Negative Mining for Learning with Large Output Spaces

- Computer Science, AISTATS
- 2019

This work defines a family of surrogate losses and shows that they are calibrated and convex under certain conditions on the loss parameters and data distribution, thereby establishing a statistical and analytical basis for using these losses.