• Corpus ID: 12540362

Aggressive Sampling for Multi-class to Binary Reduction with Applications to Text Classification

Bikash Joshi, Massih-Reza Amini, Ioannis Partalas, Franck Iutzeler, Yury Maximov
We address the problem of multi-class classification in the case where the number of classes is very large. […] We show that this strategy does not alter the consistency of the empirical risk minimization principle defined over the double sample reduction. Experiments are carried out on DMOZ and Wikipedia collections with 10,000 to 100,000 classes, where we show the efficiency of the proposed approach in terms of training and prediction time, memory consumption, and predictive performance with…

MEMOIR: Multi-class Extreme Classification with Inexact Margin

The impact of computing an approximate margin using approximate nearest neighbor (ANN) search structures combined with locality-sensitive hashing (LSH) is studied, showing that the proposed approach is highly competitive with state-of-the-art approaches on time, memory, and performance measures.

Correlation-Guided Representation for Multi-Label Text Classification

The task is viewed as a correlation-guided text representation problem: an attention-based two-step framework is proposed to integrate text information and label semantics by jointly learning words and labels in the same space to capture high-order label-label correlations as well as context-label correlations.

Softmax Tree: An Accurate, Fast Classifier When the Number of Classes Is Large

Although learning accurate tree-based models has proven difficult in the past, this work is able to overcome this by using a variation of a recent algorithm, tree alternating optimization (TAO), which is both more accurate in prediction and faster in inference, as shown in NLP problems having from one thousand to one hundred thousand classes.

Scalable algorithms for large-scale machine learning problems : Application to multiclass classification and asynchronous distributed optimization. (Algorithmes d'apprentissage pour les grandes masses de données : Application à la classification multi-classes et à l'optimisation distribuée asynchron

A scalable method for multiclass classification with a very large number of classes is introduced, together with detailed theoretical and empirical analyses, and an asynchronous framework for performing distributed optimization is also introduced.

Survey on Multi-Output Learning

Taking inspiration from big data, the four Vs of multi-output learning (volume, velocity, variety, and veracity) are characterized, and the ways in which they both benefit and bring challenges to multi-output learning are examined.

Few-Shot Visual Classification Using Image Pairs With Binary Transformation

A novel visual classification method using image pairs with binary transformation (IPBT) is proposed, which classifies images from few-shot samples by concatenating the representations of the two images along with their similarity.

Unbiased scalable softmax optimization

This paper proposes the first unbiased algorithms for maximizing the softmax likelihood whose work per iteration is independent of the number of classes and datapoints (and no extra work is required at the end of each epoch).

Learning over no-Preferred and Preferred Sequence of items for Robust Recommendation

This paper proposes a theoretically supported sequential strategy for training a large-scale Recommender System (RS) over implicit feedback, mainly in the form of clicks, and presents two variants of this strategy where model parameters are updated using either the momentum method or a gradient-based approach.

NIPS - Not Even Wrong? A Systematic Review of Empirically Complete Demonstrations of Algorithmic Effectiveness in the Machine Learning and Artificial Intelligence Literature

Using the 2017 sample of the NeurIPS supervised learning corpus as an indicator for the quality and trustworthiness of current ML/AI research, it appears that complete argumentative chains in demonstrations of algorithmic effectiveness are rare.

Deep Learning for Adverse Event Detection From Web Search

Evaluation results on three large real-world event datasets show that DeepSAVE outperforms existing detection methods as well as comparison deep learning auto encoders and demonstrates the viability of the proposed architecture for detecting adverse events from search query logs.



On Binary Reduction of Large-Scale Multiclass Classification Problems

The efficiency of the deduced algorithm is shown in comparison with state-of-the-art multiclass classification strategies on two large-scale document collections, especially in the interesting case where the number of classes becomes very large.

Extreme Multi Class Classification

This work proposes a reduction of the multi class classification problem to a set of binary regression problems organized in a tree structure and introduces a simple top-down criterion for purification of labels that allows for gradient descent style optimization.

Label Embedding Trees for Large Multi-Class Tasks

Two methods are proposed: an algorithm for learning a tree structure of classifiers which, by optimizing the overall tree loss, provides superior accuracy to existing tree labeling methods, and a method that learns to embed labels in a low-dimensional space, which is faster than non-embedding approaches and has superior accuracy to existing embedding approaches.

Large-scale Multi-label Learning with Missing Labels

This paper studies the multi-label problem in a generic empirical risk minimization (ERM) framework and develops techniques that exploit the structure of specific loss functions - such as the squared loss function - to obtain efficient algorithms.

Logarithmic Time Online Multiclass prediction

It is demonstrated that under favorable conditions, this work can construct logarithmic depth trees that have leaves with low label entropy, and a new objective function is formulated, which is optimized at each node of the tree and creates dynamic partitions of the data.

An iterative method for multi-class cost-sensitive learning

This paper empirically evaluates the performance of the proposed method using benchmark data sets and proves that the method generally achieves better results than representative methods for cost-sensitive learning, in terms of predictive performance (cost minimization) and, in many cases, computational efficiency.

Logarithmic Time One-Against-Some

It is shown that several simple techniques give rise to an algorithm that can compete with one-against-all in both space and predictive power while offering exponential improvements in speed when the number of classes is large.

A review on the combination of binary classifiers in multiclass problems

This paper presents a survey on the main strategies for the generalization of binary classifiers to problems with more than two classes, known as multiclass classification problems, and focuses on strategies that decompose the original multiclass problem into multiple binary subtasks, whose outputs are combined to obtain the final prediction.

PD-Sparse : A Primal and Dual Sparse Approach to Extreme Multiclass and Multilabel Classification

A Fully-Corrective Block-Coordinate Frank-Wolfe (FC-BCFW) algorithm is proposed that exploits both primal and dual sparsity to achieve a complexity sublinear in the number of primal and dual variables, and that achieves significantly higher accuracy than existing approaches to extreme classification.

Sparse Local Embeddings for Extreme Multi-label Classification

The SLEEC classifier is developed for learning a small ensemble of local distance-preserving embeddings which can accurately predict infrequently occurring (tail) labels and can make significantly more accurate predictions than state-of-the-art methods, including both embedding-based and tree-based methods.