A dual coordinate descent method for large-scale linear SVM

Cho-Jui Hsieh, Kai-Wei Chang, Chih-Jen Lin, S. Sathiya Keerthi, S. Sundararajan
International Conference on Machine Learning
In many applications, data appear with a huge number of instances as well as features. [...] The proposed method is simple and reaches an ε-accurate solution in O(log(1/ε)) iterations. Experiments indicate that the method is much faster than state-of-the-art solvers such as Pegasos, TRON, SVMperf, and a recent primal coordinate descent implementation.
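
The summary above does not spell out the update rule, so the following is only a minimal sketch of how a dual coordinate descent pass for a linear SVM is commonly written. It rests on assumptions that are not in the text: hinge (L1) loss, no bias term, labels in {-1, +1}, dense inputs as Python lists, and a fixed number of passes in place of a stopping criterion. The function name `dcd_linear_svm` and its parameters are hypothetical.

```python
import random


def dcd_linear_svm(X, y, C=1.0, epochs=50, seed=0):
    """Sketch of dual coordinate descent for a hinge-loss linear SVM (no bias).

    Assumed dual problem: minimize 0.5 * a^T Q a - sum(a) with 0 <= a_i <= C,
    where Q_ij = y_i * y_j * (x_i . x_j). The primal vector
    w = sum_i a_i * y_i * x_i is maintained incrementally.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    alpha = [0.0] * n
    w = [0.0] * d
    # Diagonal of Q: since y_i is +1 or -1, Q_ii = x_i . x_i.
    q_diag = [sum(v * v for v in x) for x in X]
    order = list(range(n))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in random order on each pass
        for i in order:
            if q_diag[i] == 0.0:
                continue  # an all-zero example cannot change w
            # Partial derivative of the dual objective with respect to alpha_i.
            grad = y[i] * sum(wj * xj for wj, xj in zip(w, X[i])) - 1.0
            # Exact one-variable minimizer, clipped to the box [0, C].
            new_alpha = min(max(alpha[i] - grad / q_diag[i], 0.0), C)
            delta = new_alpha - alpha[i]
            if delta != 0.0:
                alpha[i] = new_alpha
                for j in range(d):
                    w[j] += delta * y[i] * X[i][j]
    return w, alpha


if __name__ == "__main__":
    # Toy, linearly separable data (made up for illustration only).
    X = [[2.0, 0.0], [1.0, 1.0], [-2.0, 0.0], [-1.0, -1.0]]
    y = [1, 1, -1, -1]
    w, alpha = dcd_linear_svm(X, y)
    print("w =", w)
    print("alpha =", alpha)
```

Each inner step touches a single example, so its cost is proportional to the number of features of that example. Keeping `w` up to date avoids forming the full matrix Q.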

A sequential dual method for large scale multi-class linear svms

The main idea is to sequentially traverse the training set and optimize the dual variables associated with one example at a time. Experiments indicate that the method is much faster than state-of-the-art solvers such as bundle, cutting plane, and exponentiated gradient methods.

Coordinate Descent Method for Large-scale L2-loss Linear Support Vector Machines

A novel coordinate descent algorithm for training linear SVM with the L2-loss function that is more efficient and stable than state-of-the-art methods such as Pegasos and TRON.

New primal SVM solver with linear computational cost for big data classifications

This paper presents a new L2-norm regularized primal SVM solver using augmented Lagrange multipliers, with linear computational cost for Lp-norm loss functions.

Large-scale linear nonparallel support vector machine solver

Stochastic Sequential Minimal Optimization for Large-Scale Linear SVM

Experiments indicate that the proposed algorithm is much faster than some state-of-the-art solvers, such as LIBLINEAR, and achieves higher classification accuracy.

Random primal-dual proximal iterations for sparse multiclass SVM

This paper proposes two block-coordinate descent strategies for learning a sparse multiclass support vector machine: the first selects a subset of features to be updated at each iteration, while the second performs the selection among the training samples.

Learning Sparse SVM for Feature Selection on Very High Dimensional Datasets

Comprehensive experimental results show that the proposed method can obtain better or competitive performance compared with existing SVM-based feature selection methods in terms of sparsity and generalization performance, and can effectively handle large-scale and extremely high dimensional problems.

Large-Scale Support Vector Machines: Algorithms and Theory

This document surveys work on SVM training methods that target the large-scale learning regime, discusses why SGD generalizes well even though it is poor at optimization, and describes algorithms such as Pegasos and FOLOS that extend basic SGD to quickly solve the SVM problem.

Large-Scale Elastic Net Regularized Linear Classification SVMs and Logistic Regression

B. Palaniappan. Computer Science. 2013 IEEE 13th International Conference on Data Mining, 2013.
Experiments indicate that the proposed dual coordinate descent-projection (DCD-P) methods are fast and achieve comparable generalization performance after the first pass through the data, with extremely sparse models.

Solving Linear SVMs with Multiple 1D Projections

A new methodology for solving linear Support Vector Machines (SVMs) is presented; it capitalizes on multiple 1D projections, provides a comparable or better approximation factor of the optimal solution, and exhibits smooth convergence properties.



Coordinate Descent Method for Large-scale L2-loss Linear SVM

A novel coordinate descent algorithm for training linear SVM with the L2-loss function that minimizes a one-variable sub-problem while fixing other variables and globally converges at the linear rate.

A Modified Finite Newton Method for Fast Solution of Large Scale Linear SVMs

A fast method for solving linear SVMs with L2 loss function that is suited for large scale data mining tasks such as text classification is developed by modifying the finite Newton method of Mangasarian in several ways.

Making large scale SVM learning practical

This chapter presents algorithmic and computational results developed for SVMlight V2.0, which make large-scale SVM training more practical and give guidelines for the application of SVMs to large domains.

Training linear SVMs in linear time

A cutting plane algorithm for training linear SVMs that provably has training time O(sn) for classification problems and O(sn log(n)) for ordinal regression problems, and is several orders of magnitude faster than decomposition methods like SVMlight for large datasets.

Bundle Methods for Machine Learning

This work presents a globally convergent method that applies to Support Vector estimation, regression, Gaussian Processes, and any other regularized risk minimization setting which leads to a convex optimization problem and presents tight convergence bounds, which show that the algorithm converges in O(1/ε) steps to ε precision for general convex problems.

Solving large scale linear prediction problems using stochastic gradient descent algorithms

Stochastic gradient descent algorithms on regularized forms of linear prediction methods, related to online algorithms such as perceptron, are studied, and numerical rate of convergence for such algorithms is obtained.

Successive overrelaxation for support vector machines

Successive overrelaxation (SOR) for symmetric linear complementarity problems and quadratic programs is used to train a support vector machine (SVM) for discriminating between the elements of two

Fast training of support vector machines using sequential minimal optimization (in Advances in Kernel Methods)

SMO breaks the large quadratic programming (QP) problem of SVM training into a series of the smallest possible QP problems, which avoids using a time-consuming numerical QP optimization as an inner loop; SMO is fastest for linear SVMs and sparse data sets.

Decomposition Methods for Linear Support Vector Machines

It is shown that decomposition methods with alpha seeding are extremely useful for solving a sequence of linear support vector machines (SVMs) with more data than attributes, and it is explained why alpha seeding is much more effective for linear than for nonlinear SVMs.

Solving multiclass support vector machines with LaRank

The LaRank algorithm sidesteps this difficulty by relying on a randomized exploration inspired by the perceptron algorithm; the approach is shown to be competitive with gradient-based optimizers on simple multiclass problems.