# A dual coordinate descent method for large-scale linear SVM

```bibtex
@inproceedings{Hsieh2008ADC,
  title     = {A dual coordinate descent method for large-scale linear SVM},
  author    = {Cho-Jui Hsieh and Kai-Wei Chang and Chih-Jen Lin and S. Sathiya Keerthi and S. Sundararajan},
  booktitle = {International Conference on Machine Learning},
  year      = {2008}
}
```

In many applications, data appear with a huge number of instances as well as features. […] The proposed method is simple and reaches an ε-accurate solution in O(log(1/ε)) iterations. Experiments indicate that it is much faster than state-of-the-art solvers such as Pegasos, TRON, SVMperf, and a recent primal coordinate descent implementation.
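The dual coordinate descent update described above can be sketched as follows. This is a minimal illustration under our own assumptions (the function name, dense NumPy representation, and toy hyperparameters are ours, not the paper's LIBLINEAR implementation): maintaining w = Σᵢ αᵢ yᵢ xᵢ, each step minimizes the dual objective over a single αᵢ in closed form and clips the result to [0, C].

```python
import numpy as np

def dual_cd_svm(X, y, C=1.0, epochs=10, seed=0):
    """Sketch of dual coordinate descent for the L1-loss linear SVM.

    Solves  min_alpha  0.5 * alpha^T Q alpha - sum(alpha)
            s.t.       0 <= alpha_i <= C,   Q_ij = y_i y_j x_i . x_j,
    while maintaining w = sum_i alpha_i * y_i * x_i so each
    coordinate step costs O(nnz(x_i)) rather than O(n).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)
    Qii = np.einsum("ij,ij->i", X, X)  # diagonal of Q (y_i^2 = 1)
    for _ in range(epochs):
        for i in rng.permutation(n):   # randomized sweep order
            if Qii[i] == 0.0:
                continue
            # Partial gradient of the dual objective w.r.t. alpha_i:
            # (Q alpha)_i - 1 = y_i * (w . x_i) - 1
            G = y[i] * (w @ X[i]) - 1.0
            # Closed-form 1-D minimizer, clipped to the box [0, C]
            new_ai = min(max(alpha[i] - G / Qii[i], 0.0), C)
            if new_ai != alpha[i]:
                w += (new_ai - alpha[i]) * y[i] * X[i]
                alpha[i] = new_ai
    return w, alpha
```

On a small linearly separable toy set, a few dozen sweeps suffice for `np.sign(X @ w)` to recover the labels, consistent with the linear convergence rate stated in the abstract.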

## 946 Citations

### A sequential dual method for large scale multi-class linear svms

- Computer Science, KDD
- 2008

The main idea is to sequentially traverse the training set and optimize the dual variables associated with one example at a time. Experiments indicate that this method is much faster than state-of-the-art solvers such as bundle, cutting-plane, and exponentiated gradient methods.

### Coordinate Descent Method for Large-scale L2-loss Linear Support Vector Machines

- Computer Science, J. Mach. Learn. Res.
- 2008

A novel coordinate descent algorithm for training linear SVM with the L2-loss function that is more efficient and stable than state-of-the-art methods such as Pegasos and TRON.

### New primal SVM solver with linear computational cost for big data classifications

- Computer Science, ICML
- 2014

This paper presents a new L2-norm regularized primal SVM solver using Augmented Lagrange Multipliers, with linear computational cost for Lp-norm loss functions.

### Stochastic Sequential Minimal Optimization for Large-Scale Linear SVM

- Computer Science, ICONIP
- 2017

Experiments indicate that the proposed algorithm is much faster than state-of-the-art solvers such as LIBLINEAR, and achieves higher classification accuracy.

### Random primal-dual proximal iterations for sparse multiclass SVM

- Computer Science, 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP)
- 2016

This paper proposes two block-coordinate descent strategies for learning a sparse multiclass support vector machine: the first selects a subset of features to be updated at each iteration, while the second performs the selection among the training samples.

### Learning Sparse SVM for Feature Selection on Very High Dimensional Datasets

- Computer Science, ICML
- 2010

Comprehensive experimental results show that the proposed method can obtain better or competitive performance compared with existing SVM-based feature selection methods in terms of sparsity and generalization performance, and can effectively handle large-scale and extremely high-dimensional problems.

### Large-Scale Support Vector Machines: Algorithms and Theory

- Computer Science
- 2009

This document surveys work on SVM training methods that target this large-scale learning regime, and discusses why SGD generalizes well even though it is poor at optimization, and describes algorithms such as Pegasos and FOLOS that extend basic SGD to quickly solve the SVM problem.

### Large-Scale Elastic Net Regularized Linear Classification SVMs and Logistic Regression

- Computer Science, 2013 IEEE 13th International Conference on Data Mining
- 2013

Experiments indicate that the proposed dual coordinate descent-projection (DCD-P) methods are fast and achieve comparable generalization performance after the first pass through the data, with extremely sparse models.

### Solving Linear SVMs with Multiple 1D Projections

- Computer Science, CIKM
- 2014

A new methodology for solving linear Support Vector Machines (SVMs) is presented that capitalizes on multiple 1D projections, provides a comparable or better approximation factor of the optimal solution, and exhibits smooth convergence properties.

## References


### Coordinate Descent Method for Large-scale L2-loss Linear SVM

- Computer Science
- 2008

A novel coordinate descent algorithm for training linear SVM with the L2-loss function that minimizes a one-variable sub-problem while fixing the other variables, and globally converges at a linear rate.

### A Modified Finite Newton Method for Fast Solution of Large Scale Linear SVMs

- Computer Science, J. Mach. Learn. Res.
- 2005

A fast method for solving linear SVMs with L2 loss function that is suited for large scale data mining tasks such as text classification is developed by modifying the finite Newton method of Mangasarian in several ways.

### Making large scale SVM learning practical

- Computer Science
- 1998

This chapter presents algorithmic and computational results developed for SVMlight V2.0, which make large-scale SVM training more practical and give guidelines for the application of SVMs to large domains.

### Training linear SVMs in linear time

- Computer Science, KDD '06
- 2006

A Cutting Plane Algorithm for training linear SVMs that provably has training time O(sn) for classification problems and O(sn log n) for ordinal regression problems, and is several orders of magnitude faster than decomposition methods like SVMlight on large datasets.

### Bundle Methods for Machine Learning

- Computer Science, NIPS
- 2007

This work presents a globally convergent method that applies to Support Vector estimation, regression, Gaussian Processes, and any other regularized risk minimization setting that leads to a convex optimization problem, together with tight convergence bounds showing that the algorithm converges in O(1/ε) steps to ε precision for general convex problems.

### Solving large scale linear prediction problems using stochastic gradient descent algorithms

- Computer Science, ICML
- 2004

Stochastic gradient descent algorithms on regularized forms of linear prediction methods, related to online algorithms such as the perceptron, are studied, and numerical rates of convergence for such algorithms are obtained.

### Successive overrelaxation for support vector machines

- Computer Science, IEEE Trans. Neural Networks
- 1999

Successive overrelaxation (SOR) for symmetric linear complementarity problems and quadratic programs is used to train a support vector machine (SVM) for discriminating between the elements of two…

### Fast training of support vector machines using sequential minimal optimization, advances in kernel methods

- Computer Science
- 1999

SMO breaks the large quadratic programming problem arising in SVM training into a series of smallest-possible QP problems, which avoids a time-consuming numerical QP optimization as an inner loop; hence SMO is fastest for linear SVMs and sparse data sets.

### Decomposition Methods for Linear Support Vector Machines

- Computer Science, Neural Computation
- 2004

It is shown that decomposition methods with alpha seeding are extremely useful for solving a sequence of linear support vector machines (SVMs) with more data than attributes, and why alpha seeding is much more effective for linear than nonlinear SVMs.

### Solving multiclass support vector machines with LaRank

- Computer Science, ICML '07
- 2007

The LaRank algorithm sidesteps the difficulty of large multiclass dual optimization by relying on a randomized exploration inspired by the perceptron algorithm, and experiments show that this approach is competitive with gradient-based optimizers on simple multiclass problems.