# A dual coordinate descent method for large-scale linear SVM

@inproceedings{Hsieh2008ADC,
  title     = {A dual coordinate descent method for large-scale linear SVM},
  author    = {Cho-Jui Hsieh and Kai-Wei Chang and Chih-Jen Lin and S. Sathiya Keerthi and S. Sundararajan},
  booktitle = {ICML '08},
  year      = {2008}
}

In many applications, data appear with a huge number of instances as well as features. [...] The proposed method is simple and reaches an ε-accurate solution in O(log(1/ε)) iterations. Experiments indicate that our method is much faster than state-of-the-art solvers such as Pegasos, TRON, SVMperf, and a recent primal coordinate descent implementation.
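The per-coordinate update behind the method can be sketched as follows. This is a minimal NumPy sketch of dual coordinate descent for the L1-loss linear SVM dual (minimize ½αᵀQα − eᵀα subject to 0 ≤ αᵢ ≤ C, with Q_ij = yᵢyⱼxᵢᵀxⱼ), maintaining w = Σ αᵢyᵢxᵢ so each single-variable update costs O(nnz(xᵢ)); the function and variable names are my own, not the paper's reference implementation:

```python
import numpy as np

def dcd_linear_svm(X, y, C=1.0, epochs=50, seed=0):
    """Dual coordinate descent sketch for the L1-loss linear SVM.

    X: (n, d) dense feature matrix, y: labels in {-1, +1}.
    Sweeps the dual variables alpha_i in random order each epoch.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)                          # w = sum_i alpha_i * y_i * x_i
    Qii = np.einsum('ij,ij->i', X, X)        # diagonal of Q (y_i^2 = 1)
    for _ in range(epochs):
        for i in rng.permutation(n):
            if Qii[i] == 0:
                continue
            G = y[i] * w.dot(X[i]) - 1.0     # gradient of the dual along alpha_i
            # exact one-variable minimization, projected onto [0, C]
            a_new = min(max(alpha[i] - G / Qii[i], 0.0), C)
            if a_new != alpha[i]:
                w += (a_new - alpha[i]) * y[i] * X[i]   # keep w consistent
                alpha[i] = a_new
    return w

# usage on a tiny linearly separable toy problem
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = dcd_linear_svm(X, y, C=1.0)
preds = np.sign(X @ w)
```

The key design point the paper exploits is that updating one αᵢ needs only the current w and xᵢ, not the full Q matrix, which is what makes the method practical for large, sparse linear SVM problems.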

#### 908 Citations

A sequential dual method for large scale multi-class linear svms

- Computer Science
- KDD
- 2008

The main idea is to sequentially traverse the training set and optimize the dual variables associated with one example at a time. Experiments indicate that the method is much faster than state-of-the-art solvers such as bundle, cutting plane, and exponentiated gradient methods.

Coordinate Descent Method for Large-scale L2-loss Linear Support Vector Machines

- Computer Science
- J. Mach. Learn. Res.
- 2008

Proposes a novel coordinate descent algorithm for training linear SVMs with the L2-loss function that is more efficient and stable than state-of-the-art methods such as Pegasos and TRON.

New primal SVM solver with linear computational cost for big data classifications

- Computer Science
- ICML 2014
- 2014

This paper presents a new L2-norm regularized primal SVM solver using Augmented Lagrange Multipliers, with linear computational cost for Lp-norm loss functions.

Large-scale linear nonparallel support vector machine solver

- Computer Science, Medicine
- Neural Networks
- 2014

A Sparse Linear Nonparallel Support Vector Machine, termed L1-NPSVM, is proposed to deal with large-scale data, based on the efficient dual coordinate descent (DCD) solver; both theoretical analysis and experiments indicate that it performs as well as TWSVMs and SVMs.

Stochastic Sequential Minimal Optimization for Large-Scale Linear SVM

- Computer Science
- ICONIP
- 2017

Experiments indicate that the proposed algorithm is much faster than state-of-the-art solvers such as Liblinear, and achieves higher classification accuracy.

Random primal-dual proximal iterations for sparse multiclass SVM

- Computer Science
- 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP)
- 2016

This paper proposes two block-coordinate descent strategies for learning a sparse multiclass support vector machine: the first selects a subset of features to be updated at each iteration, while the second performs the selection among the training samples.

Learning Sparse SVM for Feature Selection on Very High Dimensional Datasets

- Computer Science
- ICML
- 2010

Comprehensive experimental results show that the proposed method obtains better or competitive performance compared with existing SVM-based feature selection methods in terms of sparsity and generalization, and can effectively handle large-scale, extremely high-dimensional problems.

Large-Scale Support Vector Machines: Algorithms and Theory

- Computer Science
- 2009

This document surveys SVM training methods that target the large-scale learning regime, discusses why SGD generalizes well even though it is poor at optimization, and describes algorithms such as Pegasos and FOLOS that extend basic SGD to quickly solve the SVM problem.

Large-Scale Elastic Net Regularized Linear Classification SVMs and Logistic Regression

- Computer Science, Mathematics
- 2013 IEEE 13th International Conference on Data Mining
- 2013

Experiments indicate that the proposed dual coordinate descent-projection (DCD-P) methods are fast and achieve comparable generalization performance after the first pass through the data, with extremely sparse models.

Solving Linear SVMs with Multiple 1D Projections

- Computer Science
- CIKM
- 2014

Presents a new methodology for solving linear Support Vector Machines (SVMs) that capitalizes on multiple 1D projections, provides a comparable or better approximation of the optimal solution, and exhibits smooth convergence properties.

#### References

Showing 1-10 of 31 references.

Coordinate Descent Method for Large-scale L2-loss Linear SVM

- 2008

Linear support vector machines (SVMs) are useful for classifying large-scale sparse data. Problems with sparse features are common in applications such as document classification and natural language…

A Modified Finite Newton Method for Fast Solution of Large Scale Linear SVMs

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2005

A fast method for solving linear SVMs with the L2 loss function, suited to large-scale data mining tasks such as text classification, is developed by modifying the finite Newton method of Mangasarian in several ways.

Making large scale SVM learning practical

- Computer Science
- 1998

This chapter presents algorithmic and computational results developed for SVMlight V2.0, which make large-scale SVM training more practical, and gives guidelines for applying SVMs to large domains.

Training linear SVMs in linear time

- Mathematics, Computer Science
- KDD '06
- 2006

A cutting-plane algorithm for training linear SVMs that provably has training time O(sn) for classification problems and O(sn log(n)) for ordinal regression problems, and is several orders of magnitude faster than decomposition methods like SVMlight on large datasets.

Bundle Methods for Machine Learning

- Computer Science, Mathematics
- NIPS
- 2007

This work presents a globally convergent method that applies to support vector estimation, regression, Gaussian processes, and any other regularized risk minimization setting leading to a convex optimization problem, along with tight convergence bounds showing that the algorithm reaches ε precision in O(1/ε) steps for general convex problems.

Solving large scale linear prediction problems using stochastic gradient descent algorithms

- Mathematics, Computer Science
- ICML
- 2004

Stochastic gradient descent algorithms on regularized forms of linear prediction methods, related to online algorithms such as the perceptron, are studied, and numerical rates of convergence for such algorithms are obtained.

Decomposition methods for linear support vector machines

- Mathematics, Computer Science
- 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
- 2003

It is shown that alpha seeding is extremely useful for solving a sequence of linear SVMs, largely reducing the number of decomposition iterations, to the point that solving many linear SVMs requires less time than the original decomposition method needs for a single SVM.

Successive overrelaxation for support vector machines

- Mathematics, Computer Science
- IEEE Trans. Neural Networks
- 1999

Successive overrelaxation (SOR) for symmetric linear complementarity problems and quadratic programs is used to train a support vector machine (SVM) for discriminating between the elements of two…

Fast training of support vector machines using sequential minimal optimization, advances in kernel methods

- Mathematics, Computer Science
- 1999

SMO breaks the large quadratic programming problem arising in SVM training into a series of smallest-possible QP subproblems, avoiding a time-consuming numerical QP optimization as an inner loop; SMO is hence fastest for linear SVMs and sparse data sets.

Solving multiclass support vector machines with LaRank

- Computer Science
- ICML '07
- 2007

The LaRank algorithm sidesteps the difficulty of optimizing the multiclass dual by relying on a randomized exploration inspired by the perceptron algorithm, and this approach is shown to be competitive with gradient-based optimizers on simple multiclass problems.