# From Dual to Primal Sub-optimality for Regularized Empirical Risk Minimization

@inproceedings{Lee2016FromDT, title={From Dual to Primal Sub-optimality for Regularized Empirical Risk Minimization}, author={Ching-pei Lee}, year={2016} }

Regularized empirical risk minimization problems are fundamental tasks in machine learning and data analysis. Many successful approaches for solving these problems are based on a dual formulation, which often admits more efficient algorithms. Often, though, the primal solution is needed. In the case of regularized empirical risk minimization, there is a convenient formula for reconstructing an approximate primal solution from the approximate dual solution. However, the question of quantifying…
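To make the reconstruction mentioned in the abstract concrete, here is the standard setup used in the stochastic dual coordinate ascent literature (e.g., Shalev-Shwartz and Zhang, cited below); the notation is assumed here for illustration, not taken from this paper. For the L2-regularized problem

```latex
\min_{w \in \mathbb{R}^d} \; P(w) \;=\; \frac{\lambda}{2}\|w\|^2 \;+\; \frac{1}{n}\sum_{i=1}^{n} \phi_i\!\left(x_i^\top w\right),
```

the dual variables $\alpha \in \mathbb{R}^n$ maximize

```latex
D(\alpha) \;=\; -\frac{1}{n}\sum_{i=1}^{n} \phi_i^*(-\alpha_i) \;-\; \frac{\lambda}{2}\Bigl\|\frac{1}{\lambda n}\sum_{i=1}^{n} \alpha_i x_i\Bigr\|^2,
```

and the "convenient formula" is the primal reconstruction $w(\alpha) = \frac{1}{\lambda n}\sum_{i=1}^{n} \alpha_i x_i$. The duality gap $P(w(\alpha)) - D(\alpha)$ then upper-bounds the primal sub-optimality of $w(\alpha)$.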

## One Citation

Large-scale logistic regression and linear support vector machines using spark

- Computer Science, 2014 IEEE International Conference on Big Data (Big Data)
- 2014

This work considers a distributed Newton method for solving logistic regression as well as linear SVM, implements it on Spark, and releases an efficient and easy-to-use tool for the Spark community.

## References


QP Algorithms with Guaranteed Accuracy and Run Time for Support Vector Machines

- Computer Science, Mathematics, J. Mach. Learn. Res.
- 2006

Operational conditions under which the Simon and composite algorithms possess an upper bound of O(n) on the number of iterations are described, as are general conditions under which a matching lower bound holds for any decomposition algorithm that uses working sets of size 2.

Approximate Duality

- Computer Science
- 2007

The Lagrangian duality theory is extended to incorporate approximate solutions of convex optimization problems; the extension applies to convex quadratic programming and is then used for support vector machines from learning theory.

Distributed Box-Constrained Quadratic Optimization for Dual Linear SVM

- Computer Science, ICML
- 2015

This paper proposes a box-constrained quadratic optimization algorithm for training linear support vector machines (SVMs) on large data in a distributed setting, requiring only O(1) communication cost to ensure fast convergence.

Iteration complexity of feasible descent methods for convex optimization

- Computer Science, J. Mach. Learn. Res.
- 2014

Global linear convergence is proved for a wide range of algorithms applied to certain non-strongly convex problems, and this work is the first to prove O(log(1/ε)) time complexity for cyclic coordinate descent methods on the dual problems of support vector classification and regression.

Dual coordinate descent methods for logistic regression and maximum entropy models

- Computer Science, Machine Learning
- 2010

This paper applies coordinate descent methods to solve the dual forms of logistic regression and maximum entropy, and shows that many details differ from the linear SVM case.

Stochastic dual coordinate ascent methods for regularized loss

- Computer Science, J. Mach. Learn. Res.
- 2013

A new analysis of Stochastic Dual Coordinate Ascent (SDCA) is presented, showing that this class of methods enjoys strong theoretical guarantees that are comparable to or better than those of SGD.

Exponentiated Gradient Algorithms for Conditional Random Fields and Max-Margin Markov Networks

- Computer Science, J. Mach. Learn. Res.
- 2008

This work examines exponentiated gradient (EG) algorithms for training log-linear and maximum-margin models, and describes how the EG updates factor in a convenient way for structured prediction problems, allowing the algorithms to be efficiently applied to tasks such as sequence learning and natural language parsing.

A dual coordinate descent method for large-scale linear SVM

- Computer Science, ICML '08
- 2008

A novel dual coordinate descent method for linear SVM with L1- and L2-loss functions that reaches an ε-accurate solution in O(log(1/ε)) iterations is presented.
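As an illustration of the technique this reference describes, the following is a minimal sketch (function name and parameters are hypothetical, not from the paper) of dual coordinate descent for the L1-loss linear SVM, whose dual is min over 0 ≤ α_i ≤ C of ½αᵀQα − eᵀα with Q_ij = y_i y_j x_iᵀx_j, maintaining the primal vector w = Σ_i α_i y_i x_i:

```python
import numpy as np

def dual_cd_svm(X, y, C=1.0, epochs=50, seed=0):
    """Dual coordinate descent sketch for the L1-loss (hinge) linear SVM.

    X: (n, d) feature matrix, y: (n,) labels in {-1, +1}.
    Returns the primal weight vector w and dual variables alpha.
    """
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)                       # maintained as w = sum_i alpha_i * y_i * x_i
    Qii = np.einsum('ij,ij->i', X, X)     # diagonal of Q: ||x_i||^2
    rng = np.random.default_rng(seed)
    for _ in range(epochs):
        for i in rng.permutation(n):      # sweep coordinates in random order
            if Qii[i] == 0.0:
                continue
            G = y[i] * w.dot(X[i]) - 1.0  # partial gradient of the dual at alpha_i
            new_alpha = min(max(alpha[i] - G / Qii[i], 0.0), C)  # clip to [0, C]
            w += (new_alpha - alpha[i]) * y[i] * X[i]            # keep w in sync
            alpha[i] = new_alpha
    return w, alpha
```

Each coordinate update is an exact one-dimensional minimization projected onto the box [0, C], and maintaining w incrementally keeps the per-update cost at O(d), which is the key to the method's efficiency on large sparse data.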

PASSCoDe: Parallel ASynchronous Stochastic dual Co-ordinate Descent

- Computer Science, ICML
- 2015

This paper proposes a family of parallel asynchronous stochastic dual coordinate descent algorithms (PASSCoDe), showing that in a multi-core environment the converged solution is the exact solution of a primal problem with a perturbed regularizer.

Introductory Lectures on Convex Optimization - A Basic Course

- Computer Science, Applied Optimization
- 2004

It was in the middle of the 1980s that the seminal paper by Karmarkar opened a new epoch in nonlinear optimization; it became more and more common for new methods to be provided with a complexity analysis, which was considered a better justification of their efficiency than computational experiments.