Corpus ID: 212414790

Linear time dynamic programming for the exact path of optimal models selected from a finite set

T. Hocking and Joseph Vargovich
Many learning algorithms are formulated in terms of finding model parameters that minimize a data-fitting loss function plus a regularizer. When the regularizer involves the ℓ0 pseudo-norm, the resulting regularization path consists of a finite set of models. The fastest existing algorithm for computing the breakpoints in the regularization path is quadratic in the number of models, so it scales poorly to high-dimensional problems. We provide new formal proofs that a dynamic programming…
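To make the idea of an exact path over a finite set of models concrete, here is a minimal sketch in Python. It assumes each model i has a loss `losses[i]` and a complexity `sizes[i]` (strictly increasing), and that the model selected at penalty lam minimizes `losses[i] + lam * sizes[i]`. The function name `exact_path` and the stack-based scan are illustrative assumptions (a standard lower-envelope computation), not necessarily the algorithm analyzed in the paper.

```python
import math

def exact_path(losses, sizes):
    """Return (model_index, penalty_low, penalty_high) for each model that is
    optimal on some interval of non-negative penalties.

    Assumes sizes are strictly increasing (hypothetical input convention).
    """
    # Each stack entry is (model index, largest penalty at which it is optimal).
    stack = [(0, math.inf)]
    for i in range(1, len(losses)):
        # A larger model that does not reduce the loss is never selected.
        if losses[i] >= losses[stack[-1][0]]:
            continue
        while True:
            top, top_break = stack[-1]
            # Penalty at which model i and model top have equal cost.
            cross = (losses[top] - losses[i]) / (sizes[i] - sizes[top])
            if cross < top_break:
                break
            # Model top has an empty optimality interval, so discard it.
            stack.pop()
        stack.append((i, cross))
    path = []
    for pos, (idx, upper) in enumerate(stack):
        lower = stack[pos + 1][1] if pos + 1 < len(stack) else 0.0
        path.append((idx, lower, upper))
    return path

print(exact_path([10, 4, 3, 0], [0, 1, 2, 3]))
```

In this toy input, the model of size 2 is never selected for any penalty, so it does not appear in the path. Each model is pushed and popped at most once, so the scan takes time linear in the number of models.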
1 Citation
Optimizing ROC Curves with a Sort-Based Surrogate Loss Function for Binary Classification and Changepoint Detection
This work proposes a convex relaxation of this objective that results in a new surrogate loss function called the AUM, short for Area Under Min(FP, FN), which requires a sort and a sum over the sequence of points on the ROC curve and is efficiently computed and used in a gradient descent learning algorithm.


Computationally Efficient Changepoint Detection for a Range of Penalties
In the multiple changepoint setting, various search methods have been proposed, which involve optimizing either a constrained or penalized cost function over possible numbers and locations…
On optimal multiple changepoint algorithms for large data
Empirical results show that FPOP is substantially faster than existing dynamic programming methods, and unlike the existing methods its computational efficiency is robust to the number of changepoints in the data.
SparseNet: Coordinate Descent With Nonconvex Penalties
The properties of penalties suitable for this approach are characterized, their corresponding threshold functions are studied, and a df-standardizing reparametrization is described that assists the pathwise algorithm.
An algorithm for optimal partitioning of data on an interval
This letter describes a simple but powerful algorithm that searches the exponentially large space of partitions of N data points in time O(N^2), which is guaranteed to find the exact global optimum.
Best Subset Selection via a Modern Optimization Lens
In the last twenty-five years (1990-2014), algorithmic advances in integer optimization combined with hardware improvements have resulted in an astonishing 200 billion factor speedup in solving Mixed…
Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
Variable selection is fundamental to high-dimensional statistical modeling, including nonparametric regression. Many approaches in use are stepwise selection procedures, which can be computationally…
SparseStep: Approximating the Counting Norm for Sparse Regularization
The SparseStep algorithm is presented for the estimation of a sparse parameter vector in the linear regression problem. The algorithm works by adding an approximation of the exact counting norm as a…
Fast Bayesian Matching Pursuit: Model Uncertainty and Parameter Estimation for Sparse Linear Models
A low-complexity recursive procedure is presented for model selection and minimum mean squared error (MMSE) estimation in linear regression. Emphasis is given to the case of a sparse parameter vector…
Nearly unbiased variable selection under minimax concave penalty
We propose MC+, a fast, continuous, nearly unbiased and accurate method of penalized variable selection in high-dimensional linear regression. The LASSO is fast and continuous, but biased. The bias…
Data-driven Calibration of Penalties for Least-Squares Regression
A completely data-driven calibration algorithm for these parameters in the least-squares regression framework, without assuming a particular shape for the penalty, based on the concept of minimal penalty, recently introduced by Birgé and Massart (2007).