Composite Difference-Max Programs for Modern Statistical Estimation Problems

  • Ying Cui, J. S. Pang, Bodhisattva Sen
  • Mathematics
    SIAM J. Optim.
Many modern statistical estimation problems are defined by three major components: a statistical model that postulates the dependence of an output variable on the input features; a loss function measuring the error between the observed output and the model-predicted output; and a regularizer that controls overfitting and/or variable selection in the model. We study the sampling version of this generic statistical estimation problem where the model parameters are estimated by empirical risk…
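As a schematic, the three components above combine into a single regularized empirical-risk objective. The sketch below is illustrative only; the function names, toy data, and parameters are assumptions, not taken from the paper:

```python
def empirical_risk(params, data, model, loss, regularizer, lam):
    """Generic regularized ERM objective: average loss over the sample
    plus a penalty controlling overfitting and/or variable selection."""
    fit = sum(loss(y, model(params, x)) for x, y in data) / len(data)
    return fit + lam * regularizer(params)

# A toy instance: linear model, squared loss, ell_1 regularizer.
model = lambda w, x: w * x
loss = lambda y, yhat: (y - yhat) ** 2
data = [(1.0, 2.0), (2.0, 4.0)]
```

With w = 2 the toy model interpolates the data, so only the penalty term contributes to the objective.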
A Global Two-stage Algorithm for Non-convex Penalized High-dimensional Linear Regression Problems
  • Peili Li, Min Liu, Zhou Yu
  • Mathematics
  • 2021
Owing to their asymptotic oracle property, non-convex penalties represented by the minimax concave penalty (MCP) and the smoothly clipped absolute deviation (SCAD) have attracted much attention in high-dimensional
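For reference, both penalties have closed-form expressions. This is a minimal pure-Python sketch; the default values for the tuning parameters lam (λ) and gamma (γ) are illustrative conventions, not prescribed here:

```python
def mcp(t, lam=1.0, gamma=3.0):
    """Minimax concave penalty: tapers the ell_1 slope quadratically,
    becoming constant once |t| exceeds gamma * lam."""
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - t * t / (2.0 * gamma)
    return gamma * lam * lam / 2.0

def scad(t, lam=1.0, gamma=3.7):
    """Smoothly clipped absolute deviation penalty: ell_1 near the
    origin, a quadratic transition, then constant beyond gamma * lam."""
    a = abs(t)
    if a <= lam:
        return lam * a
    if a <= gamma * lam:
        return (2.0 * gamma * lam * a - a * a - lam * lam) / (2.0 * (gamma - 1.0))
    return lam * lam * (gamma + 1.0) / 2.0
```

Both are continuous and constant in the tails, which is what removes the bias that the plain ℓ1 penalty puts on large coefficients.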
On the pervasiveness of difference-convexity in optimization and statistics
This paper establishes the DC property of many functions in various areas of application not previously known to be of this class, and significantly expands the classes of DC functions that are employed as approximations of the $\ell_0$ function in statistical learning.
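As a small illustration of such a representation, the capped-$\ell_1$ surrogate min(|t|, a) of the $\ell_0$ indicator is a difference of two convex functions; the cap a below is an illustrative parameter:

```python
def capped_l1(t, a=1.0):
    """Capped-ell_1 surrogate of the ell_0 indicator: min(|t|, a)."""
    return min(abs(t), a)

def capped_l1_dc(t, a=1.0):
    """The same value written as a difference of two convex functions:
    g(t) = |t| minus h(t) = max(|t| - a, 0)."""
    return abs(t) - max(abs(t) - a, 0.0)
```

The two forms agree everywhere, but the DC form exposes the convex-minus-convex structure that DC algorithms exploit.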
Statistical Analysis of Stationary Solutions of Coupled Nonconvex Nonsmooth Empirical Risk Minimization
The results offer a first step to close the gap between computational optimization and asymptotic analysis of coupled nonconvex nonsmooth statistical estimation problems, expanding the former with statistical properties of the practically obtained solution and providing the latter with a more practical focus pertaining to computational tractability.
A Proximal Distance Algorithm for Likelihood-Based Sparse Covariance Estimation
This paper addresses the task of estimating a covariance matrix under a patternless sparsity assumption. In contrast to existing approaches based on thresholding or shrinkage penalties, we propose a
Nonconvex robust programming via value-function optimization
A systematic approach for tackling the nonconvexity of the robust optimization problems that is usually coupled with the nonsmoothness of the objective function brought by the worst-case value function is introduced.
Approximations of semicontinuous functions with applications to stochastic optimization and statistical estimation
  • J. Royset
  • Mathematics, Computer Science
    Math. Program.
  • 2020
It is established that every usc function is the limit of a hypo-converging sequence of piecewise affine functions of the difference-of-max type, and the resulting algorithmic possibilities in the context of approximate solution of infinite-dimensional optimization problems are illustrated.
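A one-dimensional sketch of the difference-of-max form: a (possibly nonconvex) piecewise affine function is written as one max of affine pieces minus another. The concave hat function below is a hypothetical example chosen for illustration:

```python
def max_affine(x, pieces):
    """max_i (a_i * x + b_i): a convex piecewise affine function."""
    return max(a * x + b for a, b in pieces)

def diff_of_max(x, up, down):
    """Difference-of-max form: one max-affine minus another, which can
    represent any piecewise affine function, convex or not."""
    return max_affine(x, up) - max_affine(x, down)

# The concave hat min(x, 2 - x), written as 0 - max(-x, x - 2):
def hat(x):
    return diff_of_max(x, [(0.0, 0.0)], [(-1.0, 0.0), (1.0, -2.0)])
```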
Faster Convex Lipschitz Regression via 2-block ADMM
A broad class of convex function learning problems can be solved via a 2-block ADMM approach, where updates for each block can be computed in closed form, and is amenable to the use of GPUs.
On the Linear Convergence of Difference-of-convex Algorithms for Nonsmooth DC Programming
In this paper we consider the linear convergence of algorithms for minimizing difference-of-convex functions with convex constraints. We allow nonsmoothness in both the convex and concave
Pursuit of the Cluster Structure of Network Lasso: Recovery Condition and Non-convex Extension
A non-convex extension of Network Trimmed Lasso is developed which provides more clear-cut cluster structure when NL fails to form clusters without incorporating prior knowledge on the associated parameters.
Understanding Notions of Stationarity in Non-Smooth Optimization
This article gives an introduction to different stationarity concepts for several important classes of non-convex non-smooth functions, discusses their geometric interpretations, and clarifies the relationships among these concepts.
Bayesian nonparametric multivariate convex regression
In many applications, such as economics, operations research and reinforcement learning, one often needs to estimate a multivariate regression function f subject to a convexity constraint. For
Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
Variable selection is fundamental to high-dimensional statistical modeling, including nonparametric regression. Many approaches in use are stepwise selection procedures, which can be computationally
A Computational Framework for Multivariate Convex Regression and Its Variants
We study the nonparametric least squares estimator (LSE) of a multivariate convex regression function. The LSE, given as the solution to a quadratic program with O(n²) linear constraints (n
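The constraints of that quadratic program couple the fitted values f_i with subgradients g_i. A one-dimensional simplification of the constraint check (the multivariate case replaces the product by an inner product; the helper name and toy data are illustrative):

```python
def convexity_constraints_ok(x, f, g, tol=1e-9):
    """Check the O(n^2) subgradient constraints of the convex-regression
    QP: f[j] >= f[i] + g[i] * (x[j] - x[i]) for every ordered pair (i, j)."""
    n = len(x)
    return all(f[j] >= f[i] + g[i] * (x[j] - x[i]) - tol
               for i in range(n) for j in range(n))

# Values and derivatives of the convex function t -> t**2 are feasible.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
fs = [t * t for t in xs]
gs = [2.0 * t for t in xs]
```

Negating the values and derivatives gives a concave function, for which the same check fails.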
Difference-of-Convex Learning: Directional Stationarity, Optimality, and Sparsity
This paper establishes some fundamental optimality and sparsity properties of directional stationary solutions to a nonconvex Lagrangian formulation of the bicriteria optimization problem, based on a specially structured dc representation of many well-known sparsity functions that can be profitably exploited in the analysis.
Incremental Majorization-Minimization Optimization with Application to Large-Scale Machine Learning
  • J. Mairal
  • Computer Science, Mathematics
    SIAM J. Optim.
  • 2015
This work proposes an incremental majorization-minimization scheme for minimizing a large sum of continuous functions, a problem of utmost importance in machine learning, and presents convergence guarantees for nonconvex and convex optimization when the upper bounds approximate the objective up to a smooth error.
Parameter Estimation and Inference in a Continuous Piecewise Linear Regression Model
The estimation of regression parameters in one-dimensional broken-stick models is a research area of statistics with an extensive literature. We are interested in extending such models by aiming to
A Tutorial on MM Algorithms
Most problems in frequentist statistics involve optimization of a function such as a likelihood or a sum of squares. EM algorithms are among the most effective algorithms for maximum likelihood
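As a concrete instance of the majorize-minimize idea, the absolute value admits a quadratic majorizer, which turns the sample-median problem into iteratively reweighted least squares. This sketch is illustrative, not taken from the tutorial:

```python
def mm_median(y, theta=0.0, iters=100, eps=1e-12):
    """MM iteration for min over theta of sum_i |y_i - theta|: majorize
    each |r| by r**2 / (2 |r_k|) + |r_k| / 2 at the current residual r_k,
    then minimize the weighted least-squares surrogate in closed form."""
    for _ in range(iters):
        w = [1.0 / max(abs(v - theta), eps) for v in y]
        theta = sum(wi * v for wi, v in zip(w, y)) / sum(w)
    return theta
```

Each surrogate touches the objective at the current iterate and lies above it elsewhere, so the objective value never increases along the iterates.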
Statistical Learning with Sparsity: The Lasso and Generalizations
Statistical Learning with Sparsity: The Lasso and Generalizations presents methods that exploit sparsity to help recover the underlying signal in a set of data and extract useful and reproducible patterns from big datasets.
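The scalar building block behind most lasso solvers is the soft-thresholding operator, the proximal map of λ|·|; a minimal sketch:

```python
def soft_threshold(z, lam):
    """Proximal map of lam * |.|: the unique minimizer over b of
    0.5 * (b - z)**2 + lam * |b|. Shrinks z toward zero by lam and
    sets it exactly to zero inside [-lam, lam]."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0
```

Applying this scalar map cyclically to partial residuals is exactly coordinate descent for the lasso.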
An algorithm for the estimation of a regression function by continuous piecewise linear functions
The objective function in the optimization problem is semismooth, quasidifferentiable and piecewise partially separable, which allows efficient approximation of subgradients of the objective function and the application of the discrete gradient method for its minimization.
Structural properties of affine sparsity constraints
A number of fundamental structural properties of the solution set of such a non-standard system of inequalities, including its closedness and the description of its closure, continuous approximations and their set convergence, and characterizations of its tangent cones are investigated for use in optimization.