Corpus ID: 12204193

Oracle inequalities for computationally adaptive model selection

@article{Agarwal2012OracleIF,
  title={Oracle inequalities for computationally adaptive model selection},
  author={Alekh Agarwal and Peter L. Bartlett and John C. Duchi},
  journal={ArXiv},
  year={2012},
  volume={abs/1208.0129}
}
We analyze general model selection procedures using penalized empirical loss minimization under computational constraints. While classical model selection approaches do not consider computational aspects of performing model selection, we argue that any practical model selection procedure must not only trade off estimation and approximation error, but also the computational effort required to compute empirical minimizers for different function classes. We provide a framework for analyzing such… 
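The penalized empirical loss minimization setup the abstract describes can be sketched in standard notation; the symbols below are illustrative conventions, not taken from the paper itself:

```latex
% Nested function classes and penalized empirical risk minimization.
% Given classes $\mathcal{F}_1 \subseteq \mathcal{F}_2 \subseteq \cdots$
% and i.i.d. samples $(X_i, Y_i)_{i=1}^{n}$, define the empirical risk
\[
\widehat{R}_n(f) = \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(f(X_i), Y_i\bigr),
\]
% select a model index by penalized minimization,
\[
\widehat{k} \in \operatorname*{arg\,min}_{k}
  \Bigl\{ \min_{f \in \mathcal{F}_k} \widehat{R}_n(f) + \mathrm{pen}_n(k) \Bigr\},
\qquad
\widehat{f} \in \operatorname*{arg\,min}_{f \in \mathcal{F}_{\widehat{k}}} \widehat{R}_n(f),
\]
% and an oracle inequality bounds the population risk $R(\widehat{f})$
% against the best trade-off available to an oracle:
\[
R(\widehat{f}) \;\le\; \min_{k} \Bigl\{ \inf_{f \in \mathcal{F}_k} R(f)
  + C\,\mathrm{pen}_n(k) \Bigr\} + \varepsilon_n .
\]
```

This display is the classical, computation-free template; the paper's distinctive point is to control such inequalities when the inner minimizations over each class are carried out only approximately, under a computational budget.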


Designing Statistical Estimators That Balance Sample Size, Risk, and Computational Cost
TLDR
This work uses regularized linear regression as a case study to argue for the existence of a tradeoff between computational time, sample complexity, and statistical accuracy that applies to statistical estimators based on convex optimization.
Non-stochastic Best Arm Identification and Hyperparameter Optimization
TLDR
This work casts hyperparameter optimization as an instance of non-stochastic best-arm identification, identifies a known algorithm that is well-suited for this setting, and empirically evaluates its behavior.
Generalized Rank-Breaking: Computational and Statistical Tradeoffs
TLDR
To compute the pseudo-maximum likelihood estimate of the set-wise comparisons, this work provides a generalization of the minorization maximization algorithm and gives guarantees on its convergence.
An Asymptotically Optimal Multi-Armed Bandit Algorithm and Hyperparameter Optimization
TLDR
This paper proposes an efficient and robust bandit-based algorithm called Sub-Sampling (SS), which evaluates the potential of hyperparameters by the sub-samples of observations and is theoretically proved to be optimal under the criterion of cumulative regret.
Multiple Optimality Guarantees in Statistical Learning
TLDR
This thesis considers the fundamental questions that arise when trading off multiple such criteria (computation, communication, privacy) while maintaining statistical performance, and proves fundamental lower bounds on the statistical performance of any algorithm subject to computational, confidentiality, or communication constraints.
Computation-Risk Tradeoffs for Covariance-Thresholded Regression
TLDR
This analysis shows how the risk of this family of linear regression estimators varies with the sparsity and regularization level, thus establishing a statistical estimation setting for which there is an explicit, smooth tradeoff between risk and computation.
Computational and Statistical Tradeoffs in Learning to Rank
TLDR
In the application of learning to rank, this work provides a hierarchy of rank-breaking mechanisms, ordered by the complexity of the resulting sketch of the data, that allows the number of data points collected to be gracefully traded off against the computational resources available while guaranteeing the desired level of accuracy.
Computational and statistical tradeoffs via convex relaxation
TLDR
This paper defines a notion of “algorithmic weakening,” in which a hierarchy of algorithms is ordered by both computational efficiency and statistical efficiency, allowing the growing strength of the data at scale to be traded off against the need for sophisticated processing.
High-dimensional change-point estimation: Combining filtering with convex optimization
TLDR
The main result of this paper shows that the method performs change-point estimation reliably as long as the product of the smallest-sized change and the smallest distance between change-points is larger than a Gaussian width parameter that characterizes the low-dimensional complexity of the underlying signal sequence.
Automating model search for large scale machine learning
TLDR
An architecture for automatic machine learning at scale is proposed, comprising a cost-based cluster resource allocation estimator, advanced hyperparameter tuning techniques, bandit resource allocation via runtime algorithm introspection, and physical optimization via batching and optimal resource allocation.
...

References

SHOWING 1-10 OF 36 REFERENCES
Fast Rates for Estimation Error and Oracle Inequalities for Model Selection
TLDR
It is shown that, provided the sequence of models is ordered by inclusion, tight upper bounds on estimation error can be used as a complexity penalty, even in situations where the difference between the empirical risk and true risk decreases much more slowly than the complexity penalty.
Risk bounds for model selection via penalization
TLDR
It is shown that the quadratic risk of the minimum penalized empirical contrast estimator is bounded by an index of the accuracy of the sieve, which quantifies, across the candidate models, the trade-off between approximation error and parameter dimension relative to sample size.
Model Selection and Error Estimation
TLDR
A tight relationship between error estimation and data-based complexity penalization is pointed out: any good error estimate may be converted into a data-based penalty function, and the performance of the estimate is governed by the quality of the error estimate.
Complexity regularization via localized random penalties
TLDR
This article proposes a new complexity-penalized model selection method based on data-dependent penalties, and considers the binary classification problem where, given a random observation X ∈ ℝ^d, one has to predict Y ∈ {0,1}.
Complexity Regularization with Application to Artificial Neural Networks
TLDR
This paper defines general complexity regularization criteria, establishes bounds on the statistical risk of the estimated functions, and shows consistency, rates of convergence, and the near asymptotic optimality of the model selection criterion in both parametric and nonparametric cases.
Empirical minimization
We investigate the behavior of the empirical minimization algorithm using various methods. We first analyze it by comparing the empirical, random, structure and the original one on the class…
Robust Stochastic Approximation Approach to Stochastic Programming
TLDR
It is intended to demonstrate that a properly modified SA approach can be competitive and even significantly outperform the SAA method for a certain class of convex stochastic problems.
Convexity, Classification, and Risk Bounds
TLDR
A general quantitative relationship between the risk as assessed using the 0–1 loss and the risk as assessed using any nonnegative surrogate loss function is provided, and it is shown that this relationship gives nontrivial upper bounds on excess risk under the weakest possible condition on the loss function.
Rademacher and Gaussian Complexities: Risk Bounds and Structural Results
TLDR
This work investigates the use of certain data-dependent estimates of the complexity of a function class called Rademacher and Gaussian complexities and proves general risk bounds in terms of these complexities in a decision theoretic setting.
On the generalization ability of on-line learning algorithms
TLDR
This paper proves tight data-dependent bounds for the risk of this hypothesis in terms of an easily computable statistic M_n associated with the on-line performance of the ensemble, and obtains risk tail bounds for kernel perceptron algorithms in terms of the spectrum of the empirical kernel matrix.
...