Noureddine El Karoui

Estimating covariance matrices is a problem of fundamental importance in multivariate statistics. In practice it is increasingly frequent to work with data matrices X of dimension n × p, where p and n are both large. Results from random matrix theory show very clearly that in this setting, standard estimators like the sample covariance matrix perform in…
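The high-dimensional failure of the sample covariance matrix described above is easy to see numerically. The following sketch (parameters chosen for illustration, not from the paper) draws data with identity population covariance, so every population eigenvalue is 1, yet the sample eigenvalues spread over an interval of width predicted by the Marchenko–Pastur law:

```python
import numpy as np

# Illustration: eigenvalue spreading of the sample covariance matrix when
# p is of the same order as n. The true covariance is I_p, so all population
# eigenvalues equal 1, but the sample eigenvalues spread over roughly
# [(1 - sqrt(p/n))^2, (1 + sqrt(p/n))^2] (the Marchenko-Pastur bulk).
rng = np.random.default_rng(0)
n, p = 500, 250                      # aspect ratio p/n = 0.5
X = rng.standard_normal((n, p))
S = X.T @ X / n                      # sample covariance (data has mean zero)
eigs = np.linalg.eigvalsh(S)
# For p/n = 0.5 the bulk edges are about 0.086 and 2.91 -- far from 1.
```

With p/n = 0.5 the largest sample eigenvalue is near 2.9 and the smallest near 0.09, even though every true eigenvalue is exactly 1; this is the inconsistency the random matrix results make precise.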
We study regression M-estimates in the setting where p, the number of covariates, and n, the number of observations, are both large, but p ≤ n. We find an exact stochastic representation for the distribution of β̂ = argmin_{β ∈ ℝ^p} Σ_{i=1}^n ρ(Y_i − X_iᵀβ) at fixed p and n under various assumptions on the objective function ρ and our statistical model. A…
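A minimal sketch of the estimator studied above, with ρ taken to be the Huber loss as an illustrative choice and the fit computed by iteratively reweighted least squares (IRLS); the function name, threshold delta, and simulation parameters are assumptions for illustration, not the paper's:

```python
import numpy as np

# Sketch: the regression M-estimate beta_hat minimizing sum_i rho(Y_i - X_i' beta)
# for the Huber loss rho, computed by iteratively reweighted least squares.
def huber_m_estimate(X, Y, delta=1.345, n_iter=100):
    beta = np.linalg.lstsq(X, Y, rcond=None)[0]        # least-squares start
    for _ in range(n_iter):
        r = Y - X @ beta
        # Huber weights: 1 for small residuals, delta/|r| for large ones
        w = np.minimum(1.0, delta / np.maximum(np.abs(r), 1e-12))
        Xw = w[:, None] * X
        beta = np.linalg.solve(X.T @ Xw, Xw.T @ Y)     # weighted normal equations
    return beta

rng = np.random.default_rng(0)
n, p = 200, 50                                         # p and n of the same order
X = rng.standard_normal((n, p))
beta_true = np.ones(p)
Y = X @ beta_true + rng.laplace(size=n)                # heavier-than-Gaussian noise
beta_hat = huber_m_estimate(X, Y)
```

The paper's point is that when p/n is not small, the classical asymptotic distribution of such β̂ no longer applies, which is what the exact stochastic representation addresses.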
We place ourselves in the setting of high-dimensional statistical inference, where the number of variables p in a dataset of interest is of the same order of magnitude as the number of observations n. We consider the spectrum of certain kernel random matrices, in particular n × n matrices whose (i, j)-th entry is f(Xᵢᵀ Xⱼ / p) or f(‖Xᵢ − Xⱼ‖² / p), where…
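Both kinds of kernel random matrix described above are straightforward to construct; the sketch below uses f = exp as an illustrative kernel (the choice of f and all parameters here are assumptions, not the paper's):

```python
import numpy as np

# Sketch: the two families of n x n kernel random matrices whose spectra
# the paper studies, built from an n x p data matrix X with n ~ p.
rng = np.random.default_rng(1)
n, p = 300, 300
X = rng.standard_normal((n, p))
G = X @ X.T / p                        # inner products X_i . X_j / p
K_dot = np.exp(G)                      # entries f(X_i' X_j / p)
sq = np.diag(G)
D = sq[:, None] + sq[None, :] - 2 * G  # squared distances ||X_i - X_j||^2 / p
K_dist = np.exp(-D)                    # entries f(||X_i - X_j||^2 / p)
spec = np.linalg.eigvalsh(K_dot)       # the spectrum of interest
```

In this high-dimensional regime the spectrum of such a matrix behaves quite differently from the fixed-p, large-n kernel setting, which is what makes its study nontrivial.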
We study the properties of solutions of quadratic programs with linear equality constraints whose parameters are estimated from data in the high-dimensional setting where p, the number of variables in the problem, is of the same order of magnitude as n, the number of observations used to estimate the parameters. The Markowitz problem in Finance is a subcase…
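The simplest instance of such a quadratic program is the minimum-variance Markowitz problem with a budget constraint: minimize wᵀΣw subject to 1ᵀw = 1, whose KKT conditions give the closed form w = Σ⁻¹1 / (1ᵀΣ⁻¹1). The sketch below plugs in a sample covariance estimated from simulated data, matching the paper's setting of estimated parameters (the simulation itself is illustrative):

```python
import numpy as np

# Sketch: a quadratic program with one linear equality constraint,
#   minimize w' Sigma w  subject to  1'w = 1,
# solved via its KKT closed form w = Sigma^{-1} 1 / (1' Sigma^{-1} 1),
# with Sigma replaced by a sample covariance estimated from data.
rng = np.random.default_rng(2)
n, p = 500, 100                         # n and p of the same order
R = rng.standard_normal((n, p))         # simulated asset returns
Sigma_hat = np.cov(R, rowvar=False)
ones = np.ones(p)
x = np.linalg.solve(Sigma_hat, ones)
w = x / (ones @ x)                      # plug-in minimum-variance portfolio
```

The question the paper addresses is how the properties of this plug-in solution w differ from those of the oracle solution computed with the true Σ when p/n is not small.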
We use a rank one Gaussian perturbation to derive a smooth stochastic approximation of the maximum eigenvalue function. We then combine this smoothing result with an optimal smooth stochastic optimization algorithm to produce an efficient method for solving maximum eigenvalue minimization problems. We show that the complexity of this new method is lower…
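A toy Monte Carlo illustration of the smoothing idea (not the paper's actual construction or step-size analysis): replace the nonsmooth function λ_max(A) by its expectation under a small rank-one Gaussian perturbation A + ε g gᵀ. The function name, ε, and the sample count are illustrative assumptions:

```python
import numpy as np

# Toy sketch: smooth stochastic approximation of lambda_max(A) via averaging
# over rank-one Gaussian perturbations A + eps * g g'. Since g g' is positive
# semidefinite, each perturbed maximum eigenvalue is at least lambda_max(A).
def smoothed_lambda_max(A, eps=0.1, n_samples=200, seed=4):
    rng = np.random.default_rng(seed)
    k = A.shape[0]
    vals = []
    for _ in range(n_samples):
        g = rng.standard_normal(k)
        vals.append(np.linalg.eigvalsh(A + eps * np.outer(g, g)).max())
    return float(np.mean(vals))

val = smoothed_lambda_max(np.eye(5))   # lambda_max(I_5) = 1; smoothed value is
                                       # 1 + eps * E||g||^2 = 1.5 in expectation
```

The averaged function is smooth in A, which is what allows fast first-order methods to be applied to maximum eigenvalue minimization.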
We study the realized risk of Markowitz portfolio computed using parameters estimated from data and generalizations to similar questions involving the out-of-sample risk in quadratic programs with linear equality constraints. We do so under the assumption that the data is generated according to an elliptical model, which allows us to study models where we…
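The gap between in-sample and realized risk described above can be seen in a simple Gaussian simulation (a special case of the elliptical model; all parameters here are illustrative). With true covariance I_p, the optimizer's reported risk of the plug-in portfolio systematically underestimates the truth, and the realized risk exceeds the true optimum 1/p:

```python
import numpy as np

# Sketch: in-sample vs realized (out-of-sample) risk of the plug-in
# minimum-variance portfolio when the true covariance is the identity.
rng = np.random.default_rng(3)
n, p = 400, 200                           # p/n = 0.5, high-dimensional regime
R = rng.standard_normal((n, p))           # returns with true covariance I_p
Sigma_hat = np.cov(R, rowvar=False)
ones = np.ones(p)
x = np.linalg.solve(Sigma_hat, ones)
w = x / (ones @ x)                        # plug-in minimum-variance portfolio
in_sample = w @ Sigma_hat @ w             # risk the optimizer reports
realized = w @ w                          # true risk, since Sigma = I_p
# in_sample ~ (1 - p/n)/p underestimates the true optimum 1/p,
# while realized ~ 1/((1 - p/n) p) exceeds it.
```

Quantifying exactly this optimism gap, and its dependence on p/n and on the tails of the elliptical model, is the subject of the paper.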
We consider, for the first time in the modern setting of high-dimensional statistics, the classic problem of optimizing the objective function in regression. We propose an algorithm to compute this optimal objective function that takes into account the dimensionality of the problem. In this article we study a fundamental statistical problem: how…
Regularization is a technique widely used to improve the stability of solutions to statistical problems. We propose a new regularization concept, performance-based regularization (PBR), for data-driven stochastic optimization. The goal is to improve upon Sample Average Approximation (SAA) in finite-sample performance while maintaining minimal assumptions…