The Minimum Regularized Covariance Determinant Estimator

Authors: Kris Boudt, Peter Rousseeuw, Steven Vanduffel, Tim Verdonck
Journal: Big Data & Innovative Financial Technologies Research Paper Series
The Minimum Covariance Determinant (MCD) approach robustly estimates the location and scatter matrix using the subset of given size with lowest sample covariance determinant. Its main drawback is that it cannot be applied when the dimension exceeds the subset size. We propose the Minimum Regularized Covariance Determinant (MRCD) approach, which differs from the MCD in that the subset-based covariance matrix is a convex combination of a target matrix and the sample covariance matrix. A data…
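The convex-combination idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names, the identity target matrix, and the hand-picked weight `rho` are assumptions made for illustration, and the sketch omits any standardization, subset search, or data-driven choice of the weight. It only shows that a subset with fewer points than dimensions has a singular sample covariance matrix (determinant zero), while the regularized matrix `rho * T + (1 - rho) * S` has a strictly positive determinant.

```python
def sample_cov(rows):
    """Unbiased sample covariance matrix of a list of observations."""
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]


def det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, d = len(a), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[piv][c]) < 1e-12:
            return 0.0  # numerically singular
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d


def regularized_cov(rows, target, rho):
    """Convex combination rho * target + (1 - rho) * sample covariance.

    `target` and `rho` are illustrative inputs chosen by the caller.
    """
    s = sample_cov(rows)
    p = len(s)
    return [[rho * target[i][j] + (1 - rho) * s[i][j] for j in range(p)]
            for i in range(p)]


# Hypothetical subset: 2 observations in 3 dimensions (dimension > subset size).
subset = [[1.0, 2.0, 3.0], [2.0, 4.0, 7.0]]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

det_plain = det(sample_cov(subset))                     # singular case
det_reg = det(regularized_cov(subset, identity, 0.5))   # well-defined case
print(det_plain, det_reg)
```

With a positive-definite target and a weight strictly between 0 and 1, the combined matrix is positive definite whatever the subset size, which is why a determinant-based criterion remains usable in this setting.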
Minimum Covariance Determinant and Extensions
The Minimum Covariance Determinant method is reviewed, along with its main properties such as affine equivariance, breakdown value, and influence function, and two recent extensions of the MCD are described.
A Comparison of Methods for Estimating the Determinant of High-Dimensional Covariance Matrix
This paper compares eight covariance matrix estimation methods and provides practical guidelines, based on the sample size, the dimension, and the correlation of the data set, for estimating the determinant of a high-dimensional covariance matrix.
Generalization of the minimum covariance determinant algorithm for categorical and mixed data types
A generalized MCD is defined and illustrated on data from two large-scale projects: the Ontario Neurodegenerative Disease Research Initiative and the Alzheimer’s Disease Neuroimaging Initiative, with genetics, clinical instruments and surveys (categorical or ordinal), and neuroimaging (continuous) data.
A Comparison of Estimation Techniques for the Covariance Matrix in a Fixed-Income Framework
We compare various methodologies to estimate the covariance matrix in a fixed-income portfolio. Adopting a statistical approach for the robust estimation of the covariance matrix, we compared the…
An Efficient Estimation and Classification Methods for High Dimensional Data Using Robust Iteratively Reweighted SIMPLS Algorithm Based on nu-Support Vector Regression
The robust iteratively reweighted SIMPLS based on nu-Support Vector Regression, denoted SVR-RWSIMPLS, is proposed to classify observations into regular observations, vertical outliers, good leverage points (GLPs), and bad leverage points (BLPs).
Outlyingness: Which variables contribute most?
A fast and efficient method is proposed to detect variables that contribute most to an outlier’s outlyingness, and it is shown that the problem of estimating that direction can be rewritten as the normed solution of a classical least squares regression problem.
Testing equality of standardized generalized variances of k multivariate normal populations with arbitrary dimensions
For a p-variate normal distribution with covariance matrix \(\boldsymbol{\Sigma}\), the standardized generalized variance (SGV) is defined as the positive pth root of \(|\boldsymbol{\Sigma}|\)…
The power of monitoring: how to make the most of a contaminated multivariate sample
The findings support the claim that the principle of monitoring is very flexible and that it can lead to robust estimators that are as efficient as possible and address some of the tricky inferential issues that arise from monitoring.
Extension of Maximum Autocorrelation Factorization: With application to imaging mass spectrometry data
The goal of this thesis is to build upon the MAF algorithm and remove the current need for user input, making an extension of MAF that is fully unsupervised and produces factors ranked according to spatial autocorrelation.
Anomaly detection by robust statistics
An overview is presented of several robust methods and the resulting graphical outlier detection tools for univariate, low-dimensional, and high-dimensional data, covering location and scatter estimation, linear regression, principal component analysis, classification, clustering, and functional data analysis.


A Fast Algorithm for the Minimum Covariance Determinant Estimator
A new algorithm for the minimum covariance determinant (MCD) method, called FAST-MCD, is introduced; it makes the MCD method available as a routine tool for analyzing multivariate data. The distance-distance plot is also proposed, which displays MCD-based robust distances versus Mahalanobis distances.
Condition Number Regularized Covariance Estimation.
This paper proposes a maximum likelihood approach, with the direct goal of obtaining a well-conditioned estimator, and investigates the theoretical properties of the regularized covariance estimator comprehensively, including its regularization path, and develops an approach that adaptively determines the level of regularization that is required.
A well-conditioned estimator for large-dimensional covariance matrices
Many applied problems require a covariance matrix estimator that is not only invertible, but also well-conditioned (that is, inverting it does not amplify estimation error). For large-dimensional…
Robust High-Dimensional Precision Matrix Estimation
The dependency structure of multivariate data can be analyzed using the covariance matrix \(\boldsymbol{\varSigma }\). In many fields the precision matrix \(\boldsymbol{\varSigma }^{-1}\) is even…
High dimensional covariance matrix estimation using a factor model
High dimensionality comparable to sample size is common in many statistical problems. We examine covariance matrix estimation in the asymptotic framework that the dimensionality p tends to infinity…
Influence Function and Efficiency of the Minimum Covariance Determinant Scatter Matrix Estimator
The minimum covariance determinant (MCD) scatter estimator is a highly robust estimator for the dispersion matrix of a multivariate, elliptically symmetric distribution. It is relatively fast to…
The Distribution of Robust Distances
Mahalanobis-type distances in which the shape matrix is derived from a consistent, high-breakdown robust multivariate location and scale estimator have an asymptotic chi-squared distribution as is…
Robust Multivariate Regression
It is shown that the multivariate regression estimator has the appropriate equivariance properties, has a bounded influence function, and inherits the breakdown value of the MCD estimator, which confirms the good finite-sample results obtained from the simulations.
Fast and robust discriminant analysis
Robust discriminant rules are obtained by inserting robust estimates of location and scatter into generalized maximum likelihood rules at normal distributions, and the highly robust MCD estimator is used as it can be computed very fast for large data sets.
Outlier detection in the multiple cluster setting using the minimum covariance determinant estimator
Mahalanobis-type distances in which the shape matrix is derived from a consistent high-breakdown robust multivariate location and scale estimator can be used to find outlying points in a robust clustering method in conjunction with an outlier identification method.