In this paper we introduce a new method for robust principal component analysis. Classical PCA is based on the empirical covariance matrix of the data and hence it is highly sensitive to outlying observations. In the past, two robust approaches have been developed. The first is based on the eigenvectors of a robust scatter matrix such as the MCD or an […]
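As background for that first approach (an illustration of the idea, not the new method this paper introduces), the following Python sketch replaces the empirical covariance by the MCD scatter from scikit-learn and takes its eigenvectors as robust principal components; the function and variable names are our own:

```python
import numpy as np
from sklearn.covariance import MinCovDet

def robust_pca_mcd(X, n_components=2, random_state=0):
    """Robust PCA via eigenvectors of the MCD scatter matrix (illustrative sketch)."""
    mcd = MinCovDet(random_state=random_state).fit(X)
    eigvals, eigvecs = np.linalg.eigh(mcd.covariance_)   # eigendecompose the robust scatter
    order = np.argsort(eigvals)[::-1]                    # sort by decreasing robust variance
    components = eigvecs[:, order[:n_components]]        # robust loading vectors
    scores = (X - mcd.location_) @ components            # robust scores
    return components, scores

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:10] += 10                                             # contaminate with gross outliers
components, scores = robust_pca_mcd(X)
print(components.shape, scores.shape)                    # (5, 2) (200, 2)
```

Because the MCD needs more observations than dimensions, this scatter-matrix route is limited to relatively low-dimensional data.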
When analyzing data, outlying observations cause problems because they may strongly influence the result. Robust statistics aims at detecting the outliers by searching for the model fitted by the majority of the data. We present an overview of several robust methods and outlier detection tools. We discuss robust procedures for univariate, low-dimensional, […]
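For the univariate case, one classical robust flagging rule (a minimal sketch; the cutoff of 2.5 is a common convention, not necessarily the one used in the overview) standardizes each observation by the median and the MAD:

```python
import numpy as np

def mad_outliers(x, cutoff=2.5):
    """Flag univariate outliers via median/MAD standardization (illustrative rule)."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = 1.4826 * np.median(np.abs(x - med))  # 1.4826 makes the MAD consistent at the normal
    return np.abs(x - med) / mad > cutoff

x = np.concatenate([np.random.default_rng(1).normal(size=100), [8.0, -9.0]])
print(np.where(mad_outliers(x))[0])            # indices of the flagged points
```

Unlike the mean and standard deviation, the median and MAD are barely affected by the outliers themselves, so the rule still finds them.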
  • Mia Hubert (mia.hubert@wis.kuleuven.be), Johan A. K. Suykens (johan.suykens@esat.kuleuven.be)
Recent results about the robustness of kernel methods involve the analysis of influence functions. By definition the influence function is closely related to leave-one-out criteria. In statistical learning, the latter is often used to assess the generalization of a method. In statistics, the influence function is used in a similar way to analyze the […]
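The relation is easiest to see for the sample mean, whose influence function at x is IF(x) = x − μ: deleting observation i changes the mean by exactly (x_i − x̄)/(n − 1), i.e. approximately IF(x_i)/n. A quick numerical check (our own toy illustration, outside the kernel setting of the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
n, mean = len(x), x.mean()

i = 0
loo_mean = np.delete(x, i).mean()   # exact leave-one-out mean
print(mean - loo_mean)              # exact change when observation i is deleted
print((x[i] - mean) / (n - 1))      # influence-function-style expression: identical here
```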
A collection of n hyperplanes in R^d forms a hyperplane arrangement. The depth of a point p ∈ R^d is the smallest number of hyperplanes crossed by any ray emanating from p. For d = 2 we prove that there always exists a point with depth at least ⌈n/3⌉. For higher dimensions we conjecture that the maximal depth is at least ⌈n/(d + 1)⌉. For arrangements in general […]
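In the plane the depth of a given point can be computed exactly: the number of lines a ray crosses only changes when the ray's direction becomes parallel to one of the lines, so it suffices to test one direction in each angular interval between those critical directions. A sketch, assuming lines in general position (names are our own):

```python
import numpy as np

def arrangement_depth_2d(p, A, b):
    """Depth of point p in the line arrangement {x : A[i] @ x = b[i]} (2D sketch).

    A ray p + t*u (t > 0) crosses line i iff (b[i] - A[i] @ p) / (A[i] @ u) > 0,
    so the crossing count is piecewise constant in the direction u and can only
    change where A[i] @ u = 0, i.e. where u is parallel to some line.
    """
    p, A, b = np.asarray(p, float), np.asarray(A, float), np.asarray(b, float)
    crit = np.arctan2(A[:, 0], -A[:, 1])                  # directions parallel to each line
    crit = np.sort(np.concatenate([crit, crit + np.pi]) % (2 * np.pi))
    gaps = np.diff(np.append(crit, crit[0] + 2 * np.pi))
    best = len(b)
    for theta in crit + gaps / 2:                         # one test direction per interval
        u = np.array([np.cos(theta), np.sin(theta)])
        t = (b - A @ p) / (A @ u)
        best = min(best, int(np.sum(t > 0)))
    return best

# Lines x = 0, y = 0, x + y = 2 form a triangle; a point inside has depth 1 = ceil(3/3).
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([0.0, 0.0, 2.0])
print(arrangement_depth_2d([0.5, 0.5], A, b))             # 1
```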
In extreme value statistics, the extreme value index is a well-known parameter to measure the tail heaviness of a distribution. Pareto-type distributions, with strictly positive extreme value index (or tail index), are considered. The most prominent extreme value methods are constructed on efficient maximum likelihood estimators based on specific parametric […]
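The classical nonparametric benchmark here is the Hill estimator of a strictly positive tail index, shown below as background (the maximum likelihood estimators the text refers to are a different construction):

```python
import numpy as np

def hill_estimator(x, k):
    """Hill estimator of the extreme value index from the k largest observations."""
    xs = np.sort(np.asarray(x, dtype=float))
    return np.mean(np.log(xs[-k:]) - np.log(xs[-k - 1]))  # mean log-excess over X_(n-k)

rng = np.random.default_rng(0)
x = rng.pareto(a=2.0, size=5000) + 1.0   # Pareto(1) tail with true index 1/2
print(hill_estimator(x, k=200))          # roughly 0.5
```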
The goal of discriminant analysis is to obtain rules that describe the separation between groups of observations. Moreover, it allows one to classify new observations into one of the known groups. In the classical approach, discriminant rules are often based on the empirical mean and covariance matrix of the data, or of parts of the data. But because these […]
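A toy sketch of the robust alternative such rules invite (our own code, not the paper's procedure): a linear discriminant rule whose group locations and pooled scatter come from the MCD instead of the empirical mean and covariance:

```python
import numpy as np
from sklearn.covariance import MinCovDet

def robust_lda(X, y, X_new):
    """Linear discriminant rule with MCD location/scatter per group (sketch)."""
    groups = np.unique(y)
    means, priors, pooled = {}, {}, np.zeros((X.shape[1], X.shape[1]))
    for g in groups:
        Xg = X[y == g]
        mcd = MinCovDet(random_state=0).fit(Xg)
        means[g] = mcd.location_
        priors[g] = len(Xg) / len(X)
        pooled += (len(Xg) - 1) * mcd.covariance_   # pool the robust scatter matrices
    prec = np.linalg.inv(pooled / (len(X) - len(groups)))
    scores = np.column_stack([
        np.log(priors[g])
        - 0.5 * np.einsum('ij,jk,ik->i', X_new - means[g], prec, X_new - means[g])
        for g in groups])                           # robust Mahalanobis-based scores
    return groups[np.argmax(scores, axis=1)]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.repeat([0, 1], 50)
print((robust_lda(X, y, X) == y).mean())            # in-sample accuracy, close to 1.0
```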
The outlier sensitivity of classical principal component analysis (PCA) has spurred the development of robust techniques. Existing robust PCA methods like ROBPCA work best if the non-outlying data have an approximately symmetric distribution. When the original variables are skewed, too many points tend to be flagged as outlying. A robust PCA method is […]
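For intuition on the skewness problem in one dimension, the adjusted boxplot of Hubert and Vandervieren stretches the fence on the long-tail side using the medcouple, a robust skewness measure. The sketch below uses the statsmodels implementation (an illustration of the skewness adjustment, not the robust PCA method itself; the exponential constants follow the published proposal):

```python
import numpy as np
from statsmodels.stats.stattools import medcouple

def adjusted_boxplot_flags(x):
    """Skewness-adjusted outlier flags via adjusted-boxplot fences (sketch)."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    mc = medcouple(x)                                # robust skewness in [-1, 1]
    if mc >= 0:
        lo = q1 - 1.5 * np.exp(-4 * mc) * iqr
        hi = q3 + 1.5 * np.exp(3 * mc) * iqr
    else:
        lo = q1 - 1.5 * np.exp(-3 * mc) * iqr
        hi = q3 + 1.5 * np.exp(4 * mc) * iqr
    return (x < lo) | (x > hi)

x = np.random.default_rng(0).lognormal(size=500)     # clean but right-skewed sample
print(adjusted_boxplot_flags(x).mean())              # far fewer flags than the classical
                                                     # 1.5*IQR boxplot rule would give
```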
Cross-validation (CV) is a very popular technique for model selection and model validation. The general procedure of leave-one-out CV is to exclude one observation from the data set, to construct the fit on the remaining observations, and to evaluate that fit on the item that was left out. In classical procedures such as least-squares regression or kernel […]
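A bare-bones version of that procedure for least-squares regression (illustrative code using scikit-learn's names):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.3, size=40)

errors = []
for train_idx, test_idx in LeaveOneOut().split(X):
    fit = LinearRegression().fit(X[train_idx], y[train_idx])  # fit without one observation
    pred = fit.predict(X[test_idx])                           # evaluate on the left-out item
    errors.append(float((y[test_idx] - pred) ** 2))
print(np.mean(errors))   # leave-one-out estimate of the prediction error
```

For least squares this loop can in fact be avoided: the leave-one-out residuals have a closed form in terms of the hat matrix, which is why CV is cheap in such classical procedures.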
The kurtosis coefficient is often regarded as a measure of the tail heaviness of a distribution relative to that of the normal distribution. However, it also measures the peakedness of a distribution, hence there is no agreement on what kurtosis really estimates. Another disadvantage of kurtosis is that its interpretation, and consequently its use, is […]
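A quick numeric illustration of the ambiguity (our own example): both a heavier-tailed and a more peaked shape push the excess kurtosis up, so the coefficient alone cannot tell the two apart:

```python
import numpy as np
from scipy.stats import kurtosis   # Fisher definition: excess kurtosis, 0 at the normal

rng = np.random.default_rng(0)
samples = {
    "normal":  rng.normal(size=100_000),
    "laplace": rng.laplace(size=100_000),    # more peaked AND heavier-tailed: ~ +3
    "uniform": rng.uniform(-1, 1, 100_000),  # flat center, no tails: ~ -1.2
}
for name, x in samples.items():
    print(f"{name:8s} {kurtosis(x):+.2f}")
```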