We introduce a new clustering method that aims to fit clusters with different scatters and weights. To guarantee robustness, it is designed to handle a proportion α of contaminating data. As a characteristic feature, restrictions on the ratio between the maximum and the minimum eigenvalues of the groups...
The possibility of considering random projections to identify probability distributions belonging to parametric families is explored. The results are based on considerations involving invariance properties of the family of distributions as well as on the random way of choosing the projections. In particular, it is shown that if a one-dimensional (suitably)...
Two key questions in clustering problems are how to determine the number of groups properly and how to measure the strength of group assignments. These questions are especially involved when the presence of a certain fraction of outlying data is also expected. Any answer to these two key questions should depend on the assumed probabilistic model, the allowed group...
Robust estimators of location and dispersion are often used in the elliptical model to obtain an uncontaminated and highly representative subsample by trimming the data outside an ellipsoid based on the associated Mahalanobis distance. Here we analyze some one-step (or k-step) Maximum Likelihood Estimators computed on a subsample obtained with such a procedure...
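The trimming step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `trim_by_mahalanobis` and the `keep_fraction` parameter are hypothetical, and the location `center` and dispersion `scatter` are assumed to come from some robust estimator chosen by the user.

```python
import numpy as np

def trim_by_mahalanobis(X, center, scatter, keep_fraction=0.9):
    """Keep the rows of X that lie inside a Mahalanobis ellipsoid.

    center and scatter are assumed to be (robust) estimates of location
    and dispersion supplied by the caller; keep_fraction is a hypothetical
    tuning parameter giving the quantile used as the ellipsoid cutoff.
    """
    X = np.asarray(X, dtype=float)
    diff = X - center
    # Solve scatter @ z = diff.T instead of inverting scatter explicitly.
    solved = np.linalg.solve(scatter, diff.T).T
    # Squared Mahalanobis distance of each row to the center.
    d2 = np.einsum("ij,ij->i", diff, solved)
    cutoff = np.quantile(d2, keep_fraction)
    mask = d2 <= cutoff
    return X[mask], mask
```

Estimators such as the one-step or k-step ones mentioned above would then be computed on the returned subsample.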
Maximum likelihood estimation in the finite mixture of distributions setting is an ill-posed problem that is treatable, in practice, through the EM algorithm. However, the existence of spurious solutions (singularities and non-interesting local maximizers) makes it difficult for non-expert practitioners to find sensible mixture fits. In this work, a...
The use of trimming procedures constitutes a natural approach to robustifying statistical methods. This is the case for goodness-of-fit tests based on a distance, which can be modified by choosing trimmed versions of the distributions minimizing that distance. In this paper we consider the L2-Wasserstein distance and introduce the trimming methodology for...
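For orientation, the untrimmed L2-Wasserstein distance between two one-dimensional samples of equal size can be computed by matching sorted values. The sketch below shows only that baseline distance; it does not implement the trimming methodology of the paper, whose details are not given in the excerpt above, and the function name `wasserstein2_1d` is hypothetical.

```python
import numpy as np

def wasserstein2_1d(x, y):
    """L2-Wasserstein distance between two 1-D empirical distributions.

    Assumes both samples have the same size, in which case the optimal
    coupling pairs the i-th smallest value of x with the i-th smallest
    value of y.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equal sample sizes")
    return float(np.sqrt(np.mean((x - y) ** 2)))
```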
A new method for performing robust clustering is proposed. The method is designed with the aim of fitting clusters with different scatters and weights. A proportion α of contaminating data points is also allowed. Restrictions on the ratio between the maximum and the minimum eigenvalues of the groups' scatter matrices are introduced. These restrictions make...
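A check of such an eigenvalue-ratio restriction might look like the sketch below. It assumes the ratio is taken over the eigenvalues of all groups pooled together and compared against a user-chosen constant `c`; both that reading and the names `eigenvalue_ratio`, `satisfies_restriction` and `c` are assumptions made for illustration, not details confirmed by the excerpt.

```python
import numpy as np

def eigenvalue_ratio(scatters):
    """Ratio of the largest to the smallest eigenvalue across all groups.

    scatters is a list of symmetric positive definite scatter matrices,
    one per group (pooling across groups is an assumption of this sketch).
    """
    eigs = np.concatenate([np.linalg.eigvalsh(S) for S in scatters])
    return float(eigs.max() / eigs.min())

def satisfies_restriction(scatters, c):
    """True if the pooled eigenvalue ratio does not exceed the constant c."""
    return eigenvalue_ratio(scatters) <= c
```

With `c = 1` only spherical groups of equal scatter pass the check, while larger values of `c` admit increasingly different group shapes and sizes.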
We consider a k-sample problem, k > 2, where samples have been obtained from k (random) generators, and we are interested in identifying those samples, if any, that exhibit substantial deviations from a pattern given by most of the samples. This main pattern would consist of component samples which should exhibit some internal degree of similarity. To...