In this paper we address the problem of identifying differences between populations of trees. Besides the theoretical relevance of this problem, we are interested in testing if trees characterizing protein sequences from different families constitute samples of significantly different distributions. In this context, trees are obtained by modelling protein…

Given k independent samples of functional data, the problem of testing the null hypothesis of equality of their respective mean functions is considered. The setting is quite similar to that of the classical one-way ANOVA model, but the k samples under study consist of functional data. A simple, natural test for this problem is proposed. It can be seen as an…

- Antonio Cuevas, Manuel Febrero, Ricardo Fraiman
- 1999

Hartigan (1975) defines the number q of clusters in a d-variate statistical population as the number of connected components of the set {f > c}, where f denotes the underlying density function on R^d and c is a given constant. Some common clustering algorithms treat q as an input that must be given in advance. The authors propose a method for estimating this…
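Hartigan's definition can be illustrated in one dimension, where the connected components of {f > c} are simply intervals. The sketch below is not the authors' estimator (the abstract is truncated before it is described). It assumes a Gaussian kernel density estimate with a hand-picked bandwidth `h`, a hypothetical toy sample, and a grid scan that counts maximal runs of points where the estimate exceeds `c`. All function names are illustrative.

```python
import math

def kde(sample, h):
    """Gaussian kernel density estimate; the bandwidth h is an assumed, hand-picked value."""
    n = len(sample)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    def f_hat(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)
    return f_hat

def count_level_set_components(f, c, lo, hi, steps=2000):
    """Count maximal runs of grid points in [lo, hi] where f > c.

    In one dimension each run approximates one connected component
    (an interval) of the level set {f > c}.
    """
    count = 0
    inside = False
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        above = f(x) > c
        if above and not inside:
            count += 1  # entered a new component
        inside = above
    return count

# Hypothetical toy sample with two well-separated groups.
sample = [-3.2, -3.0, -2.9, -2.7, 2.8, 3.0, 3.1, 3.3]
f_hat = kde(sample, h=0.5)
q_hat = count_level_set_components(f_hat, c=0.05, lo=-6.0, hi=6.0)
print(q_hat)
```

The count depends on both the level c and the bandwidth. In dimension d > 1, a grid scan of this kind no longer suffices, and connectivity has to be determined some other way, for example through a graph built on the sample points.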

In this paper we introduce two procedures for variable selection in cluster analysis and classification rules. One is mainly oriented to detecting the "noisy" non-informative variables, while the other also deals with multicollinearity. A forward-backward algorithm is also proposed to make these procedures feasible for large data sets. A small simulation is…

- Antonio Cuevas, Ricardo Fraiman, Alberto Rodríguez-Casal
- 2007

The Minkowski content L_0(G) of a body G ⊂ R^d represents the boundary length (for d = 2) or the surface area (for d = 3) of G. A method for estimating L_0(G) is proposed. It relies on a nonparametric estimator based on the information provided by a random sample (taken on a rectangle containing G) in which we are able to identify whether every point is…

The possibility of considering random projections to identify probability distributions belonging to parametric families is explored. The results are based on considerations involving invariance properties of the family of distributions as well as on the random way of choosing the projections. In particular, it is shown that if a one-dimensional (suitably)…

The Spanish Antarctic Station Juan Carlos I has been registering surface air temperatures at a frequency of one reading every 10 min since the austral summer 1987–1988. Although this data set contains valuable information about the climate patterns in and around Antarctica, it has not been utilized in any existing climate studies thus far because of the…

In this paper we extend the notion of impartial trimming to a functional data framework, and we obtain resistant estimates of the center of a functional distribution. We give mild conditions for the existence and uniqueness of the functional trimmed means. We show the continuity of the population parameter with respect to the weak convergence of probability…

In this paper we provide conditions under which a distribution is determined by just one randomly chosen projection. Then we apply our results to construct goodness-of-fit tests for the one and two-sample problems. We include some simulations as well as the application of our results to a real data set. Our results are valid for every separable Hilbert…

Efficient automatic protein classification is of central importance in genomic annotation. As an independent way to check the reliability of the classification, we propose a statistical approach to test if two sets of protein domain sequences coming from two families of the Pfam database are significantly different. We model protein sequences as…