- Published 2013 in Computational Statistics & Data Analysis

We consider a k-sample problem, k > 2, where samples have been obtained from k (random) generators, and we are interested in identifying those samples, if any, that exhibit substantial deviations from a pattern given by most of the samples. This main pattern would consist of component samples that exhibit some internal degree of similarity. Handling similarity of this kind can be of interest in a variety of situations. As an example, imagine a nationwide evaluation test in which several markers evaluate exams coming from all over the country. The interest lies in determining whether there are markers whose grades exhibit significant deviations from a generalized pattern. A null hypothesis of homogeneity is too strong to be realistic, because of the differences in the backgrounds of the students involved, and similarity seems more appropriate. To detect deviations we need some pattern to use as a reference, which in our setup is hidden. In this paper we develop a statistical procedure designed to search for a main pattern, detecting the samples that are significantly less similar with respect to (a pooled version of) the others. This is done through a probability metric, a bootstrap approach and a stepwise search algorithm. Moreover, the procedure also makes it possible to identify which part of each sample makes it different from the others.
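
The three ingredients named above (a probability metric, a bootstrap and a stepwise search) can be illustrated with a minimal Python sketch. It is not the procedure developed in the paper. As simplifying assumptions, it uses the Kolmogorov-Smirnov distance as the metric and a plain pooled-resampling bootstrap, which tests homogeneity rather than the weaker notion of similarity the abstract argues for. All function names and the defaults `alpha`, `n_boot` and `seed` are hypothetical.

```python
import random
from bisect import bisect_right


def ks_distance(a, b):
    """Kolmogorov-Smirnov distance: largest gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    return max(
        abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
        for x in sa + sb
    )


def pooled_rest(samples, i, active):
    """Pool every active sample except sample i."""
    return [x for j in active if j != i for x in samples[j]]


def bootstrap_pvalue(sample, rest, n_boot, rng):
    """Share of pooled resamples whose distance reaches the observed one."""
    observed = ks_distance(sample, rest)
    pooled = sample + rest
    exceed = 0
    for _ in range(n_boot):
        fake_sample = rng.choices(pooled, k=len(sample))
        fake_rest = rng.choices(pooled, k=len(rest))
        if ks_distance(fake_sample, fake_rest) >= observed:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)


def stepwise_search(samples, alpha=0.05, n_boot=200, seed=0):
    """Repeatedly drop the sample farthest from the pooled rest while significant.

    Assumed stopping rule (not taken from the paper): stop at the first
    non-significant candidate or when only two samples remain.
    """
    rng = random.Random(seed)
    active = list(range(len(samples)))
    flagged = []
    while len(active) > 2:
        dists = {
            i: ks_distance(samples[i], pooled_rest(samples, i, active))
            for i in active
        }
        worst = max(dists, key=dists.get)
        rest = pooled_rest(samples, worst, active)
        if bootstrap_pvalue(samples[worst], rest, n_boot, rng) > alpha:
            break
        flagged.append(worst)
        active.remove(worst)
    return flagged, active
```

Calling `stepwise_search(samples)` on a list of numeric lists returns the indices flagged as deviating, in order of removal, together with the indices left in the candidate main pattern. The sketch does not attempt the last step mentioned in the abstract, identifying which part of a sample makes it different.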

@article{lvarezEsteban2013SearchingFA,
title={Searching for a common pooling pattern among several samples},
author={Pedro C. {\'A}lvarez-Esteban and Eustasio del Barrio and Juan Antonio Cuesta-Albertos and Carlos Matr{\'a}n},
journal={Computational Statistics \& Data Analysis},
year={2013},
volume={67},
pages={1--14}
}