Accurate Approximation to the Extreme Order Statistics of Gaussian Samples


Evaluation of the integral properties of Gaussian statistics is problematic because the Gaussian function is not analytically integrable. We show that the expected value of the greatest order statistic in Gaussian samples (the max distribution) can be accurately approximated by the expression $\Phi^{-1}(0.5264^{1/n})$, where $n$ is the sample size and $\Phi^{-1}$ is the inverse of the Gaussian cumulative distribution function. The expected value of the least order statistic in Gaussian samples (the min distribution) is correspondingly approximated by $-\Phi^{-1}(0.5264^{1/n})$. The standard deviation of both extreme order distributions can be approximated by the expression $0.5\,[\Phi^{-1}(0.8832^{1/n}) - \Phi^{-1}(0.2142^{1/n})]$. We also show that the probability density function of the extreme order distribution can be well approximated by gamma distributions with appropriate parameters. These approximations are accurate, computationally efficient, and readily implemented with built-in functions in many commercial mathematical software packages such as MATLAB, Mathematica, and Excel.

Copyright © 1999 by Marcel Dekker, Inc.

INTRODUCTION

Consider $n$ samples $x_1, x_2, \ldots, x_n$ from a standard Gaussian distribution, $N(0,1)$. The extreme order distributions are the distributions of the greatest and the least values among the $n$ samples. Let $x_{\max} = \max_i(x_i)$, $i = 1, 2, \ldots, n$, be the greatest of the $n$ sample values. The probability distribution of $x_{\max}$ has the density function

$$\mathrm{PDF}(x_{\max}) = n\,\Phi(x_{\max})^{n-1}\,\phi(x_{\max}) \qquad (1)$$

where $\phi(x)$ is the probability density function (PDF) and $\Phi(x)$ is the cumulative distribution function (CDF) of the standard Gaussian distribution (Bain & Engelhardt, 1987). The greatest order distribution PDFs of selected sample sizes are shown in Figure 1. The least of the $n$ samples, $x_{\min} = \min_i(x_i)$, $i = 1, 2, \ldots, n$, has the probability density function

$$\mathrm{PDF}(x_{\min}) = n\,\Phi(-x_{\min})^{n-1}\,\phi(-x_{\min}) \qquad (2)$$

Extreme order distributions are widely used in fields such as biology, psychophysics, economics, seismology, signal processing, and the analysis of parallel distributed noisy systems. They are particularly relevant in the analysis of stochastic resonance phenomena, where the addition of noise can increase the detectability of a signal derived from a nonlinear system (Bulsara et al., 1991; Bezrukov & Vodyanoy, 1997). However, since the Gaussian CDF $\Phi(x)$ cannot be expressed in terms of elementary functions, it is difficult to integrate $\Phi(x)$ analytically, and thus analytic solutions for the moments of the extreme order distributions are difficult to find. Statisticians have tried to find analytical solutions for the expected value and standard deviation of the extreme order distributions with a recurrence method (Jones, 1948; Ruben, 1954; Bose & Gupta, 1959; David, 1963). Although this method is successful for small sample sizes, it is tedious and fails for sample sizes $n \geq 6$ (Arnold & Balakrishnan, 1989; Harter & Balakrishnan, 1996), which makes it of limited utility.

FIG. 1. Probability density functions for the greatest order values of Gaussian samples with sample sizes n from 1 to 1,000,000 in decade steps. (Panel title: Distribution for the max of n Gaussian samples; horizontal axis: Gaussian deviates.)

The expected value and standard deviation of the extreme order distributions of Gaussian samples have been tabulated for selected sample sizes by numerical integration (Harter, 1961; Parrish, 1992a, b). Those tables are of limited practical use because they list only selected sample sizes, so the expected value and the standard deviation of the extreme order distributions remain unavailable for arbitrary sample sizes. Moreover, the accuracy of numerical integration depends on the range of the independent variable and the size of the bin used for integration. The wider the range and the smaller the bin size, the more accurate the integration.
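The role of the integration range and bin size can be made concrete with a small sketch. It evaluates the density of Eq. (1) on a regular grid and accumulates the first two moments with the midpoint rule. This is an illustration only, not the procedure used for the published tables: the function names, the default `half_range` and `bin_size`, and the choice of the midpoint rule are all assumptions made here, and only Python's standard library is used.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard Gaussian N(0, 1)


def max_pdf(x, n):
    """Density of the greatest of n standard Gaussian samples, Eq. (1)."""
    return n * _STD.cdf(x) ** (n - 1) * _STD.pdf(x)


def max_moments(n, half_range=10.0, bin_size=0.001):
    """Mean and standard deviation of the max of n samples.

    Integrates over [-half_range, half_range] with the midpoint rule.
    Both arguments are illustrative knobs (assumptions, not values from
    the text): a wider range and a smaller bin cost more evaluations.
    """
    steps = int(round(2 * half_range / bin_size))
    m1 = 0.0  # running estimate of E[x]
    m2 = 0.0  # running estimate of E[x^2]
    for k in range(steps):
        x = -half_range + (k + 0.5) * bin_size  # bin midpoint
        w = max_pdf(x, n) * bin_size            # probability mass in the bin
        m1 += x * w
        m2 += x * x * w
    return m1, (m2 - m1 * m1) ** 0.5


if __name__ == "__main__":
    for n in (1, 10, 100):
        mean, sd = max_moments(n)
        print(n, mean, sd)
```

The number of density evaluations is `2 * half_range / bin_size`, so the cost grows in direct proportion to the range and in inverse proportion to the bin size. By the symmetry between Eq. (1) and Eq. (2), the mean of the min distribution is the negative of the value returned here and its standard deviation is the same.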
Increasing the range and decreasing the bin size correspondingly increases the computation time, so accurate numerical integration is quite time consuming.

Blom (1958) suggested an expression to approximate the expected value $E_n$ of the greatest order distribution (max) numerically:

$$E_n = \Phi^{-1}\!\left(\frac{i - \alpha}{n - 2\alpha + 1}\right)$$

where $i$ is the rank of the order statistic ($i = n$ for the greatest value). However, the constant $\alpha$ changes continuously with the sample size $n$, and there is no simple relation between $\alpha$ and $n$. Thus, his method cannot compute the expected value of the max distribution for an arbitrary sample size.
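Both Blom's expression and the closed-form approximations stated in the abstract need only an inverse Gaussian CDF, so they can be compared side by side. The following is a minimal sketch under stated assumptions: the function names are hypothetical, `statistics.NormalDist.inv_cdf` from Python's standard library stands in for $\Phi^{-1}$, and the default `alpha=0.375` is the compromise value commonly quoted for Blom's formula rather than a value given in this text.

```python
from statistics import NormalDist

_INV = NormalDist().inv_cdf  # inverse CDF of the standard Gaussian


def blom_expected_max(n, alpha=0.375):
    """Blom's approximation with rank i = n (the greatest value).

    alpha=0.375 is an assumed default (a commonly quoted compromise);
    the text notes that the best alpha varies with n.
    """
    return _INV((n - alpha) / (n - 2 * alpha + 1))


def approx_expected_max(n):
    """Expected value of the max of n samples, as stated in the abstract."""
    return _INV(0.5264 ** (1.0 / n))


def approx_expected_min(n):
    """Expected value of the min of n samples (mirror image of the max)."""
    return -approx_expected_max(n)


def approx_sd_extreme(n):
    """Standard deviation of either extreme order distribution."""
    upper = _INV(0.8832 ** (1.0 / n))
    lower = _INV(0.2142 ** (1.0 / n))
    return 0.5 * (upper - lower)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, blom_expected_max(n), approx_expected_max(n),
              approx_sd_extreme(n))
```

Unlike Blom's expression, the abstract's formulas have no parameter to tune per sample size: the constants 0.5264, 0.8832, and 0.2142 are fixed and only the exponent $1/n$ changes.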

Cite this paper

@inproceedings{Chen2006AccurateAT, title={Accurate Approximation to the Extreme Order Statistics of Gaussian Samples}, author={Chien-Chung Chen and Christopher W. Tyler}, year={2006} }