heal.abstract |
A generalization of Wilks's single-outlier test, suitable for the many-outlier problem of detecting from 1 to k outliers in a multivariate data set, is proposed, and appropriate critical values are determined. The method follows that suggested by Rosner, who applied the generalized extreme Studentized deviate sequentially to univariate samples of reducing size, so that the type I error is controlled both under the hypothesis of no outliers and under the alternative hypotheses of 1, 2, ..., k outliers. It is shown that critical values for the sequential application of Wilks's test to detect many outliers depend only on those for a single-outlier test, which may be approximated by percentage points from the F-distributions as tabulated by Wilks. Relationships between Wilks's test statistic, the Mahalanobis distance between the 'outlier' and the mean vector, and Hotelling's T²-test between the outlier and the rest of the data are used to reduce the amount of computation involved in applying the sequential procedure. Simulations are used to show that the method behaves well in detecting multiple outliers in samples larger than about 25. Finally, an example with three dimensions is used to illustrate how the method is applied. |
en |
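The abstract mentions relationships between Wilks's statistic, the Mahalanobis distance, and Hotelling's T² without stating them. The block below is a sketch of the standard identities, not a quotation from the paper. The notation is assumed for illustration: n observations x_1, ..., x_n in p dimensions, sample mean \bar{x}, scatter matrix A, covariance S = A/(n-1), and a subscript (i) for quantities computed with observation i removed.

```latex
% Assumed notation (not taken from the abstract):
%   A       = \sum_j (x_j - \bar{x})(x_j - \bar{x})^\top,   S = A/(n-1)
%   A_{(i)} = scatter matrix of the n-1 points left after deleting x_i
\begin{align*}
A_{(i)} &= A - \frac{n}{n-1}\,(x_i - \bar{x})(x_i - \bar{x})^\top, \\[4pt]
W_i &= \frac{|A_{(i)}|}{|A|}
     = 1 - \frac{n}{n-1}\,(x_i - \bar{x})^\top A^{-1} (x_i - \bar{x})
     = 1 - \frac{n\,D_i^2}{(n-1)^2},
     \qquad D_i^2 = (x_i - \bar{x})^\top S^{-1} (x_i - \bar{x}), \\[4pt]
W_i &= \frac{1}{1 + T_i^2/(n-2)},
     \qquad T_i^2 = \frac{n-1}{n}\,
       (x_i - \bar{x}_{(i)})^\top S_{(i)}^{-1} (x_i - \bar{x}_{(i)}),
     \quad S_{(i)} = \frac{A_{(i)}}{n-2}, \\[4pt]
F_i &= \frac{n-p-1}{p}\cdot\frac{1 - W_i}{W_i}
     = \frac{n-p-1}{p\,(n-2)}\,T_i^2 .
\end{align*}
```

Under these identities the smallest W_i belongs to the observation with the largest D_i², so each step of a sequential procedure needs only one mean vector and one inverse covariance matrix rather than n separate determinants. For a single prespecified observation from a multivariate normal sample, F_i follows an F-distribution with p and n-p-1 degrees of freedom, which is consistent with the F-based approximation to the critical values mentioned in the abstract.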