Journal of Modern Mathematics and Statistics

ISSN: Online
ISSN: Print 1994-5388

Empirical Comparison of the Kruskal Wallis Statistics and its Parametric Counterpart

Samuel O. Adams, Ezra Gayawan and Mohammed K. Garba
Page: 38-42 | Received 21 Sep 2022, Published online: 21 Sep 2022

Abstract

The nonparametric Kruskal-Wallis statistic (H-test) and its parametric counterpart, the one-way ANOVA (F-test), are two powerful statistics commonly used to compare k (>2) samples. The Monte Carlo approach was used in this study to compare the performance of the two statistics, especially when the assumptions of normality and homogeneity of variance are violated. Data were generated from normal, exponential and Poisson distributions. It was discovered that when the samples are from normal and Poisson distributions, the F-test is more powerful than the H-test, but the reverse is the case when they are from the exponential distribution. However, when the sample size increases to, say, ≥15, the two statistics perform equally well in terms of power, irrespective of the distribution from which the samples are drawn.


INTRODUCTION

A common problem in many areas of applied statistics is that of deciding whether several samples should be regarded as coming from the same population. Often, the samples differ and the question is whether such differences result from differences among the populations or are merely due to the chance variation expected among random samples. Two statistical methods for dealing with this problem of comparing the means of k (>2) populations are the parametric one-way analysis of variance (F-test) and the non-parametric H-test proposed by Kruskal and Wallis (1952), which is therefore commonly referred to as the Kruskal-Wallis statistic.

The classical F-test is more stringent in its assumptions than the H-test. Its most stringent assumption is that all samples are drawn from normal populations. Also, the measurement scale should be at least interval. Other assumptions are that all samples must be mutually independent and that the variances are homogeneous (Daniel, 1990). The H-test does not require the samples to come from normal populations and the measurement scale need only be at least ordinal. In many experimental situations, the normality assumption is not very realistic and sometimes the samples may not be continuous, though continuity is commonly assumed and the classical approach of comparing the samples is often used (Laan and Verdooren, 1987). This may yield misleading results.

Many researchers have examined the performance of these two statistics in terms of their strengths and limitations. Laan and Verdooren (1987) are of the view that where the experimenter is not sure whether the normality assumption is realistic for the classical F-test, a non-parametric test should be used since, according to them, it is possible that the non-parametric test has larger power than its classical counterpart in this situation. Rust and Fligner (1984) proposed a modification to the H-test for comparing several samples which, they contended, requires fewer assumptions about the shape of the population. Other researchers who have commented on the classical F-test and its non-parametric counterpart include Conover (1999), Hettmansperger and McKean (1978), Puri and Sen (1971), Randles and Wolfe (1979), Buringer et al. (1980) and Hollander and Wolfe (1999).

In this study, we investigate the robustness of the classical F-test relative to the non-parametric H-test when the assumptions of normality and homogeneity of variance are violated, under three different scenarios, using the Monte Carlo approach. The power function for each scenario was computed at significance levels (α) of 0.01, 0.05 and 0.1.

TEST STATISTICS FOR COMPARING K (>2) POPULATIONS
The one-way ANOVA (F-test): Let a random sample of size ni be drawn from population i (i = 1, 2, …, k). The resulting random variables Xij (i = 1, 2, …, k; j = 1, 2, …, ni) are assumed to be normally distributed N(μi, σ²). The linear model of the form Xij = μi + Eij is usually used, with independent errors Eij that are N(0, σ²) (Laan and Verdooren, 1987; Oyejola, 2003). The observations can thus be classified according to a classification A with classes A1, A2, …, Ak, where the ni observations Xij (j = 1, 2, …, ni) belong to class Ai. For testing the null hypothesis H0: μ1 = μ2 = ... = μk against the alternative H1: at least one pair of μ's is unequal, the test statistic F = MSA/MSE is used, where MSA = SSA/(k-1) and SSA, the sum of squares of A, equals:

$$\mathrm{SSA} = \sum_{i=1}^{k} n_i \left(\bar{X}_{i.} - \bar{X}_{..}\right)^2$$

and

$$\mathrm{MSE} = \frac{\mathrm{SSE}}{\sum_{i=1}^{k} n_i - k}$$

and SSE, the error sum of squares, equals

$$\mathrm{SSE} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} \left(X_{ij} - \bar{X}_{i.}\right)^2$$

Here $\bar{X}_{i.}$ denotes the mean of the i-th sample and $\bar{X}_{..}$ the mean of all observations. Under H0, F∼F(a, e) with a = k-1 and

$$e = \sum_{i=1}^{k} n_i - k$$

degrees of freedom.
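
The computation of F = MSA/MSE described above can be sketched in a few lines. This is only an illustrative sketch in plain Python, not the procedure used in the study (which relied on SPSS); the function name `f_statistic` and the toy data are assumptions made for the example.

```python
def f_statistic(samples):
    """Return (F, a, e) for one-way ANOVA on a list of samples.

    samples: list of k lists of numbers (sample i has n_i observations).
    a = k - 1 and e = N - k are the two degrees of freedom.
    """
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    group_means = [sum(s) / len(s) for s in samples]

    # SSA: between-sample sum of squares, weighted by sample size
    ssa = sum(len(s) * (m - grand_mean) ** 2
              for s, m in zip(samples, group_means))
    # SSE: within-sample (error) sum of squares
    sse = sum((x - m) ** 2
              for s, m in zip(samples, group_means) for x in s)

    a, e = k - 1, n_total - k
    return (ssa / a) / (sse / e), a, e


# Hypothetical toy data: three samples of three observations each
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_value, df_a, df_e = f_statistic(groups)
print(f_value, df_a, df_e)
```

For the toy data, SSA = 26 and SSE = 6, so MSA = 13, MSE = 1 and F = 13 on (2, 6) degrees of freedom.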

The Kruskal-Wallis test (H-test): Let a random sample of size ni be drawn from population i (i = 1, 2, …, k). The resulting random variables Xij (i = 1, 2, …, k; j = 1, 2, …, ni) are assumed to come from populations with continuous distribution functions Fi. The hypothesis to test is of the form H0: F1(x) = F2(x) = … = Fk(x). In applying the H-test, all the observations are ranked in increasing order of magnitude with ranks 1, 2, …, N. The H statistic is

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_{i.}^{2}}{n_i} - 3(N+1)$$

where Rij is the rank of Xij (i = 1, 2, …, k; j = 1, 2, …, ni) and, with the usual notation,

$$R_{i.} = \sum_{j=1}^{n_i} R_{ij}, \qquad N = \sum_{i=1}^{k} n_i.$$

Details of the critical values of H for a few samples, each having <5 observations, can be found in Daniel (1990) and Iman et al. (1975). When ni→∞, H has asymptotically a chi-square (χ²) distribution with k-1 degrees of freedom (χ²(k-1)).
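
The ranking step and the H statistic can be illustrated as follows. This is a minimal sketch in plain Python under stated assumptions: the function name `kruskal_wallis_h` is hypothetical, tied observations are given midranks (the text assumes continuous data, so ties are not discussed there), and no tie-correction factor is applied.

```python
def kruskal_wallis_h(samples):
    """Return the Kruskal-Wallis H statistic for a list of samples.

    All N observations are ranked jointly; tied values receive the
    average of the ranks they span (an assumption of this sketch).
    """
    # Pool the observations, remembering which sample each came from
    pooled = sorted((x, i) for i, s in enumerate(samples) for x in s)
    n_total = len(pooled)
    rank_sums = [0.0] * len(samples)

    pos = 0
    while pos < n_total:
        # Find the run of tied values starting at position pos
        end = pos
        while end + 1 < n_total and pooled[end + 1][0] == pooled[pos][0]:
            end += 1
        midrank = ((pos + 1) + (end + 1)) / 2.0
        for idx in range(pos, end + 1):
            rank_sums[pooled[idx][1]] += midrank
        pos = end + 1

    weighted = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    return 12.0 * weighted / (n_total * (n_total + 1)) - 3.0 * (n_total + 1)


# Hypothetical toy data: completely separated samples
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_value = kruskal_wallis_h(groups)
print(h_value)
```

For the toy data the rank sums are 6, 15 and 24 with N = 9, giving H = (12/90)(12 + 75 + 192) - 30 = 7.2. With samples this small, the value would be compared with tabulated exact critical values, not with the chi-square approximation.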

Generation of random samples: The random samples used in this study were generated from the normal, exponential and Poisson distributions using the SPSS package. Four independent samples were generated from each of the three distributions to make up three scenarios for the study. Sample sizes of 2, 5, 10, 15, 20, 25 and 30 were used for the three scenarios, except for the normal scenario, where a sample size of 8 was also included. Each sample size was replicated 100 times to allow the power function to be computed.

In the first scenario, the normality and homogeneity of variance assumptions were upheld. The means used in generating the four independent samples were 10.0, 13.0, 15.0 and 18.0, respectively, with a common standard deviation of 5. For the second scenario, where the exponential distribution was used, the scale parameters were taken to be 0.2, 0.5, 2.0 and 5.0, respectively. By this, both the normality and homogeneity of variance assumptions were violated. In the third scenario, the means for the Poisson distribution were also taken to be 10.0, 13.0, 15.0 and 18.0, respectively. With this, not only were the two assumptions above violated, but the variables so generated were also not continuous.
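
The three scenarios can be mimicked with Python's standard `random` module, as in the sketch below. This is not the SPSS procedure used in the study. The function names are hypothetical, the exponential "scale" is assumed here to be the mean (so the rate passed to `expovariate` is 1/scale), and the Poisson draws use Knuth's multiplication method because the standard library has no Poisson generator.

```python
import math
import random

NORMAL_MEANS = (10.0, 13.0, 15.0, 18.0)   # common standard deviation 5
NORMAL_SD = 5.0
EXP_SCALES = (0.2, 0.5, 2.0, 5.0)         # assumed to be means (scale)
POISSON_MEANS = (10.0, 13.0, 15.0, 18.0)


def poisson_variate(rng, mean):
    """Draw one Poisson(mean) value by Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def generate_scenario(rng, name, n):
    """Return four independent samples of size n for the named scenario."""
    if name == "normal":
        return [[rng.gauss(mu, NORMAL_SD) for _ in range(n)]
                for mu in NORMAL_MEANS]
    if name == "exponential":
        return [[rng.expovariate(1.0 / scale) for _ in range(n)]
                for scale in EXP_SCALES]
    if name == "poisson":
        return [[poisson_variate(rng, mu) for _ in range(n)]
                for mu in POISSON_MEANS]
    raise ValueError("unknown scenario: " + name)


rng = random.Random(2009)  # arbitrary seed, for reproducibility only
for scenario in ("normal", "exponential", "poisson"):
    samples = generate_scenario(rng, scenario, 5)
    print(scenario, [len(s) for s in samples])
```

Repeating `generate_scenario` 100 times for each sample size would give the replications on which the rejection counts are based.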

RESULTS AND DISCUSSION

The classical F statistic and its non-parametric H counterpart were applied to all the data generated in order to determine whether the samples can be said to have come from the same population. The numbers of rejections out of 100 replications for the three scenarios at significance levels (α) of 0.01, 0.05 and 0.1 are shown in Tables 1-3, while the power functions computed for each of the scenarios are presented in Fig. 1-9.
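
The way empirical power is obtained as the proportion of rejections out of 100 replications can be sketched as follows for one configuration: the normal scenario with n = 10 per sample at α = 0.05. This is an illustrative sketch, not the authors' code. The critical values are assumptions taken from standard tables (about 2.866 for F(3, 36) and 7.815 for χ²(3)), the H-test uses the chi-square approximation, and the results will not reproduce the published tables because the random numbers differ.

```python
import random

MEANS = (10.0, 13.0, 15.0, 18.0)
SD = 5.0
N_PER_SAMPLE = 10
REPLICATIONS = 100
F_CRIT = 2.866   # approximate upper 5% point of F(3, 36), from tables
H_CRIT = 7.815   # upper 5% point of chi-square with 3 degrees of freedom


def f_statistic(samples):
    """One-way ANOVA statistic F = MSA / MSE."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [sum(s) / len(s) for s in samples]
    ssa = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    return (ssa / (k - 1)) / (sse / (n_total - k))


def h_statistic(samples):
    """Kruskal-Wallis H, assuming continuous data with no ties."""
    pooled = sorted((x, i) for i, s in enumerate(samples) for x in s)
    n_total = len(pooled)
    rank_sums = [0.0] * len(samples)
    for rank, (_, group) in enumerate(pooled, start=1):
        rank_sums[group] += rank
    weighted = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    return 12.0 * weighted / (n_total * (n_total + 1)) - 3.0 * (n_total + 1)


def estimate_power(seed=0):
    """Return (power of F, power of H) as rejection proportions."""
    rng = random.Random(seed)
    f_rejections = 0
    h_rejections = 0
    for _ in range(REPLICATIONS):
        samples = [[rng.gauss(mu, SD) for _ in range(N_PER_SAMPLE)]
                   for mu in MEANS]
        if f_statistic(samples) > F_CRIT:
            f_rejections += 1
        if h_statistic(samples) > H_CRIT:
            h_rejections += 1
    return f_rejections / REPLICATIONS, h_rejections / REPLICATIONS


power_f, power_h = estimate_power()
print("F-test power:", power_f, "H-test power:", power_h)
```

With only 100 replications, each estimated power has a standard error of up to about 0.05, so small differences between the two tests should be read with that in mind.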

The results in Table 1 indicate that under the normal distribution, the F-test rejects more often than the H-test and, as the level of significance of the test increases, the number of rejections increases consistently for the two tests. Once n reaches 15, the two tests perform equally in rejecting the null hypothesis, except when α = 0.01. For the results of the exponential distribution in Table 2, the F-test does better than the H-test in its number of rejections when n = 2. At n = 5, the H-test becomes more powerful, though the two tests become equally powerful by n = 10. This is consistent for all the α levels used.

 

Table 1: Number of rejections out of 100 replications under the normal distribution

Table 2: Number of rejections out of 100 replications under the exponential distribution

Table 3: Number of rejections out of 100 replications under the Poisson distribution

Fig. 1: Power curve at α = 0.01 using normal distribution

 

In the case of the Poisson distribution, the F-test is more powerful than the H-test at all the levels of significance. However, once n reaches 10, the two test statistics can be said to have equal power.

 

Fig. 2: Power curve at α = 0.05 using normal distribution

Fig. 3: Power curve at α = 0.1 using normal distribution

Fig. 4: Power curve at α = 0.01 using exponential distribution

Fig. 5: Power curve at α = 0.05 using exponential distribution

Fig. 6: Power curve at α = 0.1 using exponential distribution

Fig. 7: Power curve at α = 0.01 using Poisson distribution

Fig. 8: Power curve at α = 0.05 using Poisson distribution

Fig. 9: Power curve at α = 0.1 using Poisson distribution

 

The power curves in Fig. 1-9 indicate that the parametric F-test is more powerful than the non-parametric H-test when n ≤ 10. However, in the case of the exponential distribution in Fig. 4-6, the non-parametric H-test is more powerful at all the levels of significance considered.

CONCLUSION

The Monte Carlo study has revealed that when a number of samples are to be compared to decide whether they all come from the same population, the parametric F-test should be used when it is known that the samples are from a normal distribution. When they are known to have come from the Poisson distribution, which is of the discrete type, and the experimenter wants to use either of the tests for comparison, the F-test should be preferred. However, when the samples are continuous but the experimenter is not sure whether the assumptions of normality and homogeneity of variance are valid, the non-parametric H-test should be considered. But where the sample size is not small, say ≥15, either of the two tests will perform well for the purpose of such comparison, irrespective of the distribution.

How to cite this article:

Samuel O. Adams, Ezra Gayawan and Mohammed K. Garba. Empirical Comparison of the Kruskal Wallis Statistics and its Parametric Counterpart.
DOI: https://doi.org/10.36478/jmmstat.2009.38.42
URL: https://www.makhillpublications.co/view-article/1994-5388/jmmstat.2009.38.42