Non-parametric method for testing whether samples originate from the same distribution From Wikipedia, the free encyclopedia
The Kruskal–Wallis test by ranks, Kruskal–Wallis test (named after William Kruskal and W. Allen Wallis), or one-way ANOVA on ranks is a non-parametric statistical test for testing whether samples originate from the same distribution.[1][2][3] It is used for comparing two or more independent samples of equal or different sample sizes. It extends the Mann–Whitney U test, which is used for comparing only two groups. The parametric equivalent of the Kruskal–Wallis test is the one-way analysis of variance (ANOVA).
A significant Kruskal–Wallis test indicates that at least one sample stochastically dominates one other sample. The test does not identify where this stochastic dominance occurs or for how many pairs of groups stochastic dominance obtains. For analyzing the specific sample pairs for stochastic dominance, Dunn's test,[4] pairwise Mann–Whitney tests with Bonferroni correction,[5] or the more powerful but less well known Conover–Iman test[5] are sometimes used.
When the treatments significantly affect the response level, the test supposes an order among them: one treatment tends to give the lowest response, another the next lowest, and so forth.[6] Since it is a non-parametric method, the Kruskal–Wallis test does not assume a normal distribution of the residuals, unlike the analogous one-way analysis of variance. If the researcher can assume an identically shaped and scaled distribution for all groups, except for any difference in medians, then the null hypothesis is that the medians of all groups are equal, and the alternative hypothesis is that the population median of at least one group differs from the population median of at least one other group. Otherwise, it is impossible to say whether rejection of the null hypothesis comes from a shift in location or from a difference in group dispersions. The same issue arises with the Mann–Whitney test.[7][8][9]
If the data contain potential outliers, if the population distributions have heavy tails, or if the population distributions are significantly skewed, the Kruskal–Wallis test is more powerful at detecting differences among treatments than the ANOVA F-test. On the other hand, if the population distributions are normal or are light-tailed and symmetric, then the ANOVA F-test will generally have greater power, that is, a higher probability of rejecting the null hypothesis when it should indeed be rejected.[10][11]
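As an illustration of what "ANOVA on ranks" means in practice, the following Python sketch computes the rank-based test statistic, conventionally written H, in its standard textbook form: all observations are pooled, tied values receive midranks, and the rank sums of the groups are compared. The function name kruskal_wallis_h and the sample values are hypothetical, and the sketch is not taken from any of the packages or references cited in this article.

```python
from collections import Counter


def kruskal_wallis_h(*groups):
    """Return (H, df) for two or more samples (assumes not all values are equal).

    H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1),
    divided by the tie correction 1 - sum(t^3 - t) / (N^3 - N),
    where R_i is the rank sum of group i and t runs over tie-group sizes.
    """
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)

    # Midrank of each distinct value: mean of the 1-based positions it occupies.
    counts = Counter(pooled)
    midrank = {}
    position = 0
    for value in sorted(counts):
        t = counts[value]
        midrank[value] = position + (t + 1) / 2
        position += t

    # Uncorrected statistic from the per-group rank sums.
    weighted = sum(sum(midrank[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

    # Tie correction (equals 1 when there are no ties).
    correction = 1 - sum(t**3 - t for t in counts.values()) / (n_total**3 - n_total)
    return h / correction, len(groups) - 1


# Hypothetical data: three fully separated groups without ties.
h_no_ties, df = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
# Hypothetical data with ties: the correction inflates H from 2.4 to 3.0.
h_ties, _ = kruskal_wallis_h([1, 1], [2, 2])
print(h_no_ties, df, h_ties)
```

For larger samples, H is usually compared with a chi-squared distribution with k − 1 degrees of freedom, where k is the number of groups.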
A large amount of computing resources is required to compute exact probabilities for the Kruskal–Wallis test. Existing software provides exact probabilities only for sample sizes of less than about 30 participants and relies on the asymptotic approximation for larger sample sizes. Exact probability values for larger sample sizes are nonetheless available in published tables. Spurrier (2003) published exact probability tables for samples as large as 45 participants.[14] Meyer and Seaman (2006) produced exact probability distributions for samples as large as 105 participants.[15]
Choi et al.[16] reviewed two methods that had been developed to compute the exact distribution of the test statistic H, proposed a new one, and compared the exact distribution to its chi-squared approximation.
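The gap between exact and approximate probabilities can be seen on a very small case. The following Python sketch enumerates every way of dividing the ranks 1 to N among three groups (assuming no ties) and compares the exact tail probability of the test statistic H with the chi-squared approximation for 2 degrees of freedom, whose upper tail is exp(−H/2). It is a brute-force illustration only, not the method of any of the papers cited above, and the function names are hypothetical. The number of arrangements grows so quickly that full enumeration is feasible only for tiny samples.

```python
from itertools import combinations
from math import exp


def h_from_rank_sums(rank_sums, sizes):
    """Kruskal-Wallis H for untied data, given each group's rank sum."""
    n = sum(sizes)
    weighted = sum(r * r / m for r, m in zip(rank_sums, sizes))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


def exact_p_value(sizes, h_observed):
    """P(H >= h_observed) under the null hypothesis for three untied groups."""
    n1, n2, _ = sizes
    ranks = set(range(1, sum(sizes) + 1))
    total = hits = 0
    for g1 in combinations(sorted(ranks), n1):
        rest = ranks - set(g1)
        for g2 in combinations(sorted(rest), n2):
            g3 = rest - set(g2)
            h = h_from_rank_sums((sum(g1), sum(g2), sum(g3)), sizes)
            total += 1
            # Small tolerance guards against floating-point round-off.
            hits += h >= h_observed - 1e-9
    return hits / total


sizes = (3, 3, 3)
# Most extreme outcome: the groups hold ranks {1,2,3}, {4,5,6} and {7,8,9}.
h_max = h_from_rank_sums((6, 15, 24), sizes)
exact = exact_p_value(sizes, h_max)
approx = exp(-h_max / 2)  # chi-squared upper tail for df = 2
print(h_max, exact, approx)
```

For three groups of three untied observations, the most extreme arrangement has an exact probability of 6/1680 (about 0.0036), while the chi-squared approximation gives exp(−3.6) (about 0.027), which illustrates why exact tables matter for small samples.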
The following example uses data from Chambers et al.[17] on daily readings of ozone for May 1 to September 30, 1973, in New York City. The data are in the R data set airquality, and the analysis is included in the documentation for the R function kruskal.test. Boxplots of ozone values by month are shown in the figure.
The Kruskal-Wallis test finds a significant difference (p = 6.901e-06) indicating that ozone differs among the 5 months.
kruskal.test(Ozone ~ Month, data = airquality)
Kruskal-Wallis rank sum test
data: Ozone by Month
Kruskal-Wallis chi-squared = 29.267, df = 4, p-value = 6.901e-06
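The reported p-value follows from comparing the statistic with a chi-squared distribution with 4 degrees of freedom (five months minus one). The Python sketch below recomputes that upper-tail probability from the rounded statistic 29.267 using only the standard library. The helper chi2_sf is a hypothetical name, written here for integer degrees of freedom, and is not part of R or SciPy.

```python
from math import erfc, exp, lgamma, log, sqrt


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2
    # Starting values of the regularized upper incomplete gamma function Q(a, z).
    if df % 2 == 0:
        q, a = exp(-z), 1.0          # Q(1, z)
    else:
        q, a = erfc(sqrt(z)), 0.5    # Q(1/2, z)
    # Recurrence: Q(a + 1, z) = Q(a, z) + z^a * exp(-z) / Gamma(a + 1).
    while a < df / 2:
        q += exp(a * log(z) - z - lgamma(a + 1))
        a += 1
    return q


# Statistic and degrees of freedom as printed by kruskal.test above.
p_value = chi2_sf(29.267, 4)
print(p_value)
```

The result is roughly 6.9e-06, which agrees with the R output up to the rounding of the printed statistic.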
To determine which months differ, post-hoc tests may be performed using a Wilcoxon test for each pair of months, with a Bonferroni (or other) correction for multiple hypothesis testing.
pairwise.wilcox.test(airquality$Ozone, airquality$Month, p.adjust.method = "bonferroni")
Pairwise comparisons using Wilcoxon rank sum test
data: airquality$Ozone and airquality$Month
  5      6      7      8
6 1.0000 -      -      -
7 0.0003 0.1414 -      -
8 0.0012 0.2591 1.0000 -
9 1.0000 1.0000 0.0074 0.0325
P value adjustment method: bonferroni
The post-hoc tests indicate that, after Bonferroni correction for multiple testing, the following differences are significant (adjusted p < 0.05): month 5 versus months 7 and 8, and month 9 versus months 7 and 8.
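The Bonferroni adjustment itself is simple: each raw p-value is multiplied by the number of comparisons and capped at 1. The Python sketch below shows the adjustment on hypothetical raw p-values, counts the pairwise comparisons among five months, and then filters the adjusted values copied from the pairwise.wilcox.test output above. The function name bonferroni and the raw p-values are illustrative assumptions.

```python
from math import comb


def bonferroni(p_values):
    """Multiply each p-value by the number of tests and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values from three comparisons.
adjusted_demo = bonferroni([0.001, 0.02, 0.4])

# Five months give comb(5, 2) = 10 pairwise comparisons.
n_pairs = comb(5, 2)

# Adjusted p-values as printed in the pairwise output above, keyed by month pair.
adjusted = {
    (5, 6): 1.0000, (5, 7): 0.0003, (5, 8): 0.0012, (5, 9): 1.0000,
    (6, 7): 0.1414, (6, 8): 0.2591, (6, 9): 1.0000,
    (7, 8): 1.0000, (7, 9): 0.0074, (8, 9): 0.0325,
}
significant = sorted(pair for pair, p in adjusted.items() if p < 0.05)
print(n_pairs, significant)
```

Because the adjustment multiplies by the number of comparisons, it becomes increasingly conservative as the number of groups grows, which is one reason other correction methods are sometimes preferred.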
The Kruskal-Wallis test can be implemented in many programming tools and languages. We list here only the open source free software packages:
In Python's SciPy package, the function scipy.stats.kruskal can return the test result and p-value.[18]
In R, the function kruskal.test implements the test.[19]
In Julia, the package HypothesisTests.jl has the function KruskalWallisTest(groups::AbstractVector{<:Real}...) to compute the p-value.[21]