TY - JOUR
T1 - Persistent confusion in nutrition and obesity research about the validity of classic nonparametric tests in the presence of heteroscedasticity
T2 - Evidence of the problem and valid alternatives
AU - Kroeger, Cynthia M.
AU - Ejima, Keisuke
AU - Hannon, Bridget A.
AU - Halliday, Tanya M.
AU - McComb, Bryan
AU - Teran-Garcia, Margarita
AU - Dawson, John A.
AU - King, David B.
AU - Brown, Andrew W.
AU - Allison, David B.
N1 - Publisher Copyright: © 2021 The Author(s). Published by Oxford University Press on behalf of the American Society for Nutrition.
PY - 2021/3/1
Y1 - 2021/3/1
AB - The use of classic nonparametric tests (cNPTs), such as the Kruskal-Wallis and Mann-Whitney U tests, in the presence of unequal variance for between-group comparisons of means and medians may lead to marked increases in the rate of falsely rejecting null hypotheses and decreases in statistical power. Yet this practice remains prevalent in the scientific literature, including the nutrition and obesity literature. Some nutrition and obesity studies use a cNPT in the presence of unequal variance (i.e., heteroscedasticity), sometimes because of the mistaken rationale that the test corrects for heteroscedasticity. Herein, we discuss misconceptions about using cNPTs in the presence of heteroscedasticity. We then discuss assumptions, purposes, and limitations of 3 common tests used to test for mean differences between multiple groups, including 2 parametric tests, Fisher's ANOVA and Welch's ANOVA, and 1 cNPT, the Kruskal-Wallis test. To document the impact of heteroscedasticity on the validity of these tests under conditions similar to those used in nutrition and obesity research, we conducted simple simulations and assessed type I error rates (i.e., false positives, defined as incorrectly rejecting the null hypothesis). We demonstrate that type I error rates for Fisher's ANOVA, which does not account for heteroscedasticity, and Kruskal-Wallis, which tests for differences in distributions rather than means, deviated from the expected significance level. Greater deviation from the expected type I error rate was observed as the heterogeneity of variance increased, especially when sample sizes were imbalanced. We provide brief tutorial guidance for authors, editors, and reviewers to identify appropriate statistical tests when test assumptions are violated, with a particular focus on cNPTs.
KW - association
KW - causation
KW - heteroscedasticity
KW - nonparametric tests
KW - nutrition
KW - obesity
KW - research rigor
KW - statistical methods
UR - http://www.scopus.com/inward/record.url?scp=85102906879&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85102906879&partnerID=8YFLogxK
U2 - 10.1093/ajcn/nqaa357
DO - 10.1093/ajcn/nqaa357
M3 - Review article
C2 - 33515017
AN - SCOPUS:85102906879
SN - 0002-9165
VL - 113
SP - 517
EP - 524
JO - American Journal of Clinical Nutrition
JF - American Journal of Clinical Nutrition
IS - 3
ER -