Hypothesis testing for phylogenetic composition: a minimum-cost flow perspective

Shulei Wang, T. Tony Cai, Hongzhe Li

Research output: Contribution to journal › Article › peer-review

Abstract

Quantitative comparison of microbial composition from different populations is a fundamental task in various microbiome studies. We consider two-sample testing for microbial compositional data by leveraging phylogenetic information. Motivated by existing phylogenetic distances, we take a minimum-cost flow perspective to study such testing problems. We first show that multivariate analysis of variance with permutation using phylogenetic distances, one of the most commonly used methods in practice, is essentially a sum-of-squares type of test and has better power for dense alternatives. However, empirical evidence from real datasets suggests that the phylogenetic microbial composition difference between two populations is usually sparse. Motivated by this observation, we propose a new maximum type test, detector of active flow on a tree, and investigate its properties. We show that the proposed method is particularly powerful against sparse phylogenetic composition difference and enjoys certain optimality. The practical merit of the proposed method is demonstrated by simulation studies and an application to a human intestinal biopsy microbiome dataset on patients with ulcerative colitis.
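As a rough illustration of the minimum-cost flow view described above, the sketch below uses the standard fact that on a tree the cheapest way to move one composition onto another sends, across each edge, the absolute difference in the mass sitting below that edge. It then contrasts a sum-of-squares style statistic with a maximum type statistic over edges, both calibrated by permutation. This is not the authors' detector of active flow on a tree: the function names, the node numbering (root is node 0 and every parent has a smaller index than its children), the standardization, and the toy tree are all assumptions made for illustration.

```python
import numpy as np

def edge_masses(parent, samples):
    """Mass below each edge, per sample.

    parent[v] is the parent of node v (root 0 has parent -1), with parent[v] < v.
    samples has one row per sample and one column per node; internal nodes hold 0.
    Column v-1 of the result is the mass under the edge joining v to its parent.
    """
    mass = np.array(samples, dtype=float)
    # Visit nodes from highest index down, so children are added before parents.
    for v in range(mass.shape[1] - 1, 0, -1):
        mass[:, parent[v]] += mass[:, v]
    return mass[:, 1:]

def tree_wasserstein(parent, length, p, q):
    """Minimum transport cost between compositions p and q on the tree."""
    length = np.asarray(length, dtype=float)
    m = edge_masses(parent, np.vstack([p, q]))
    # Flow on each edge is |difference in mass below it|; cost is length * flow.
    return float(np.sum(length[1:] * np.abs(m[0] - m[1])))

def sum_and_max_stats(mass, labels, length):
    """Illustrative sum-type and max-type statistics over edges (assumed forms)."""
    a, b = mass[labels == 0], mass[labels == 1]
    diff = a.mean(axis=0) - b.mean(axis=0)
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    # Edges with no variation get a standardized difference of 0.
    t = np.divide(diff, se, out=np.zeros_like(diff), where=se > 0)
    sum_stat = float(np.sum(length[1:] * diff ** 2))
    max_stat = float(np.max(np.abs(t)))
    return sum_stat, max_stat

def permutation_pvalues(parent, length, samples, labels, n_perm=199, seed=0):
    """Permutation p-values for the (sum-type, max-type) statistics."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    length = np.asarray(length, dtype=float)
    mass = edge_masses(parent, samples)
    observed = np.array(sum_and_max_stats(mass, labels, length))
    exceed = np.zeros(2)
    for _ in range(n_perm):
        permuted = rng.permutation(labels)
        exceed += np.array(sum_and_max_stats(mass, permuted, length)) >= observed
    return (exceed + 1) / (n_perm + 1)

# Toy tree (hypothetical): root 0; internal nodes 1, 2; leaves 3, 4 under 1; 5, 6 under 2.
PARENT = [-1, 0, 0, 1, 1, 2, 2]
LENGTH = [0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]  # LENGTH[0] is unused (root has no edge)
```

On this toy tree, moving all mass from leaf 3 to its sibling leaf 4 crosses two unit-length edges, so `tree_wasserstein(PARENT, LENGTH, [0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 0, 1, 0, 0])` returns 2.0. A difference confined to a few edges, as in the sparse setting the abstract describes, shows up in a single large standardized edge difference, which is what a maximum over edges is designed to pick up.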

Original language: English (US)
Article number: asaa061
Pages (from-to): 17-36
Number of pages: 20
Journal: Biometrika
Volume: 108
Issue number: 1
State: Published - Mar 1 2021
Externally published: Yes

Keywords

  • Metagenomics
  • Microbiome
  • Phylogenetic tree
  • Sparse alternative
  • Wasserstein distance

ASJC Scopus subject areas

  • General Agricultural and Biological Sciences
  • Applied Mathematics
  • General Mathematics
  • Agricultural and Biological Sciences (miscellaneous)
  • Statistics and Probability
  • Statistics, Probability and Uncertainty
