Polynomial-Time Statistical Estimation of Species Trees Under Gene Duplication and Loss

Brandon Legried, Erin K Molloy, Tandy Warnow, Sébastien Roch

Research output: Contribution to journal, Article, peer-review

Abstract

Phylogenomics, the estimation of species trees from multilocus data sets, is a common step in many biological studies. However, this estimation is challenged by the fact that genes can evolve under processes, including incomplete lineage sorting (ILS) and gene duplication and loss (GDL), that make their trees different from the species tree. In this article, we address the challenge of estimating the species tree under GDL. We show that species trees are identifiable under a standard stochastic model for GDL, and that the polynomial-time algorithm ASTRAL-multi, a recent development in the ASTRAL suite of methods, is statistically consistent under this GDL model. We also provide a simulation study evaluating ASTRAL-multi for species tree estimation under GDL.

Original language: English (US)
Pages (from-to): 452-468
Number of pages: 17
Journal: Journal of computational biology : a journal of computational molecular cell biology
Volume: 28
Issue number: 5
Early online date: Dec 15 2020
DOIs
State: Published - May 2021

Keywords

  • ASTRAL
  • estimation
  • gene duplication and loss
  • identifiability
  • species trees
  • statistical consistency

ASJC Scopus subject areas

  • Computational Mathematics
  • Genetics
  • Molecular Biology
  • Computational Theory and Mathematics
  • Modeling and Simulation
