Abstract
Phylogenomics, the estimation of species trees from multilocus data sets, is a common step in many biological studies. However, this estimation is challenged by the fact that genes can evolve under processes, including incomplete lineage sorting (ILS) and gene duplication and loss (GDL), that make their trees different from the species tree. In this article, we address the challenge of estimating the species tree under GDL. We show that species trees are identifiable under a standard stochastic model for GDL, and that the polynomial-time algorithm ASTRAL-multi, a recent development in the ASTRAL suite of methods, is statistically consistent under this GDL model. We also provide a simulation study evaluating ASTRAL-multi for species tree estimation under GDL.
Original language | English (US) |
---|---|
Pages (from-to) | 452-468 |
Number of pages | 17 |
Journal | Journal of Computational Biology: A Journal of Computational Molecular Cell Biology |
Volume | 28 |
Issue number | 5 |
Early online date | Dec 15 2020 |
State | Published - May 2021 |
Keywords
- ASTRAL
- estimation
- gene duplication and loss
- identifiability
- species trees
- statistical consistency
ASJC Scopus subject areas
- Computational Mathematics
- Genetics
- Molecular Biology
- Computational Theory and Mathematics
- Modeling and Simulation