Recent progress on methods for estimating and updating large phylogenies

Paul Zaharias, Tandy Warnow

Research output: Contribution to journal › Review article › peer-review


With the increased availability of sequence data and even of fully sequenced and assembled genomes, phylogeny estimation of very large trees (even of hundreds of thousands of sequences) is now a goal for some biologists. Yet, the construction of these phylogenies is a complex pipeline presenting analytical and computational challenges, especially when the number of sequences is very large. In the past few years, new methods have been developed that aim to enable highly accurate phylogeny estimations on these large datasets, including divide-and-conquer techniques for multiple sequence alignment and/or tree estimation, methods that can estimate species trees from multi-locus datasets while addressing heterogeneity due to biological processes (e.g. incomplete lineage sorting and gene duplication and loss), and methods to add sequences into large gene trees or species trees. Here we present some of these recent advances and discuss opportunities for future improvements. This article is part of a discussion meeting issue 'Genomic population structures of microbial pathogens'.

Original language: English (US)
Article number: 20210244
Journal: Philosophical Transactions of the Royal Society B: Biological Sciences
Issue number: 1861
State: Published - Oct 10 2022
Externally published: Yes

Keywords


  • maximum likelihood
  • multiple sequence alignment
  • phylogenetic placement
  • phylogenomics
  • phylogeny estimation
  • taxon identification

ASJC Scopus subject areas

  • General Biochemistry, Genetics and Molecular Biology
  • General Agricultural and Biological Sciences

