Using INC Within Divide-and-Conquer Phylogeny Estimation

Thien Le, Aaron Sy, Erin K. Molloy, Qiuyi (Richard) Zhang, Satish Rao, Tandy Warnow

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

In a recent paper (Zhang, Rao, and Warnow, Algorithms for Molecular Biology 2019), the INC (incremental tree building) algorithm was presented and proven to be absolute fast converging under standard sequence evolution models. A variant of INC which allows a set of disjoint constraint trees to be provided and then uses INC to merge the constraint trees was also presented (i.e., Constrained INC). We report on a study evaluating INC on a range of simulated datasets, and show that it has very poor accuracy in comparison to standard methods. We also explore the design space for divide-and-conquer strategies for phylogeny estimation that use Constrained INC, and show modifications that provide improved accuracy. In particular, we present INC-ML, a divide-and-conquer approach to maximum likelihood (ML) estimation that comes close to the leading ML heuristics in terms of accuracy, and is more accurate than the current best distance-based methods.

Original languageEnglish (US)
Title of host publicationAlgorithms for Computational Biology - 6th International Conference, AlCoB 2019, Proceedings
EditorsMiguel A. Vega-Rodríguez, Ian Holmes, Carlos Martín-Vide
PublisherSpringer-Verlag
Pages167-178
Number of pages12
ISBN (Print)9783030181734
DOIs
StatePublished - Jan 1 2019
Event6th International Conference on Algorithms for Computational Biology, AlCoB 2019 - Berkeley, United States
Duration: May 28 2019May 30 2019

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume11488 LNBI
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference6th International Conference on Algorithms for Computational Biology, AlCoB 2019
CountryUnited States
CityBerkeley
Period5/28/195/30/19

Fingerprint

Phylogeny
Divide and conquer
Maximum likelihood
Molecular biology
Maximum Likelihood
Maximum likelihood estimation
Molecular Biology
Maximum Likelihood Estimation
Disjoint
Heuristics
Range of data
Standards
Model

Keywords

  • Divide-and-conquer
  • Inferring the evolutionary phylogeny of species
  • Maximum likelihood
  • Phylogeny estimation
  • Sample complexity

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Cite this

Le, T., Sy, A., Molloy, E. K., Zhang, Q. R., Rao, S., & Warnow, T. (2019). Using INC Within Divide-and-Conquer Phylogeny Estimation. In M. A. Vega-Rodríguez, I. Holmes, & C. Martín-Vide (Eds.), Algorithms for Computational Biology - 6th International Conference, AlCoB 2019, Proceedings (pp. 167-178). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11488 LNBI). Springer-Verlag. https://doi.org/10.1007/978-3-030-18174-1_12

Using INC Within Divide-and-Conquer Phylogeny Estimation. / Le, Thien; Sy, Aaron; Molloy, Erin K.; Zhang, Qiuyi (Richard); Rao, Satish; Warnow, Tandy.

Algorithms for Computational Biology - 6th International Conference, AlCoB 2019, Proceedings. ed. / Miguel A. Vega-Rodríguez; Ian Holmes; Carlos Martín-Vide. Springer-Verlag, 2019. p. 167-178 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11488 LNBI).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Le, T, Sy, A, Molloy, EK, Zhang, QR, Rao, S & Warnow, T 2019, Using INC Within Divide-and-Conquer Phylogeny Estimation. in MA Vega-Rodríguez, I Holmes & C Martín-Vide (eds), Algorithms for Computational Biology - 6th International Conference, AlCoB 2019, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11488 LNBI, Springer-Verlag, pp. 167-178, 6th International Conference on Algorithms for Computational Biology, AlCoB 2019, Berkeley, United States, 5/28/19. https://doi.org/10.1007/978-3-030-18174-1_12
Le T, Sy A, Molloy EK, Zhang QR, Rao S, Warnow T. Using INC Within Divide-and-Conquer Phylogeny Estimation. In Vega-Rodríguez MA, Holmes I, Martín-Vide C, editors, Algorithms for Computational Biology - 6th International Conference, AlCoB 2019, Proceedings. Springer-Verlag. 2019. p. 167-178. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-030-18174-1_12
Le, Thien ; Sy, Aaron ; Molloy, Erin K. ; Zhang, Qiuyi (Richard) ; Rao, Satish ; Warnow, Tandy. / Using INC Within Divide-and-Conquer Phylogeny Estimation. Algorithms for Computational Biology - 6th International Conference, AlCoB 2019, Proceedings. editor / Miguel A. Vega-Rodríguez ; Ian Holmes ; Carlos Martín-Vide. Springer-Verlag, 2019. pp. 167-178 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{454b6158ed0f431bbb92083b403faa6a,
title = "Using INC Within Divide-and-Conquer Phylogeny Estimation",
abstract = "In a recent paper (Zhang, Rao, and Warnow, Algorithms for Molecular Biology 2019), the INC (incremental tree building) algorithm was presented and proven to be absolute fast converging under standard sequence evolution models. A variant of INC which allows a set of disjoint constraint trees to be provided and then uses INC to merge the constraint trees was also presented (i.e., Constrained INC). We report on a study evaluating INC on a range of simulated datasets, and show that it has very poor accuracy in comparison to standard methods. We also explore the design space for divide-and-conquer strategies for phylogeny estimation that use Constrained INC, and show modifications that provide improved accuracy. In particular, we present INC-ML, a divide-and-conquer approach to maximum likelihood (ML) estimation that comes close to the leading ML heuristics in terms of accuracy, and is more accurate than the current best distance-based methods.",
keywords = "Divide-and-conquer, Inferring the evolutionary phylogeny of species, Maximum likelihood, Phylogeny estimation, Sample complexity",
author = "Thien Le and Aaron Sy and Molloy, {Erin K.} and Zhang, {Qiuyi (Richard)} and Satish Rao and Tandy Warnow",
year = "2019",
month = "1",
day = "1",
doi = "10.1007/978-3-030-18174-1_12",
language = "English (US)",
isbn = "9783030181734",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer-Verlag",
pages = "167--178",
editor = "Vega-Rodr{\'i}guez, {Miguel A.} and Ian Holmes and Carlos Mart{\'i}n-Vide",
booktitle = "Algorithms for Computational Biology - 6th International Conference, AlCoB 2019, Proceedings",

}

TY - GEN

T1 - Using INC Within Divide-and-Conquer Phylogeny Estimation

AU - Le, Thien

AU - Sy, Aaron

AU - Molloy, Erin K.

AU - Zhang, Qiuyi (Richard)

AU - Rao, Satish

AU - Warnow, Tandy

PY - 2019/1/1

Y1 - 2019/1/1

N2 - In a recent paper (Zhang, Rao, and Warnow, Algorithms for Molecular Biology 2019), the INC (incremental tree building) algorithm was presented and proven to be absolute fast converging under standard sequence evolution models. A variant of INC which allows a set of disjoint constraint trees to be provided and then uses INC to merge the constraint trees was also presented (i.e., Constrained INC). We report on a study evaluating INC on a range of simulated datasets, and show that it has very poor accuracy in comparison to standard methods. We also explore the design space for divide-and-conquer strategies for phylogeny estimation that use Constrained INC, and show modifications that provide improved accuracy. In particular, we present INC-ML, a divide-and-conquer approach to maximum likelihood (ML) estimation that comes close to the leading ML heuristics in terms of accuracy, and is more accurate than the current best distance-based methods.

AB - In a recent paper (Zhang, Rao, and Warnow, Algorithms for Molecular Biology 2019), the INC (incremental tree building) algorithm was presented and proven to be absolute fast converging under standard sequence evolution models. A variant of INC which allows a set of disjoint constraint trees to be provided and then uses INC to merge the constraint trees was also presented (i.e., Constrained INC). We report on a study evaluating INC on a range of simulated datasets, and show that it has very poor accuracy in comparison to standard methods. We also explore the design space for divide-and-conquer strategies for phylogeny estimation that use Constrained INC, and show modifications that provide improved accuracy. In particular, we present INC-ML, a divide-and-conquer approach to maximum likelihood (ML) estimation that comes close to the leading ML heuristics in terms of accuracy, and is more accurate than the current best distance-based methods.

KW - Divide-and-conquer

KW - Inferring the evolutionary phylogeny of species

KW - Maximum likelihood

KW - Phylogeny estimation

KW - Sample complexity

UR - http://www.scopus.com/inward/record.url?scp=85066130811&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85066130811&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-18174-1_12

DO - 10.1007/978-3-030-18174-1_12

M3 - Conference contribution

AN - SCOPUS:85066130811

SN - 9783030181734

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 167

EP - 178

BT - Algorithms for Computational Biology - 6th International Conference, AlCoB 2019, Proceedings

A2 - Vega-Rodríguez, Miguel A.

A2 - Holmes, Ian

A2 - Martín-Vide, Carlos

PB - Springer-Verlag

ER -