Constructing a tree from homeomorphic subtrees, with applications to computational evolutionary biology

Monika Rauch Henzinger, Valerie King, Tandy Warnow

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

We are given a set T = {T1, T2,⋯, Tk} of rooted binary trees, each Ti leaf-labeled by a subset L(Ti) ⊂ {1,2,⋯, n}. If Tis a tree on {1,2,⋯, n}, we let T|L denote the subtree of T induced by the nodes of C and all their ancestors. The consensus tree problem asks whether there exists a tree T such that for every i, T|L(Ti) is homeomorphic to Ti. We present algorithms which test if a given set of trees has a consensus tree and if so, construct one. The deterministic algorithm takes time min{O(mn1/2), O(M + n2 log n)}, where m = Σi|Ti| and uses linear space. The randomized algorithm takes time O(m log3 n) and uses linear space. The previous best for this problem was an 1981 O(mn) algorithm by Aho et al. Our faster deterministic algorithm uses a new efficient algorithm for the following interesting dynamic graph problem: Given a graph G with n nodes and m edges and a sequence of 6 batches of one or more edge deletions, then after each batch, either find a new component that has just been created or determine that there is no such component. For this problem, we have a simple algorithm with running time O(n2 log n + b0 min{n2, m log n}), where b0 is the number of batches which do not result in a new component. For our particular application, b0 ≤ 1. If all edges are deleted, then the best previously known deterministic algorithm requires time O(m√n) to solve this problem. We will also present two applications of these consensus tree algorithms which solve other problems in computational evolutionary biology. The first application is in the problem of inferring consensus of trees when there can be disagreement[16]. There have been several models suggested for this problem[2, 3, 4, 8, ?, 11, 17, 18], of which one is called the Local Consensus Tree [15]. The local consensus tree model presumes that the user provides a local consensus rule which determines the form of the output tree on (perhaps) each triple of leaves, and the objective is to determine whether a tree exists which is consistent with each of the constraints. We will show that we can construct the local consensus tree of k trees on n species in O(kn3) time, improving on the O(kn3 + n4) running time if we use the Aho et al algorithm. The second application is a heuristic for constructing the maximum likelihood tree based upon combining solutions to small subproblems. This is a simple and yet potentially significantly interesting approach to the evolutionary tree construction problem.

Original languageEnglish (US)
Title of host publicationProceedings of the 7th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1996
PublisherAssociation for Computing Machinery
Pages333-340
Number of pages8
ISBN (Electronic)0898713668
StatePublished - Jan 28 1996
Externally publishedYes
Event7th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1996 - Atlanta, United States
Duration: Jan 28 1996Jan 30 1996

Publication series

NameProceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms
VolumePart F129447

Other

Other7th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1996
CountryUnited States
CityAtlanta
Period1/28/961/30/96

Fingerprint

Homeomorphic
Biology
Deterministic Algorithm
Batch
Linear Space
Leaves
Binary trees
Evolutionary Tree
K-tree
Dynamic Graphs
Rooted Trees
Tree Algorithms
Maximum likelihood
Binary Tree
Randomized Algorithms
Vertex of a graph
Deletion
Fast Algorithm
Maximum Likelihood
Efficient Algorithms

Keywords

  • Algorithms
  • Data structures
  • Evolutionary biology
  • Theory of databases

ASJC Scopus subject areas

  • Software
  • Mathematics(all)

Cite this

Henzinger, M. R., King, V., & Warnow, T. (1996). Constructing a tree from homeomorphic subtrees, with applications to computational evolutionary biology. In Proceedings of the 7th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1996 (pp. 333-340). (Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms; Vol. Part F129447). Association for Computing Machinery.

Constructing a tree from homeomorphic subtrees, with applications to computational evolutionary biology. / Henzinger, Monika Rauch; King, Valerie; Warnow, Tandy.

Proceedings of the 7th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1996. Association for Computing Machinery, 1996. p. 333-340 (Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms; Vol. Part F129447).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Henzinger, MR, King, V & Warnow, T 1996, Constructing a tree from homeomorphic subtrees, with applications to computational evolutionary biology. in Proceedings of the 7th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1996. Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms, vol. Part F129447, Association for Computing Machinery, pp. 333-340, 7th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1996, Atlanta, United States, 1/28/96.
Henzinger MR, King V, Warnow T. Constructing a tree from homeomorphic subtrees, with applications to computational evolutionary biology. In Proceedings of the 7th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1996. Association for Computing Machinery. 1996. p. 333-340. (Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms).
Henzinger, Monika Rauch ; King, Valerie ; Warnow, Tandy. / Constructing a tree from homeomorphic subtrees, with applications to computational evolutionary biology. Proceedings of the 7th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1996. Association for Computing Machinery, 1996. pp. 333-340 (Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms).
@inproceedings{6aa22a97ce0f4909aa0bc3657f071c7d,
title = "Constructing a tree from homeomorphic subtrees, with applications to computational evolutionary biology",
abstract = "We are given a set T = {T1, T2,⋯, Tk} of rooted binary trees, each Ti leaf-labeled by a subset L(Ti) ⊂ {1,2,⋯, n}. If Tis a tree on {1,2,⋯, n}, we let T|L denote the subtree of T induced by the nodes of C and all their ancestors. The consensus tree problem asks whether there exists a tree T∗ such that for every i, T∗|L(Ti) is homeomorphic to Ti. We present algorithms which test if a given set of trees has a consensus tree and if so, construct one. The deterministic algorithm takes time min{O(mn1/2), O(M + n2 log n)}, where m = Σi|Ti| and uses linear space. The randomized algorithm takes time O(m log3 n) and uses linear space. The previous best for this problem was an 1981 O(mn) algorithm by Aho et al. Our faster deterministic algorithm uses a new efficient algorithm for the following interesting dynamic graph problem: Given a graph G with n nodes and m edges and a sequence of 6 batches of one or more edge deletions, then after each batch, either find a new component that has just been created or determine that there is no such component. For this problem, we have a simple algorithm with running time O(n2 log n + b0 min{n2, m log n}), where b0 is the number of batches which do not result in a new component. For our particular application, b0 ≤ 1. If all edges are deleted, then the best previously known deterministic algorithm requires time O(m√n) to solve this problem. We will also present two applications of these consensus tree algorithms which solve other problems in computational evolutionary biology. The first application is in the problem of inferring consensus of trees when there can be disagreement[16]. There have been several models suggested for this problem[2, 3, 4, 8, ?, 11, 17, 18], of which one is called the Local Consensus Tree [15]. The local consensus tree model presumes that the user provides a local consensus rule which determines the form of the output tree on (perhaps) each triple of leaves, and the objective is to determine whether a tree exists which is consistent with each of the constraints. We will show that we can construct the local consensus tree of k trees on n species in O(kn3) time, improving on the O(kn3 + n4) running time if we use the Aho et al algorithm. The second application is a heuristic for constructing the maximum likelihood tree based upon combining solutions to small subproblems. This is a simple and yet potentially significantly interesting approach to the evolutionary tree construction problem.",
keywords = "Algorithms, Data structures, Evolutionary biology, Theory of databases",
author = "Henzinger, {Monika Rauch} and Valerie King and Tandy Warnow",
year = "1996",
month = "1",
day = "28",
language = "English (US)",
series = "Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms",
publisher = "Association for Computing Machinery",
pages = "333--340",
booktitle = "Proceedings of the 7th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1996",

}

TY - GEN

T1 - Constructing a tree from homeomorphic subtrees, with applications to computational evolutionary biology

AU - Henzinger, Monika Rauch

AU - King, Valerie

AU - Warnow, Tandy

PY - 1996/1/28

Y1 - 1996/1/28

N2 - We are given a set T = {T1, T2,⋯, Tk} of rooted binary trees, each Ti leaf-labeled by a subset L(Ti) ⊂ {1,2,⋯, n}. If Tis a tree on {1,2,⋯, n}, we let T|L denote the subtree of T induced by the nodes of C and all their ancestors. The consensus tree problem asks whether there exists a tree T∗ such that for every i, T∗|L(Ti) is homeomorphic to Ti. We present algorithms which test if a given set of trees has a consensus tree and if so, construct one. The deterministic algorithm takes time min{O(mn1/2), O(M + n2 log n)}, where m = Σi|Ti| and uses linear space. The randomized algorithm takes time O(m log3 n) and uses linear space. The previous best for this problem was an 1981 O(mn) algorithm by Aho et al. Our faster deterministic algorithm uses a new efficient algorithm for the following interesting dynamic graph problem: Given a graph G with n nodes and m edges and a sequence of 6 batches of one or more edge deletions, then after each batch, either find a new component that has just been created or determine that there is no such component. For this problem, we have a simple algorithm with running time O(n2 log n + b0 min{n2, m log n}), where b0 is the number of batches which do not result in a new component. For our particular application, b0 ≤ 1. If all edges are deleted, then the best previously known deterministic algorithm requires time O(m√n) to solve this problem. We will also present two applications of these consensus tree algorithms which solve other problems in computational evolutionary biology. The first application is in the problem of inferring consensus of trees when there can be disagreement[16]. There have been several models suggested for this problem[2, 3, 4, 8, ?, 11, 17, 18], of which one is called the Local Consensus Tree [15]. The local consensus tree model presumes that the user provides a local consensus rule which determines the form of the output tree on (perhaps) each triple of leaves, and the objective is to determine whether a tree exists which is consistent with each of the constraints. We will show that we can construct the local consensus tree of k trees on n species in O(kn3) time, improving on the O(kn3 + n4) running time if we use the Aho et al algorithm. The second application is a heuristic for constructing the maximum likelihood tree based upon combining solutions to small subproblems. This is a simple and yet potentially significantly interesting approach to the evolutionary tree construction problem.

AB - We are given a set T = {T1, T2,⋯, Tk} of rooted binary trees, each Ti leaf-labeled by a subset L(Ti) ⊂ {1,2,⋯, n}. If Tis a tree on {1,2,⋯, n}, we let T|L denote the subtree of T induced by the nodes of C and all their ancestors. The consensus tree problem asks whether there exists a tree T∗ such that for every i, T∗|L(Ti) is homeomorphic to Ti. We present algorithms which test if a given set of trees has a consensus tree and if so, construct one. The deterministic algorithm takes time min{O(mn1/2), O(M + n2 log n)}, where m = Σi|Ti| and uses linear space. The randomized algorithm takes time O(m log3 n) and uses linear space. The previous best for this problem was an 1981 O(mn) algorithm by Aho et al. Our faster deterministic algorithm uses a new efficient algorithm for the following interesting dynamic graph problem: Given a graph G with n nodes and m edges and a sequence of 6 batches of one or more edge deletions, then after each batch, either find a new component that has just been created or determine that there is no such component. For this problem, we have a simple algorithm with running time O(n2 log n + b0 min{n2, m log n}), where b0 is the number of batches which do not result in a new component. For our particular application, b0 ≤ 1. If all edges are deleted, then the best previously known deterministic algorithm requires time O(m√n) to solve this problem. We will also present two applications of these consensus tree algorithms which solve other problems in computational evolutionary biology. The first application is in the problem of inferring consensus of trees when there can be disagreement[16]. There have been several models suggested for this problem[2, 3, 4, 8, ?, 11, 17, 18], of which one is called the Local Consensus Tree [15]. The local consensus tree model presumes that the user provides a local consensus rule which determines the form of the output tree on (perhaps) each triple of leaves, and the objective is to determine whether a tree exists which is consistent with each of the constraints. We will show that we can construct the local consensus tree of k trees on n species in O(kn3) time, improving on the O(kn3 + n4) running time if we use the Aho et al algorithm. The second application is a heuristic for constructing the maximum likelihood tree based upon combining solutions to small subproblems. This is a simple and yet potentially significantly interesting approach to the evolutionary tree construction problem.

KW - Algorithms

KW - Data structures

KW - Evolutionary biology

KW - Theory of databases

UR - http://www.scopus.com/inward/record.url?scp=85002013940&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85002013940&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:85002013940

T3 - Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms

SP - 333

EP - 340

BT - Proceedings of the 7th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1996

PB - Association for Computing Machinery

ER -