TY - GEN
T1 - Constructing a tree from homeomorphic subtrees, with applications to computational evolutionary biology
AU - Henzinger, Monika Rauch
AU - King, Valerie
AU - Warnow, Tandy
N1 - Funding Information:
Department of Computer Science, Cornell University, Ithaca, NY. This work was done in part while visiting the International Computer Science Institute, Berkeley, CA. This work was supported in part by an NSF CAREER Advancement Award. Email: mhr@cs.cornell.edu. Department of Computer Science, University of Victoria, BC. Email: val@sr.uvic.ca. Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA. Email: tandy@central.cis.upenn.edu. This work was supported in part by a National Young Investigator Award from NSF.
PY - 1996/1/28
Y1 - 1996/1/28
N2 - We are given a set T = {T_1, T_2,⋯, T_k} of rooted binary trees, each T_i leaf-labeled by a subset L(T_i) ⊂ {1,2,⋯, n}. If T is a tree on {1,2,⋯, n}, we let T|L denote the subtree of T induced by the nodes of L and all their ancestors. The consensus tree problem asks whether there exists a tree T∗ such that for every i, T∗|L(T_i) is homeomorphic to T_i. We present algorithms which test if a given set of trees has a consensus tree and if so, construct one. The deterministic algorithm takes time min{O(mn^{1/2}), O(m + n^2 log n)}, where m = Σ_i |T_i|, and uses linear space. The randomized algorithm takes time O(m log^3 n) and uses linear space. The previous best for this problem was a 1981 O(mn) algorithm by Aho et al. Our faster deterministic algorithm uses a new efficient algorithm for the following interesting dynamic graph problem: Given a graph G with n nodes and m edges and a sequence of b batches of one or more edge deletions, after each batch, either find a new component that has just been created or determine that there is no such component. For this problem, we have a simple algorithm with running time O(n^2 log n + b_0 min{n^2, m log n}), where b_0 is the number of batches which do not result in a new component. For our particular application, b_0 ≤ 1. If all edges are deleted, then the best previously known deterministic algorithm requires time O(m√n) to solve this problem. We will also present two applications of these consensus tree algorithms which solve other problems in computational evolutionary biology. The first application is in the problem of inferring consensus of trees when there can be disagreement [16]. There have been several models suggested for this problem [2, 3, 4, 8, ?, 11, 17, 18], of which one is called the Local Consensus Tree [15].
The local consensus tree model presumes that the user provides a local consensus rule which determines the form of the output tree on (perhaps) each triple of leaves, and the objective is to determine whether a tree exists which is consistent with each of the constraints. We will show that we can construct the local consensus tree of k trees on n species in O(kn^3) time, improving on the O(kn^3 + n^4) running time if we use the Aho et al. algorithm. The second application is a heuristic for constructing the maximum likelihood tree based upon combining solutions to small subproblems. This is a simple and yet potentially significantly interesting approach to the evolutionary tree construction problem.
AB - We are given a set T = {T_1, T_2,⋯, T_k} of rooted binary trees, each T_i leaf-labeled by a subset L(T_i) ⊂ {1,2,⋯, n}. If T is a tree on {1,2,⋯, n}, we let T|L denote the subtree of T induced by the nodes of L and all their ancestors. The consensus tree problem asks whether there exists a tree T∗ such that for every i, T∗|L(T_i) is homeomorphic to T_i. We present algorithms which test if a given set of trees has a consensus tree and if so, construct one. The deterministic algorithm takes time min{O(mn^{1/2}), O(m + n^2 log n)}, where m = Σ_i |T_i|, and uses linear space. The randomized algorithm takes time O(m log^3 n) and uses linear space. The previous best for this problem was a 1981 O(mn) algorithm by Aho et al. Our faster deterministic algorithm uses a new efficient algorithm for the following interesting dynamic graph problem: Given a graph G with n nodes and m edges and a sequence of b batches of one or more edge deletions, after each batch, either find a new component that has just been created or determine that there is no such component. For this problem, we have a simple algorithm with running time O(n^2 log n + b_0 min{n^2, m log n}), where b_0 is the number of batches which do not result in a new component. For our particular application, b_0 ≤ 1. If all edges are deleted, then the best previously known deterministic algorithm requires time O(m√n) to solve this problem. We will also present two applications of these consensus tree algorithms which solve other problems in computational evolutionary biology. The first application is in the problem of inferring consensus of trees when there can be disagreement [16]. There have been several models suggested for this problem [2, 3, 4, 8, ?, 11, 17, 18], of which one is called the Local Consensus Tree [15].
The local consensus tree model presumes that the user provides a local consensus rule which determines the form of the output tree on (perhaps) each triple of leaves, and the objective is to determine whether a tree exists which is consistent with each of the constraints. We will show that we can construct the local consensus tree of k trees on n species in O(kn^3) time, improving on the O(kn^3 + n^4) running time if we use the Aho et al. algorithm. The second application is a heuristic for constructing the maximum likelihood tree based upon combining solutions to small subproblems. This is a simple and yet potentially significantly interesting approach to the evolutionary tree construction problem.
KW - Algorithms
KW - Data structures
KW - Evolutionary biology
KW - Theory of databases
UR - http://www.scopus.com/inward/record.url?scp=85002013940&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85002013940&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85002013940
T3 - Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms
SP - 333
EP - 340
BT - Proceedings of the 7th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1996
PB - Association for Computing Machinery
T2 - 7th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1996
Y2 - 28 January 1996 through 30 January 1996
ER -