TY - GEN
T1 - A fast algorithm for the computation and enumeration of perfect phylogenies when the number of character states is fixed
AU - Kannan, Sampath
AU - Warnow, Tandy
N1 - Funding Information:
in part by NSF grant CCR-9108969.
Funding Information:
Supported in part by NSF grant CCR-9108969. Supported in part by ARO grant DAAL03-89-0031PRI
PY - 1995/1/22
Y1 - 1995/1/22
N2 - The Perfect Phylogeny Problem is a classical problem in computational evolutionary biology, in which a set of species/taxa is described by a set of qualitative characters. In recent years, the problem has been shown to be NP-Complete in general, while the different fixed parameter versions can each be solved in polynomial time. In particular, Agarwala and Fernandez-Baca have developed an O(23r (nk3 + k4)) algorithm for the perfect phylogeny problem for n species defined by k r-state characters. Since commonly the character data is drawn from alignments of molecular sequences, k is the length of the sequences and can thus be very large (in the hundreds or thousands). Thus, it is imperative to develop algorithms which run efficiently for large values of k. In this paper we make additional observations about the structure of the problem and produce an algorithm for the problem that runs in time O(22rk2n). We also show how it is possible to efficiently build a structure that implicitly represents the set of all perfect phylogenies, and to randomly sample from that set.
AB - The Perfect Phylogeny Problem is a classical problem in computational evolutionary biology, in which a set of species/taxa is described by a set of qualitative characters. In recent years, the problem has been shown to be NP-Complete in general, while the different fixed parameter versions can each be solved in polynomial time. In particular, Agarwala and Fernandez-Baca have developed an O(23r (nk3 + k4)) algorithm for the perfect phylogeny problem for n species defined by k r-state characters. Since commonly the character data is drawn from alignments of molecular sequences, k is the length of the sequences and can thus be very large (in the hundreds or thousands). Thus, it is imperative to develop algorithms which run efficiently for large values of k. In this paper we make additional observations about the structure of the problem and produce an algorithm for the problem that runs in time O(22rk2n). We also show how it is possible to efficiently build a structure that implicitly represents the set of all perfect phylogenies, and to randomly sample from that set.
UR - http://www.scopus.com/inward/record.url?scp=85010171599&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85010171599&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85010171599
T3 - Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms
SP - 595
EP - 603
BT - Proceedings of the 6th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1995
PB - Association for Computing Machinery
T2 - 6th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1995
Y2 - 22 January 1995 through 24 January 1995
ER -