TY - GEN

T1 - A robust model for finding optimal evolutionary trees extended abstract

AU - Farach, Martin

AU - Kannan, Sampath

AU - Warnow, Tandy

N1 - Funding Information:
l DIMACS, Box 1179, Rutgers University, P1scataway, NJ 08855; (908) 932.5928, farach@dimacs rutgers,edu, Supported by DIMACS under NSF contract STC-S8-0964S, tDepartment of Computer Science, tJnlversitY Tucson, AZ 85721, (602) 621-4817, kannan@cs. Supported by NSF Grant CCR-9108969 ~Algorithms and Discrete Mathematics dla National Labs Albuquerque, NM, twarnow@cs.sandia. gov. Thm work was begun thor was wslting DIMACS m July and August, supported m part by the U,S Department contract DE-AC04.76DPO07S9

PY - 1993/6/1

Y1 - 1993/6/1

N2 - Constructing evolutionary trees for species sets is a fundamental problem in computational biology. One of the standard models assumes the ability to compute distances between every pair of species and seeks to find an edge-weighted tree T in which the distance dijT in the tree between the leaves of T corresponding to the species i and j fits the observed distance, dij. Sometimes the desired tree is ultrametric, so that the tree can be rooted with the root equidistant to each leaf. Many measures for evaluating the `fit' between a distance function d and the path-distance dT have been proposed, and most such measures have resulted in NP-hard optimization problems. In this paper we propose a measure of fit which models the inaccuracy in the data, and present several problems for constructing additive and ultrametric trees using this measure. Many of the resultant optimization problems are NP-hard, and one (finding a minimum size ultrametric tree which increments from an input matrix) is hard to approximate. Specifically, there is a constant c>0 such that unless P = NP, no polynomial time algorithm can exist which finds an approximate solution within the ratio of nc. However, we also present tight upper and lower bounds for the L∞-Minimum Increment to Ultrametric. Thus, we present perhaps the first algorithm for constructing phylogenetic trees from distance matrices which finds optimal trees for a reasonable criterion in polynomial time.

AB - Constructing evolutionary trees for species sets is a fundamental problem in computational biology. One of the standard models assumes the ability to compute distances between every pair of species and seeks to find an edge-weighted tree T in which the distance dijT in the tree between the leaves of T corresponding to the species i and j fits the observed distance, dij. Sometimes the desired tree is ultrametric, so that the tree can be rooted with the root equidistant to each leaf. Many measures for evaluating the `fit' between a distance function d and the path-distance dT have been proposed, and most such measures have resulted in NP-hard optimization problems. In this paper we propose a measure of fit which models the inaccuracy in the data, and present several problems for constructing additive and ultrametric trees using this measure. Many of the resultant optimization problems are NP-hard, and one (finding a minimum size ultrametric tree which increments from an input matrix) is hard to approximate. Specifically, there is a constant c>0 such that unless P = NP, no polynomial time algorithm can exist which finds an approximate solution within the ratio of nc. However, we also present tight upper and lower bounds for the L∞-Minimum Increment to Ultrametric. Thus, we present perhaps the first algorithm for constructing phylogenetic trees from distance matrices which finds optimal trees for a reasonable criterion in polynomial time.

UR - http://www.scopus.com/inward/record.url?scp=0027307380&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0027307380&partnerID=8YFLogxK

U2 - 10.1145/167088.167132

DO - 10.1145/167088.167132

M3 - Conference contribution

AN - SCOPUS:0027307380

T3 - Proceedings of the Annual ACM Symposium on Theory of Computing

SP - 137

EP - 145

BT - Proceedings of the 25th Annual ACM Symposium on Theory of Computing, STOC 1993

PB - Association for Computing Machinery

T2 - 25th Annual ACM Symposium on Theory of Computing, STOC 1993

Y2 - 16 May 1993 through 18 May 1993

ER -