TY - GEN
T1 - Estimating the deviation from a molecular clock
AU - Nakhleh, Luay
AU - Roshan, Usman
AU - Vawter, Lisa
AU - Warnow, Tandy
N1 - Publisher Copyright: © Springer-Verlag Berlin Heidelberg 2002.
PY - 2002
Y1 - 2002
N2 - We address the problem of estimating the degree to which the evolutionary history of a set of molecular sequences violates a strong molecular clock hypothesis. We quantify this deviation formally, by defining the “stretch” of a model tree with respect to the underlying ultrametric tree (indicated by time). We then define the “minimum stretch” of a dataset for a tree and show how this can be computed optimally in polynomial time. We also present a polynomial-time algorithm for computing a lower bound on the stretch of a given dataset for any tree. We then explore the performance of standard techniques in systematics for estimating the deviation of a dataset from a molecular clock. We show that standard methods, whether based upon maximum parsimony or maximum likelihood, can return infeasible values (i.e., values for the stretch which cannot be realized on a tree), and often underestimate the true stretch. This suggests that current approximations of the degree to which datasets deviate from a molecular clock may significantly underestimate these deviations. We conclude with some suggestions for further research.
AB - We address the problem of estimating the degree to which the evolutionary history of a set of molecular sequences violates a strong molecular clock hypothesis. We quantify this deviation formally, by defining the “stretch” of a model tree with respect to the underlying ultrametric tree (indicated by time). We then define the “minimum stretch” of a dataset for a tree and show how this can be computed optimally in polynomial time. We also present a polynomial-time algorithm for computing a lower bound on the stretch of a given dataset for any tree. We then explore the performance of standard techniques in systematics for estimating the deviation of a dataset from a molecular clock. We show that standard methods, whether based upon maximum parsimony or maximum likelihood, can return infeasible values (i.e., values for the stretch which cannot be realized on a tree), and often underestimate the true stretch. This suggests that current approximations of the degree to which datasets deviate from a molecular clock may significantly underestimate these deviations. We conclude with some suggestions for further research.
UR - http://www.scopus.com/inward/record.url?scp=3042644472&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=3042644472&partnerID=8YFLogxK
U2 - 10.1007/3-540-45784-4_22
DO - 10.1007/3-540-45784-4_22
M3 - Conference contribution
AN - SCOPUS:3042644472
SN - 3540442111
SN - 9783540442110
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 287
EP - 299
BT - Algorithms in Bioinformatics - 2nd International Workshop, WABI 2002, Proceedings
A2 - Guigo, Roderic
A2 - Gusfield, Dan
PB - Springer
T2 - 2nd International Workshop on Algorithms in Bioinformatics, WABI 2002
Y2 - 17 September 2002 through 21 September 2002
ER -