Estimating the deviation from a molecular clock

Luay Nakhleh, Usman Roshan, Lisa Vawter, Tandy Warnow

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We address the problem of estimating the degree to which the evolutionary history of a set of molecular sequences violates a strong molecular clock hypothesis. We quantify this deviation formally, by defining the “stretch” of a model tree with respect to the underlying ultrametric tree (indicated by time). We then define the “minimum stretch” of a dataset for a tree and show how this can be computed optimally in polynomial time. We also present a polynomial-time algorithm for computing a lower bound on the stretch of a given dataset for any tree. We then explore the performance of standard techniques in systematics for estimating the deviation of a dataset from a molecular clock. We show that standard methods, whether based upon maximum parsimony or maximum likelihood, can return infeasible values (i.e., values for the stretch which cannot be realized on a tree), and often underestimate the true stretch. This suggests that current approximations of the degree to which datasets deviate from a molecular clock may significantly underestimate these deviations. We conclude with some suggestions for further research.
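As an illustration of the kind of quantity the abstract describes, the following Python sketch compares per-edge substitution rates on a small tree. It is only a sketch under stated assumptions: the abstract does not give the formal definition of stretch, so the ratio of the fastest to the slowest per-edge rate is used here as a stand-in, and the names `rate_ratio` and `is_clocklike`, the edge labels, and all numbers are hypothetical.

```python
import math
from typing import Dict, Tuple

Edge = Tuple[str, str]  # (parent, child); labels are hypothetical


def rate_ratio(branch_lengths: Dict[Edge, float],
               durations: Dict[Edge, float]) -> float:
    """Return (fastest per-edge rate) / (slowest per-edge rate).

    ASSUMPTION: this ratio is an illustrative stand-in for a
    clock-deviation measure, not the paper's definition of stretch.
    branch_lengths: expected substitutions per site on each edge.
    durations: elapsed time on each edge of the ultrametric tree.
    """
    if branch_lengths.keys() != durations.keys():
        raise ValueError("both maps must cover the same edges")
    rates = []
    for edge, length in branch_lengths.items():
        elapsed = durations[edge]
        if elapsed <= 0 or length <= 0:
            raise ValueError(f"edge {edge} needs positive length and duration")
        rates.append(length / elapsed)
    return max(rates) / min(rates)


def is_clocklike(branch_lengths: Dict[Edge, float],
                 durations: Dict[Edge, float],
                 rel_tol: float = 1e-9) -> bool:
    """True when every edge evolves at the same rate (ratio of 1)."""
    return math.isclose(rate_ratio(branch_lengths, durations), 1.0,
                        rel_tol=rel_tol)


# Elapsed time per edge; every root-to-leaf path sums to 2.0 (ultrametric).
durations = {("root", "A"): 2.0, ("root", "X"): 1.0,
             ("X", "B"): 1.0, ("X", "C"): 1.0}

# Strict clock: every edge evolves at the same rate of 0.1 per unit time.
clock = {edge: 0.1 * t for edge, t in durations.items()}

# Clock violation: the lineage leading to B evolves four times faster.
skewed = dict(clock)
skewed[("X", "B")] = 0.4

print(is_clocklike(clock, durations))
print(rate_ratio(skewed, durations))
```

Under a strict clock every edge has the same rate, so the ratio is 1; any rate variation across edges pushes it above 1.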

Original language: English (US)
Title of host publication: Algorithms in Bioinformatics - 2nd International Workshop, WABI 2002, Proceedings
Editors: Roderic Guigo, Dan Gusfield
Publisher: Springer-Verlag
Pages: 287-299
Number of pages: 13
ISBN (Print): 3540442111, 9783540442110
State: Published - Jan 1 2002
Event: 2nd International Workshop on Algorithms in Bioinformatics, WABI 2002 - Rome, Italy
Duration: Sep 17 2002 → Sep 21 2002

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2452
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 2nd International Workshop on Algorithms in Bioinformatics, WABI 2002
Country: Italy
City: Rome
Period: 9/17/02 → 9/21/02

Fingerprint

Stretch
Clocks
Deviation
Polynomials
Maximum likelihood
Maximum Parsimony
Violate
Polynomial-time Algorithm
Polynomial time
Quantify
Lower bound
Computing
Approximation
Estimate
Standards

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Nakhleh, L., Roshan, U., Vawter, L., & Warnow, T. (2002). Estimating the deviation from a molecular clock. In R. Guigo, & D. Gusfield (Eds.), Algorithms in Bioinformatics - 2nd International Workshop, WABI 2002, Proceedings (pp. 287-299). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 2452). Springer-Verlag.

@inproceedings{aeaae8f52f1b4fb998f6f9e3632a05bb,
title = "Estimating the deviation from a molecular clock",
abstract = "We address the problem of estimating the degree to which the evolutionary history of a set of molecular sequences violates a strong molecular clock hypothesis. We quantify this deviation formally, by defining the “stretch” of a model tree with respect to the underlying ultrametric tree (indicated by time). We then define the “minimum stretch” of a dataset for a tree and show how this can be computed optimally in polynomial time. We also present a polynomial-time algorithm for computing a lower bound on the stretch of a given dataset for any tree. We then explore the performance of standard techniques in systematics for estimating the deviation of a dataset from a molecular clock. We show that standard methods, whether based upon maximum parsimony or maximum likelihood, can return infeasible values (i.e., values for the stretch which cannot be realized on a tree), and often underestimate the true stretch. This suggests that current approximations of the degree to which datasets deviate from a molecular clock may significantly underestimate these deviations. We conclude with some suggestions for further research.",
author = "Luay Nakhleh and Usman Roshan and Lisa Vawter and Tandy Warnow",
year = "2002",
month = "1",
day = "1",
language = "English (US)",
isbn = "3540442111",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer-Verlag",
pages = "287--299",
editor = "Roderic Guigo and Dan Gusfield",
booktitle = "Algorithms in Bioinformatics - 2nd International Workshop, WABI 2002, Proceedings",

}

TY - GEN
T1 - Estimating the deviation from a molecular clock
AU - Nakhleh, Luay
AU - Roshan, Usman
AU - Vawter, Lisa
AU - Warnow, Tandy
PY - 2002/1/1
Y1 - 2002/1/1
AB - We address the problem of estimating the degree to which the evolutionary history of a set of molecular sequences violates a strong molecular clock hypothesis. We quantify this deviation formally, by defining the “stretch” of a model tree with respect to the underlying ultrametric tree (indicated by time). We then define the “minimum stretch” of a dataset for a tree and show how this can be computed optimally in polynomial time. We also present a polynomial-time algorithm for computing a lower bound on the stretch of a given dataset for any tree. We then explore the performance of standard techniques in systematics for estimating the deviation of a dataset from a molecular clock. We show that standard methods, whether based upon maximum parsimony or maximum likelihood, can return infeasible values (i.e., values for the stretch which cannot be realized on a tree), and often underestimate the true stretch. This suggests that current approximations of the degree to which datasets deviate from a molecular clock may significantly underestimate these deviations. We conclude with some suggestions for further research.
UR - http://www.scopus.com/inward/record.url?scp=3042644472&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=3042644472&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:3042644472
SN - 3540442111
SN - 9783540442110
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 287
EP - 299
BT - Algorithms in Bioinformatics - 2nd International Workshop, WABI 2002, Proceedings
A2 - Guigo, Roderic
A2 - Gusfield, Dan
PB - Springer-Verlag
ER -