### Abstract

We address the problem of estimating the degree to which the evolutionary history of a set of molecular sequences violates a strong molecular clock hypothesis. We quantify this deviation formally by defining the “stretch” of a model tree with respect to the underlying ultrametric tree (indicated by time). We then define the “minimum stretch” of a dataset for a tree and show how this can be computed optimally in polynomial time. We also present a polynomial-time algorithm for computing a lower bound on the stretch of a given dataset for any tree. We then explore the performance of standard techniques in systematics for estimating the deviation of a dataset from a molecular clock. We show that standard methods, whether based upon maximum parsimony or maximum likelihood, can return infeasible values (i.e., values for the stretch which cannot be realized on a tree), and often underestimate the true stretch. This suggests that current approximations of the degree to which datasets deviate from a molecular clock may significantly underestimate these deviations. We conclude with some suggestions for further research.

Original language | English (US) |
---|---|
Title of host publication | Algorithms in Bioinformatics - 2nd International Workshop, WABI 2002, Proceedings |
Editors | Roderic Guigo, Dan Gusfield |
Publisher | Springer-Verlag |
Pages | 287-299 |
Number of pages | 13 |
ISBN (Print) | 3540442111, 9783540442110 |
State | Published - Jan 1 2002 |
Event | 2nd International Workshop on Algorithms in Bioinformatics, WABI 2002 - Rome, Italy; Duration: Sep 17 2002 → Sep 21 2002 |

### Publication series

Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 2452 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |

### Other

Other | 2nd International Workshop on Algorithms in Bioinformatics, WABI 2002 |
---|---|
Country | Italy |
City | Rome |
Period | 9/17/02 → 9/21/02 |

### ASJC Scopus subject areas

- Theoretical Computer Science
- Computer Science (all)

### Cite this

**Estimating the deviation from a molecular clock.** / Nakhleh, Luay; Roshan, Usman; Vawter, Lisa; Warnow, Tandy.

*Algorithms in Bioinformatics - 2nd International Workshop, WABI 2002, Proceedings.* Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 2452, Springer-Verlag, pp. 287-299, 2nd International Workshop on Algorithms in Bioinformatics, WABI 2002, Rome, Italy, 9/17/02.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

