TY - JOUR
T1 - Unidentifiable divergence times in rates-across-sites models
AU - Evans, Steven N.
AU - Warnow, Tandy
N1 - Funding Information:
The authors would like to thank the anonymous referees for a number of very helpful suggestions and observations. T. Warnow acknowledges the support of the Program for Evolutionary Dynamics at Harvard and the Institute for Cellular and Molecular Biology at UT-Austin. S.N. Evans’ work on this paper was supported in part by US National Science Foundation grants DMS-0071468 and DMS-0405778. T. Warnow’s work on this paper was supported by US National Science Foundation grants EF-0331453, BCS-0312830, and IIS-0121680.
PY - 2004/7
Y1 - 2004/7
N2 - The rates-across-sites assumption in phylogenetic inference posits that the rate matrix governing the Markovian evolution of a character on an edge of the putative phylogenetic tree is the product of a character-specific scale factor and a rate matrix that is particular to that edge. Thus, evolution follows basically the same process for all characters, except that it occurs faster for some characters than others. To allow estimation of tree topologies and edge lengths for such models, it is commonly assumed that the scale factors are not arbitrary unknown constants, but rather unobserved, independent, identically distributed draws from a member of some parametric family of distributions. A popular choice is the gamma family. We consider an example of a clock-like tree with three taxa, one unknown edge length, a known root state, and a parametric family of scale factor distributions that contains the gamma family. This model has the property that, for a generic choice of unknown edge length and scale factor distribution, there is another edge length and scale factor distribution which generates data with exactly the same distribution, so that even with infinitely many data it will be typically impossible to make correct inferences about the unknown edge length.
AB - The rates-across-sites assumption in phylogenetic inference posits that the rate matrix governing the Markovian evolution of a character on an edge of the putative phylogenetic tree is the product of a character-specific scale factor and a rate matrix that is particular to that edge. Thus, evolution follows basically the same process for all characters, except that it occurs faster for some characters than others. To allow estimation of tree topologies and edge lengths for such models, it is commonly assumed that the scale factors are not arbitrary unknown constants, but rather unobserved, independent, identically distributed draws from a member of some parametric family of distributions. A popular choice is the gamma family. We consider an example of a clock-like tree with three taxa, one unknown edge length, a known root state, and a parametric family of scale factor distributions that contains the gamma family. This model has the property that, for a generic choice of unknown edge length and scale factor distribution, there is another edge length and scale factor distribution which generates data with exactly the same distribution, so that even with infinitely many data it will be typically impossible to make correct inferences about the unknown edge length.
KW - Gamma distribution
KW - Identifiability
KW - Phylogenetic inference
KW - Random effects
UR - http://www.scopus.com/inward/record.url?scp=14744278584&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=14744278584&partnerID=8YFLogxK
U2 - 10.1109/TCBB.2004.34
DO - 10.1109/TCBB.2004.34
M3 - Article
C2 - 17048388
AN - SCOPUS:14744278584
SN - 1545-5963
VL - 1
SP - 130
EP - 134
JO - IEEE/ACM Transactions on Computational Biology and Bioinformatics
JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics
IS - 3
ER -