TY - GEN
T1 - Statistically Consistent Estimation of Rooted and Unrooted Level-1 Phylogenetic Networks from SNP Data
AU - Warnow, Tandy
AU - Tabatabaee, Yasamin
AU - Evans, Steven N.
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024
Y1 - 2024
AB - We address the problem of estimating a rooted phylogenetic network, as well as its unrooted version, from SNPs (i.e., single nucleotide polymorphisms), allowing for multiple crossover events. Thus, each SNP is assumed to have evolved under the infinite sites assumption down some tree inside the phylogenetic network. We prove that level-1 phylogenetic networks can be reconstructed uniquely from any set of SNPs that cover all bipartitions of the rooted trees contained in the network, even when the ancestral state is unknown. To the best of our knowledge, this is the first result to establish that the unrooted topology of a level-1 network is uniquely recoverable from SNPs without known ancestral states. We present a stochastic model for DNA evolution, and we prove that Gusfield’s algorithms in JCSS 2005 (one for the case where the ancestral state is known, and the other when it is not known) can be used in polynomial time, statistically consistent pipelines to estimate level-1 phylogenetic networks when all cycles are of length at least five, under the stochastic model we propose, provided that we have access to an oracle for indicating which sites in the DNA alignment are SNPs.
KW - galled tree
KW - level-1
KW - phylogenetic network
UR - http://www.scopus.com/inward/record.url?scp=85192237170&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85192237170&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-58072-7_1
DO - 10.1007/978-3-031-58072-7_1
M3 - Conference contribution
AN - SCOPUS:85192237170
SN - 9783031580710
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 3
EP - 23
BT - Comparative Genomics - 21st International Conference, RECOMB-CG 2024, Proceedings
A2 - Scornavacca, Celine
A2 - Hernández-Rosales, Maribel
PB - Springer
T2 - 21st RECOMB International Workshop on Comparative Genomics, RECOMB-CG 2024
Y2 - 27 April 2024 through 28 April 2024
ER -