Statistically Consistent Estimation of Rooted and Unrooted Level-1 Phylogenetic Networks from SNP Data

Tandy Warnow, Yasamin Tabatabaee, Steven N. Evans

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We address the problem of estimating a rooted phylogenetic network, as well as its unrooted version, from SNPs (i.e., single nucleotide polymorphisms), allowing for multiple crossover events. Thus, each SNP is assumed to have evolved under the infinite sites assumption down some tree inside the phylogenetic network. We prove that level-1 phylogenetic networks can be reconstructed uniquely from any set of SNPs that cover all bipartitions of the rooted trees contained in the network, even when the ancestral state is unknown. To the best of our knowledge, this is the first result to establish that the unrooted topology of a level-1 network is uniquely recoverable from SNPs without known ancestral states. We present a stochastic model for DNA evolution, and we prove that Gusfield’s algorithms in JCSS 2005 (one for the case where the ancestral state is known, and the other for the case where it is not) can be used in polynomial-time, statistically consistent pipelines to estimate level-1 phylogenetic networks when all cycles are of length at least five, under the stochastic model we propose, provided that we have access to an oracle that indicates which sites in the DNA alignment are SNPs.
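
As background for the setting described in the abstract, the following is a minimal sketch of two standard notions: the bipartition of the taxa induced by a biallelic SNP, and the four-gamete test, which checks whether two such SNPs could both have evolved under the infinite sites assumption on a single tree when the ancestral state is unknown. It is not the authors' pipeline and not Gusfield's algorithm; the function names and the toy alignment columns are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def snp_bipartition(column, taxa):
    """Return the unrooted bipartition of `taxa` induced by a biallelic column.

    `column` is a sequence of character states, one per taxon, in the same
    order as `taxa`. Hypothetical helper for illustration only.
    """
    states = sorted(set(column))
    if len(states) != 2:
        raise ValueError("expected a biallelic site (exactly two states)")
    side_a = frozenset(t for t, s in zip(taxa, column) if s == states[0])
    side_b = frozenset(taxa) - side_a
    # A frozenset of the two sides makes the bipartition unordered (unrooted).
    return frozenset({side_a, side_b})


def four_gamete_compatible(col_i, col_j):
    """Four-gamete test for two biallelic columns.

    The columns fit a common tree under infinite sites (ancestral state
    unknown) exactly when at most three of the four possible state pairs
    are observed across the taxa.
    """
    return len(set(zip(col_i, col_j))) <= 3


def conflicting_pairs(columns):
    """Return index pairs of columns that fail the four-gamete test."""
    return [
        (i, j)
        for i, j in combinations(range(len(columns)), 2)
        if not four_gamete_compatible(columns[i], columns[j])
    ]


# Toy data (assumed, not from the paper): five taxa, three SNP columns.
taxa = ["a", "b", "c", "d", "e"]
columns = ["AAGGG", "CCCTT", "TGTGT"]

split_0 = snp_bipartition(columns[0], taxa)
conflicts = conflicting_pairs(columns)
```

In this toy input the first two columns are pairwise compatible, while the third conflicts with both. In a network setting, conflicts of this kind are what one would expect from SNPs that evolved down different trees contained in the network.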

Original language: English (US)
Title of host publication: Comparative Genomics - 21st International Conference, RECOMB-CG 2024, Proceedings
Editors: Celine Scornavacca, Maribel Hernández-Rosales
Publisher: Springer
Pages: 3-23
Number of pages: 21
ISBN (Print): 9783031580710
State: Published - 2024
Event: 21st RECOMB International Workshop on Comparative Genomics, RECOMB-CG 2024 - Boston, United States
Duration: Apr 27 2024 → Apr 28 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14616 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st RECOMB International Workshop on Comparative Genomics, RECOMB-CG 2024
Country/Territory: United States
City: Boston
Period: 4/27/24 → 4/28/24

Keywords

  • galled tree
  • level-1
  • phylogenetic network

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
