Abstract
Motivation: Genes evolve under processes such as gene duplication and loss (GDL), which makes gene family trees multi-copy, and incomplete lineage sorting (ILS); both processes produce gene trees that differ from the species tree. Estimating species trees from sets of gene family trees is challenging, and estimating rooted species trees presents additional analytical challenges. Two methods developed for this problem are STRIDE, which roots species trees by considering GDL events, and Quintet Rooting (QR), which roots species trees by considering ILS.

Results: We present DISCO+QR, a new approach to rooting species trees that first uses DISCO to address GDL and then uses QR to perform rooting in the presence of ILS. DISCO+QR decomposes the input gene family trees into single-copy trees using DISCO and then uses QR to root the given species tree from the information in those single-copy gene trees. We show that the relative accuracy of STRIDE and DISCO+QR depends on the properties of the dataset (number of species and genes, rate of gene duplication, degree of ILS and gene tree estimation error), and that each provides advantages over the other under some conditions.
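The distinction between multi-copy gene family trees and single-copy gene trees can be illustrated with a minimal Python sketch. It does not implement DISCO or QR; it only counts gene copies per species in a Newick string. The leaf-label convention `species_copyid`, the function names, and the toy trees are all assumptions made for illustration and do not come from the paper.

```python
import re
from collections import Counter


def leaf_labels(newick: str) -> list[str]:
    """Return leaf labels from a Newick string (simple unquoted labels only)."""
    # A leaf label directly follows "(" or ","; internal labels follow ")"
    # and branch lengths follow ":", so neither is captured here.
    return re.findall(r"[(,]\s*([^\s(),:;]+)", newick)


def copies_per_species(newick: str, sep: str = "_") -> Counter:
    """Count gene copies per species.

    Assumes (hypothetically) that each leaf is labeled 'species<sep>copyid'.
    """
    return Counter(label.split(sep)[0] for label in leaf_labels(newick))


def is_multi_copy(newick: str, sep: str = "_") -> bool:
    """True if any species contributes more than one leaf to the tree."""
    return any(n > 1 for n in copies_per_species(newick, sep).values())


# Toy inputs: species A has two copies in the first tree (a duplication).
family_tree = "((A_1:0.1,B_1:0.2):0.05,(A_2:0.1,C_1:0.3):0.05);"
single_copy_tree = "((A_1,B_1),C_1);"

print(is_multi_copy(family_tree))       # True
print(is_multi_copy(single_copy_tree))  # False
```

In a pipeline of the kind the abstract describes, a check like this would only separate trees that need decomposition from trees that are already single-copy; the decomposition and rooting steps themselves are performed by DISCO and QR.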
Original language | English (US) |
---|---|
Article number | vbad015 |
Journal | Bioinformatics Advances |
Volume | 3 |
Issue number | 1 |
State | Published - Jan 5 2023 |
ASJC Scopus subject areas
- Genetics
- Molecular Biology
- Structural Biology
- Computer Science Applications
Datasets
- Data from: DISCO+QR: Rooting Species Trees in the Presence of GDL and ILS
  Willson, J. (Creator), Tabatabaee, Y. (Creator), Liu, B. (Creator) & Warnow, T. (Creator), University of Illinois Urbana-Champaign, Feb 7 2023
  DOI: 10.13012/B2IDB-5748609_V1
  Dataset