Abstract
Gene tree correction aims to improve the accuracy of a gene tree by using computational techniques along with a reference tree (and in some cases available sequence data). It is an active area of research when dealing with gene tree heterogeneity due to gene duplication and loss (GDL). Here, we study the problem of gene tree correction where gene tree heterogeneity is instead due to incomplete lineage sorting (ILS, a common problem in eukaryotic phylogenetics) and horizontal gene transfer (HGT, a common problem in bacterial phylogenetics). We introduce TRACTION, a simple polynomial-time method that provably finds an optimal solution to the RF-Optimal Tree Refinement and Completion Problem, which seeks a refinement and completion of an input tree t with respect to a given binary tree T so as to minimize the Robinson-Foulds (RF) distance. We present the results of an extensive simulation study evaluating TRACTION within gene tree correction pipelines on 68,000 estimated gene trees, using estimated species trees as reference trees. We explore accuracy under conditions with varying levels of gene tree heterogeneity due to ILS and HGT. We show that TRACTION matches or improves on the accuracy of well-established methods from the GDL literature under conditions with HGT and ILS, and ties for best under the ILS-only conditions. Furthermore, TRACTION ties for fastest on these datasets. TRACTION is available at https://github.com/pranjalv123/TRACTION-RF and the study datasets are available at https://doi.org/10.13012/B2IDB-1747658_V1.
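To make the objective in the abstract concrete, the following minimal sketch computes the RF distance between two trees on the same leaf set as the number of non-trivial bipartitions found in exactly one of them. It is an illustration only and not the TRACTION implementation: the nested-tuple tree encoding and the function names `leaves`, `bipartitions`, and `rf_distance` are assumptions made for this example, and the completion step (adding missing leaves) is not covered.

```python
# Illustrative sketch only; not taken from the TRACTION code base.
# Trees are nested tuples with string leaf labels (a hypothetical encoding).

def leaves(tree):
    """Return the set of leaf labels below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def bipartitions(tree):
    """Return the non-trivial bipartitions of the tree, viewed as unrooted.

    Each bipartition is stored as the side that does NOT contain a fixed
    anchor leaf, so the same split always gets the same representation.
    """
    all_leaves = leaves(tree)
    anchor = min(all_leaves)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = below if anchor not in below else all_leaves - below
        # Keep only splits with at least two leaves on each side.
        if 2 <= len(side) <= len(all_leaves) - 2:
            splits.add(side)
        return below

    visit(tree)
    return splits


def rf_distance(t1, t2):
    """Count bipartitions present in exactly one of the two trees."""
    if leaves(t1) != leaves(t2):
        raise ValueError("trees must have the same leaf set")
    return len(bipartitions(t1) ^ bipartitions(t2))


# Binary reference tree T and an unresolved input tree t (a polytomy on A, B, C).
T = ((("A", "B"), "C"), ("D", "E"))
t = ("A", "B", "C", ("D", "E"))

# Two possible refinements of t: one agrees with T, one does not.
good_refinement = ((("A", "B"), "C"), ("D", "E"))
poor_refinement = ((("A", "C"), "B"), ("D", "E"))

print(rf_distance(T, t))                # 1
print(rf_distance(T, good_refinement))  # 0
print(rf_distance(T, poor_refinement))  # 2
```

In this toy case, resolving the polytomy in t the way T does brings the RF distance to 0, while a different resolution raises it to 2; the refinement problem asks for the resolution with the smallest distance.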
Original language | English (US) |
---|---|
Title of host publication | 19th International Workshop on Algorithms in Bioinformatics, WABI 2019 |
Editors | Katharina T. Huber, Dan Gusfield |
Publisher | Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing |
ISBN (Electronic) | 9783959771238 |
State | Published - Sep 2019 |
Event | 19th International Workshop on Algorithms in Bioinformatics, WABI 2019 - Niagara Falls, United States; Duration: Sep 8 2019 → Sep 10 2019 |
Publication series
Name | Leibniz International Proceedings in Informatics, LIPIcs |
---|---|
Volume | 143 |
ISSN (Print) | 1868-8969 |
Conference
Conference | 19th International Workshop on Algorithms in Bioinformatics, WABI 2019 |
---|---|
Country/Territory | United States |
City | Niagara Falls |
Period | 9/8/19 → 9/10/19 |
Keywords
- Gene tree correction
- Horizontal gene transfer
- Incomplete lineage sorting
ASJC Scopus subject areas
- Software
Datasets
- Data from TRACTION: Fast non-parametric improvement of estimated gene trees
  Christensen, S. (Creator), Molloy, E. K. (Creator), Vachaspati, P. (Creator) & Warnow, T. (Creator), University of Illinois Urbana-Champaign, Jul 29 2019
  DOI: 10.13012/B2IDB-1747658_V1
  Dataset