Gene tree correction aims to improve the accuracy of a gene tree by using computational techniques along with a reference tree (and in some cases available sequence data). It is an active area of research when dealing with gene tree heterogeneity due to duplication and loss (GDL). Here, we study the problem of gene tree correction where gene tree heterogeneity is instead due to incomplete lineage sorting (ILS, a common problem in eukaryotic phylogenetics) and horizontal gene transfer (HGT, a common problem in bacterial phylogenetics). We introduce TRACTION, a simple polynomial-time method that provably finds an optimal solution to the RF-Optimal Tree Refinement and Completion Problem, which seeks a refinement and completion of an input tree t with respect to a given binary tree T so as to minimize the Robinson-Foulds (RF) distance to T. We present the results of an extensive simulation study evaluating TRACTION within gene tree correction pipelines on 68,000 estimated gene trees, using estimated species trees as reference trees. We explore accuracy under conditions with varying levels of gene tree heterogeneity due to ILS and HGT. We show that TRACTION matches or improves the accuracy of well-established methods from the GDL literature under conditions with HGT and ILS, and ties for best under the ILS-only conditions. Furthermore, TRACTION ties for fastest on these datasets. TRACTION is available at https://github.com/pranjalv123/TRACTION-RF and the study datasets are available at https://doi.org/10.13012/B2IDB-1747658_V1.
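To make the optimization criterion concrete, the following is a minimal sketch of the Robinson-Foulds distance between two trees on the same leaf set, computed as the number of non-trivial bipartitions found in exactly one tree. It is not the TRACTION algorithm and is not taken from the TRACTION repository; the nested-tuple tree encoding and the function names `leaf_set`, `bipartitions`, and `rf_distance` are assumptions made for illustration only.

```python
def leaf_set(tree):
    """Return the set of leaf labels in a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Collect the non-trivial bipartitions of a tree, viewed as unrooted."""
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)  # fixed leaf used to choose one canonical side
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        other = all_leaves - below
        # Skip trivial splits, where one side has fewer than two leaves.
        if len(below) >= 2 and len(other) >= 2:
            # Store the side that does not contain the anchor leaf.
            found.add(other if anchor in below else below)
        return below

    visit(tree)
    return found


def rf_distance(t1, t2):
    """Count bipartitions present in exactly one of the two trees."""
    return len(bipartitions(t1) ^ bipartitions(t2))


# Hypothetical example trees: a binary reference, a conflicting binary tree,
# and an unresolved tree with a polytomy on A, B, C.
reference = ((("A", "B"), "C"), ("D", "E"))
conflicting = ((("A", "C"), "B"), ("D", "E"))
unresolved = (("A", "B", "C"), ("D", "E"))

print(rf_distance(reference, conflicting))
print(rf_distance(reference, unresolved))
```

In this toy setting, the unresolved tree differs from the reference only by the one bipartition it is missing, so resolving its polytomy in agreement with the reference would bring the distance to zero. That is the kind of improvement a refinement step aims for, although the sketch above only measures the distance and does not perform any refinement or completion.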

Original language: English (US)
Title of host publication: 19th International Workshop on Algorithms in Bioinformatics, WABI 2019
Editors: Katharina T. Huber, Dan Gusfield
Publisher: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
ISBN (Electronic): 9783959771238
State: Published - Sep 2019
Event: 19th International Workshop on Algorithms in Bioinformatics, WABI 2019 - Niagara Falls, United States
Duration: Sep 8 2019 to Sep 10 2019

Publication series

Name: Leibniz International Proceedings in Informatics, LIPIcs
ISSN (Print): 1868-8969


Conference: 19th International Workshop on Algorithms in Bioinformatics, WABI 2019
Country/Territory: United States
City: Niagara Falls


Keywords

  • Gene tree correction
  • Horizontal gene transfer
  • Incomplete lineage sorting

ASJC Scopus subject areas

  • Software


Title: Traction: Fast non-parametric improvement of estimated gene trees
