Barking up the wrong treelength: The impact of gap penalty on alignment and tree accuracy

Kevin Liu, Serita Nelesen, Sindhu Raghavan, C. Randal Linder, Tandy Warnow

Research output: Contribution to journal › Article

Abstract

Several methods have been developed for simultaneous estimation of alignment and tree, of which POY is the most popular. In a 2007 paper published in Systematic Biology, Ogden and Rosenberg reported on a simulation study in which they compared POY to estimating the alignment using ClustalW and then analyzing the resultant alignment using maximum parsimony. They found that ClustalW+MP outperformed POY with respect to alignment and phylogenetic tree accuracy, and they concluded that simultaneous estimation techniques are not competitive with two-phase techniques. Our paper presents a simulation study in which we focus on the NP-hard optimization problem that POY addresses: minimizing treelength. Our study considers the impact of the gap penalty and suggests that the poor performance observed for POY by Ogden and Rosenberg is due to the simple gap penalties they used to score alignment/tree pairs. Our study suggests that optimizing under an affine gap penalty might produce alignments that are better than ClustalW alignments, and competitive with those produced by the best current alignment methods. We also show that optimizing under this affine gap penalty produces trees whose topological accuracy is better than ClustalW+MP, and competitive with the current best two-phase methods.
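The difference between a simple and an affine gap penalty can be illustrated with a minimal sketch. It scores a single pairwise alignment, as one would for one edge of a tree when summing a treelength. The function names, the cost values, and the convention that a gap of length n costs gap_open + gap_extend * n are assumptions chosen for illustration; they are not the parameters or the implementation used in the study or in POY.

```python
def gap_runs(row_a, row_b):
    """Return the lengths of maximal gap runs in a pairwise alignment.

    A run is a stretch of consecutive columns with a gap in the same row.
    Columns where both rows hold a gap are skipped.
    """
    runs = []
    current_row = None  # row (0 or 1) holding the gap run in progress
    length = 0
    for a, b in zip(row_a, row_b):
        if a == "-" and b == "-":
            continue
        gap_row = 0 if a == "-" else 1 if b == "-" else None
        if gap_row is not None and gap_row == current_row:
            length += 1  # the current run continues
        else:
            if current_row is not None:
                runs.append(length)  # close the previous run
            current_row = gap_row
            length = 1 if gap_row is not None else 0
    if current_row is not None:
        runs.append(length)
    return runs


def alignment_cost(row_a, row_b, mismatch, gap_open, gap_extend):
    """Cost of a pairwise alignment under an (assumed) affine gap model.

    Setting gap_open=0 reduces this to a simple per-character gap penalty.
    """
    substitutions = sum(
        1 for a, b in zip(row_a, row_b)
        if a != "-" and b != "-" and a != b
    )
    gaps = sum(gap_open + gap_extend * n for n in gap_runs(row_a, row_b))
    return mismatch * substitutions + gaps


# Hypothetical example alignment and cost values.
row_a = "AC--GT"
row_b = "ACTTG-"
simple_cost = alignment_cost(row_a, row_b, mismatch=1, gap_open=0, gap_extend=1)
affine_cost = alignment_cost(row_a, row_b, mismatch=1, gap_open=4, gap_extend=1)
```

Under the simple penalty each gap character costs the same wherever it falls, whereas the affine penalty charges extra for every new gap opened, so it favors alignments with fewer, longer gaps.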

Original language: English (US)
Article number: 4547425
Pages (from-to): 7-21
Number of pages: 15
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 6
Issue number: 1
DOIs
State: Published - Jan 1 2009

Keywords

  • Biology and genetics
  • Markov processes

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics
  • Medicine(all)
