Abstract

The Maximum Weight Trace (MWT) is an optimization problem for multiple sequence alignment that takes a set of sequences and weights on pairs of letters from different sequences and seeks a multiple sequence alignment that maximizes the sum of the weights for the pairs of letters that appear in the same column. MWT was introduced by Kececioglu in 1993, then proven to be NP-hard, and heuristics and exact solutions for MWT developed. Unfortunately none of the MWT methods are scalable to even moderate-sized datasets. Here we propose the MWT-AM problem (MWT for Alignment Merging), an extension of the MWT problem to be used in a divide-and-conquer setting, where we seek a merged alignment of a set of disjoint alignments that optimizes the MWT score. We present variations of GCM (the Graph Clustering Merger, originally developed for the MAGUS multiple sequence alignment method) that are specifically designed for MWT-AM. We show that the best of these variants, which we refer to as GCM-MWT, perform well for the MWT-AM criterion. We explore GCM-MWT in comparison to other methods for merging alignments, T-coffee and MAFFT–merge, and find that GCM-MWT produces more accurate merged alignments. GCM-MWT is available in open source form at https://github.com/vlasmirnov/MAGUS.

Original languageEnglish (US)
Title of host publicationAlgorithms for Computational Biology - 8th International Conference, AlCoB 2021, Proceedings
EditorsCarlos Martín-Vide, Miguel A. Vega-Rodríguez, Travis Wheeler
PublisherSpringer
Pages159-171
Number of pages13
ISBN (Print)9783030744311
DOIs
StatePublished - 2021
Event8th International Conference on Algorithms for Computational Biology, AlCoB 2021 - Missoula, United States
Duration: Jun 7 2021Jun 11 2021

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume12715 LNBI
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference8th International Conference on Algorithms for Computational Biology, AlCoB 2021
Country/TerritoryUnited States
CityMissoula
Period6/7/216/11/21

Keywords

  • Clustering
  • Maximum Weight Trace
  • Multiple sequence alignment

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Fingerprint

Dive into the research topics of 'The Maximum Weight Trace Alignment Merging Problem'. Together they form a unique fingerprint.

Cite this