Indexing strategies for rapid searches of short words in genome sequences

  • Christian Iseli
  • Giovanna Ambrosini
  • Philipp Bucher
  • C. Victor Jongeneel

Research output: Contribution to journal › Article › peer-review

Abstract

Searching for matches between large collections of short (14-30 nucleotides) words and sequence databases comprising full genomes or transcriptomes is a common task in biological sequence analysis. We investigated the performance of simple indexing strategies for handling such tasks and developed two programs, fetchGWI and tagger, that index either the database or the query set. Either strategy outperforms megablast for searches with more than 10,000 probes. FetchGWI is shown to be a versatile tool for rapidly searching multiple genomes, whose performance is limited in most cases by the speed of access to the filesystem. We have made publicly available a Web interface for searching the human, mouse, and several other genomes and transcriptomes with oligonucleotide queries.
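
The two strategies named in the abstract, indexing the database versus indexing the query set, can be illustrated with a minimal Python sketch. This is not the fetchGWI or tagger implementation, whose internals are not described here. It assumes exact matches only, a single strand, probes that all share one fixed length `k`, and sequences small enough to hold in memory; all function names are hypothetical.

```python
from collections import defaultdict


def index_database(genome: str, k: int) -> dict:
    """Database-side indexing: map every k-length word to its start positions."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    return index


def search_with_database_index(index: dict, probes: list) -> dict:
    """Look up each probe in a prebuilt genome index (one lookup per probe)."""
    # .get() avoids inserting empty entries for probes that never occur.
    return {probe: list(index.get(probe, [])) for probe in probes}


def search_with_query_index(genome: str, probes: list, k: int) -> dict:
    """Query-side indexing: hash the probes, then scan the genome once."""
    hits = {probe: [] for probe in probes}
    for pos in range(len(genome) - k + 1):
        word = genome[pos:pos + k]
        if word in hits:
            hits[word].append(pos)
    return hits


# Toy data, invented for illustration only.
genome = "ACGTACGTTT"
probes = ["ACGT", "GTTT", "AAAA"]
k = 4

by_database = search_with_database_index(index_database(genome, k), probes)
by_query = search_with_query_index(genome, probes, k)
```

In this sketch the database index is built once and can be reused across many query sets, while the query index is rebuilt per search but requires only a single pass over the sequence. Which trade-off wins in practice depends on the number of probes and on I/O cost.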

Original language: English (US)
Article number: e579
Journal: PLoS ONE
Volume: 2
Issue number: 6
Early online date: Jun 27 2007
State: Published - Jun 27 2007
Externally published: Yes

ASJC Scopus subject areas

  • General
