TIGER: tiled iterative genome assembler.

Xiao Long Wu, Yun Heo, Izzat El Hajj, Wen Mei Hwu, Deming Chen, Jian Ma

Research output: Contribution to journal, Article

Abstract

With the falling cost of next-generation sequencing (NGS) technologies, genomics offers an unprecedented opportunity to address fundamental questions in biology and to elucidate human diseases. De novo genome assembly is one of the most important steps in reconstructing a sequenced genome. However, most de novo assemblers require an enormous amount of computational resources, which is out of reach for most research groups and medical personnel. We have developed a novel de novo assembly framework, called Tiger, which adapts to the available computing resources by iteratively decomposing the assembly problem into sub-problems. The method is also flexible enough to embed different assemblers for various types of target genomes. Using sequence data from a human chromosome, our results show that, compared to Velvet and SOAPdenovo, Tiger can achieve much better NG50s and better genome coverage, with slightly higher errors, while using a modest amount of memory of the kind available in commodity computers today. Most state-of-the-art assemblers that achieve relatively high assembly quality need an excessive amount of computing resources (in particular, memory) that is not available to most researchers. Tiger provides the only known viable path to using NGS de novo assemblers that require more memory than is present in the available computers. Evaluation results demonstrate the feasibility of obtaining better-quality results with a low memory footprint, and the scalability of using distributed commodity computers.
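
The abstract reports assembly quality in terms of NG50. As a minimal sketch of how that metric is commonly defined (this is not code from the Tiger authors; the function name `ng50` and the toy contig lengths are assumptions made for illustration), NG50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the genome size. It differs from N50 only in using the genome size, rather than the total assembly size, as the reference.

```python
def ng50(contig_lengths, genome_size):
    """Return NG50 for a set of contig lengths.

    NG50 is the largest length L such that contigs of length >= L
    together cover at least half of genome_size. Returns 0 when the
    assembly never reaches half of the genome size.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    covered = 0
    # Walk contigs from longest to shortest, accumulating covered bases.
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if 2 * covered >= genome_size:
            return length
    return 0


# Hypothetical toy data, not results from the paper.
contigs = [80, 70, 50, 40, 30, 20]
print(ng50(contigs, 400))           # NG50 against an assumed genome size of 400
print(ng50(contigs, sum(contigs)))  # using the assembly size instead gives N50
```

Because the reference is the genome size, NG50 values from different assemblers run on the same genome are directly comparable, even when the assemblies differ in total length.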

Original language: English (US)
Journal: Unknown Journal
Volume: 13 Suppl 19
DOI: 10.1186/1471-2105-13-S19-S18
State: Published - 2012
Externally published: Yes

Fingerprint

Tigers
Genome
Genes
Data storage equipment
Sequencing
Resources
Computing
Human Chromosomes
Chromosomes
Genomics
Cost reduction
Chromosome
Biology
Scalability
Biomedical Research
Coverage
Research Personnel
Personnel
Technology
Costs and Cost Analysis

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

Cite this

TIGER: tiled iterative genome assembler. / Wu, Xiao Long; Heo, Yun; El Hajj, Izzat; Hwu, Wen Mei; Chen, Deming; Ma, Jian.

In: Unknown Journal, Vol. 13 Suppl 19, 2012.

@article{1b566222789d455aae1cdb984150abc0,
title = "TIGER: tiled iterative genome assembler.",
author = "Wu, {Xiao Long} and Yun Heo and {El Hajj}, Izzat and Hwu, {Wen Mei} and Deming Chen and Jian Ma",
year = "2012",
doi = "10.1186/1471-2105-13-S19-S18",
language = "English (US)",
volume = "13 Suppl 19",
journal = "[No source information available]",
issn = "1001-0742",
publisher = "Chinese Academy of Sciences",

}

TY - JOUR

T1 - TIGER

T2 - tiled iterative genome assembler.

AU - Wu, Xiao Long

AU - Heo, Yun

AU - El Hajj, Izzat

AU - Hwu, Wen Mei

AU - Chen, Deming

AU - Ma, Jian

PY - 2012

Y1 - 2012

UR - http://www.scopus.com/inward/record.url?scp=84878082882&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84878082882&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-13-S19-S18

DO - 10.1186/1471-2105-13-S19-S18

M3 - Article

C2 - 23281792

AN - SCOPUS:84878082882

VL - 13 Suppl 19

JO - [No source information available]

JF - [No source information available]

SN - 1001-0742

ER -