TY - JOUR
T1 - TIGER
T2 - tiled iterative genome assembler.
AU - Wu, Xiao Long
AU - Heo, Yun
AU - El Hajj, Izzat
AU - Hwu, Wen Mei
AU - Chen, Deming
AU - Ma, Jian
N1 - Funding Information:
We thank the National Center for Supercomputing Applications at the University of Illinois for providing computational resources. We also thank anonymous reviewers for helpful suggestions. This work was supported by a Samsung Fellowship (YH), NSF CAREER award #1054309 (JM), NIH/NHGRI grant 1R21HG006464 (JM), and an In3 award from the University of Illinois (JM, DC, WMH). This article has been published as part of BMC Bioinformatics Volume 13 Supplement 19, 2012: Proceedings of the Tenth Annual Research in Computational Molecular Biology (RECOMB) Satellite Workshop on Comparative Genomics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/ 13/S19.
PY - 2012
Y1 - 2012
N2 - With the cost reduction of the next-generation sequencing (NGS) technologies, genomics has provided us with an unprecedented opportunity to understand fundamental questions in biology and elucidate human diseases. De novo genome assembly is one of the most important steps to reconstruct the sequenced genome. However, most de novo assemblers require enormous amount of computational resource, which is not accessible for most research groups and medical personnel. We have developed a novel de novo assembly framework, called Tiger, which adapts to available computing resources by iteratively decomposing the assembly problem into sub-problems. Our method is also flexible to embed different assemblers for various types of target genomes. Using the sequence data from a human chromosome, our results show that Tiger can achieve much better NG50s, better genome coverage, and slightly higher errors, as compared to Velvet and SOAPdenovo, using modest amount of memory that are available in commodity computers today. Most state-of-the-art assemblers that can achieve relatively high assembly quality need excessive amount of computing resource (in particular, memory) that is not available to most researchers to achieve high quality results. Tiger provides the only known viable path to utilize NGS de novo assemblers that require more memory than that is present in available computers. Evaluation results demonstrate the feasibility of getting better quality results with low memory footprint and the scalability of using distributed commodity computers.
AB - With the cost reduction of the next-generation sequencing (NGS) technologies, genomics has provided us with an unprecedented opportunity to understand fundamental questions in biology and elucidate human diseases. De novo genome assembly is one of the most important steps to reconstruct the sequenced genome. However, most de novo assemblers require enormous amount of computational resource, which is not accessible for most research groups and medical personnel. We have developed a novel de novo assembly framework, called Tiger, which adapts to available computing resources by iteratively decomposing the assembly problem into sub-problems. Our method is also flexible to embed different assemblers for various types of target genomes. Using the sequence data from a human chromosome, our results show that Tiger can achieve much better NG50s, better genome coverage, and slightly higher errors, as compared to Velvet and SOAPdenovo, using modest amount of memory that are available in commodity computers today. Most state-of-the-art assemblers that can achieve relatively high assembly quality need excessive amount of computing resource (in particular, memory) that is not available to most researchers to achieve high quality results. Tiger provides the only known viable path to utilize NGS de novo assemblers that require more memory than that is present in available computers. Evaluation results demonstrate the feasibility of getting better quality results with low memory footprint and the scalability of using distributed commodity computers.
UR - http://www.scopus.com/inward/record.url?scp=84878082882&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84878082882&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-13-S19-S18
DO - 10.1186/1471-2105-13-S19-S18
M3 - Article
C2 - 23281792
AN - SCOPUS:84878082882
SN - 1001-0742
VL - 13 Suppl 19
SP - S18
JO - [No source information available]
JF - [No source information available]
ER -