Effects of short read quality and quantity on a de novo vertebrate transcriptome assembly

T. I. Garcia, Y. Shen, J. Catchen, A. Amores, M. Schartl, J. Postlethwait, R. B. Walter

Research output: Contribution to journal › Article › peer-review

Abstract

For many researchers, next-generation sequencing data holds the key to answering a category of questions previously unassailable. One of the important and challenging steps in achieving these goals is accurately assembling the massive quantity of short sequencing reads into full nucleic acid sequences. For research groups working with non-model or wild systems, short read assembly can pose a significant challenge due to the lack of pre-existing EST or genome reference libraries. While many publications describe the overall process of sequencing and assembly, few address the topic of how many and what types of reads are best for assembly. The goal of this project was to use real-world data to explore the effects of read quantity and short read quality scores on the resulting de novo assemblies. Using several samples of short reads of various sizes and qualities, we produced many assemblies in an automated manner. We observe how the properties of read length, read quality, and read quantity affect the resulting assemblies and provide some general recommendations based on our real-world data set.

Original language: English (US)
Pages (from-to): 95-101
Number of pages: 7
Journal: Comparative Biochemistry and Physiology - C Toxicology and Pharmacology
Volume: 155
Issue number: 1
State: Published - Jan 2012
Externally published: Yes

Keywords

  • Assembly
  • NGS
  • Phred
  • Quality
  • Quantity
  • Short read
  • Velvet

ASJC Scopus subject areas

  • Biochemistry
  • Physiology
  • Toxicology
  • Cell Biology
  • Health, Toxicology and Mutagenesis
