TY - GEN
T1 - Bermuda
T2 - 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, BCB 2015
AU - Tang, Qingming
AU - Wang, Sheng
AU - Peng, Jian
AU - Ma, Jianzhu
AU - Xu, Jinbo
N1 - Publisher Copyright: Copyright 2015 ACM.
PY - 2015/9/9
Y1 - 2015/9/9
N2 - Motivation: RNA-seq has made feasible the analysis of a whole set of expressed mRNAs. Mapping-based assembly of RNA-seq reads sometimes is infeasible due to lack of high-quality references. However, de novo assembly is very challenging due to uneven expression levels among transcripts and also the read coverage variation within a single transcript. Existing methods either apply de Bruijn graphs of single-sized k-mers to assemble the full set of transcripts, or conduct multiple runs of assembly, but still apply graphs of single-sized k-mers at each run. However, a single k-mer size is not suitable for all the regions of the transcripts with varied coverage. Contribution: This paper presents a de novo assembler Bermuda with new insights for handling uneven coverage. Opposed to existing methods that use a single k-mer size for all the transcripts in each run of assembly, Bermuda self-adaptively uses a few k-mer sizes to assemble different regions of a single transcript according to their local coverage. As such, Bermuda can deal with uneven expression levels and coverage not only among transcripts, but also within a single transcript. Extensive tests show that Bermuda outperforms popular de novo assemblers in reconstructing unevenly-expressed transcripts with longer length, better contiguity and lower redundancy. Further, Bermuda is computationally efficient with moderate memory consumption. Availability: Supplementary materials are available through http://ttic.uchicago.edu/~qmtang/.
AB - Motivation: RNA-seq has made feasible the analysis of a whole set of expressed mRNAs. Mapping-based assembly of RNA-seq reads sometimes is infeasible due to lack of high-quality references. However, de novo assembly is very challenging due to uneven expression levels among transcripts and also the read coverage variation within a single transcript. Existing methods either apply de Bruijn graphs of single-sized k-mers to assemble the full set of transcripts, or conduct multiple runs of assembly, but still apply graphs of single-sized k-mers at each run. However, a single k-mer size is not suitable for all the regions of the transcripts with varied coverage. Contribution: This paper presents a de novo assembler Bermuda with new insights for handling uneven coverage. Opposed to existing methods that use a single k-mer size for all the transcripts in each run of assembly, Bermuda self-adaptively uses a few k-mer sizes to assemble different regions of a single transcript according to their local coverage. As such, Bermuda can deal with uneven expression levels and coverage not only among transcripts, but also within a single transcript. Extensive tests show that Bermuda outperforms popular de novo assemblers in reconstructing unevenly-expressed transcripts with longer length, better contiguity and lower redundancy. Further, Bermuda is computationally efficient with moderate memory consumption. Availability: Supplementary materials are available through http://ttic.uchicago.edu/~qmtang/.
KW - De novo assembly
KW - Multiple k-mer
KW - RNA-Seq
KW - Uneven coverage
UR - http://www.scopus.com/inward/record.url?scp=84963577300&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84963577300&partnerID=8YFLogxK
U2 - 10.1145/2808719.2808736
DO - 10.1145/2808719.2808736
M3 - Conference contribution
AN - SCOPUS:84963577300
T3 - BCB 2015 - 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
SP - 166
EP - 175
BT - BCB 2015 - 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
PB - Association for Computing Machinery, Inc
Y2 - 9 September 2015 through 12 September 2015
ER -