Computational analysis of bacterial RNA-Seq data

Ryan McClure, Divya Balasubramanian, Yan Sun, Maksym Bobrovskyy, Paul Sumby, Caroline A. Genco, Carin K. Vanderpool, Brian Tjaden

Research output: Contribution to journal (Article)

Abstract

Recent advances in high-throughput RNA sequencing (RNA-seq) have enabled tremendous leaps forward in our understanding of bacterial transcriptomes. However, computational methods for analysis of bacterial transcriptome data have not kept pace with the large and growing data sets generated by RNA-seq technology. Here, we present new algorithms, specific to bacterial gene structures and transcriptomes, for analysis of RNA-seq data. The algorithms are implemented in an open source software system called Rockhopper that supports various stages of bacterial RNA-seq data analysis, including aligning sequencing reads to a genome, constructing transcriptome maps, quantifying transcript abundance, testing for differential gene expression, determining operon structures and visualizing results. We demonstrate the performance of Rockhopper using 2.1 billion sequenced reads from 75 RNA-seq experiments conducted with Escherichia coli, Neisseria gonorrhoeae, Salmonella enterica, Streptococcus pyogenes and Xenorhabdus nematophila. We find that the transcriptome maps generated by our algorithms are highly accurate when compared with focused experimental data from E. coli and N. gonorrhoeae, and we validate our system's ability to identify novel small RNAs, operons and transcription start sites. Our results suggest that Rockhopper can be used for efficient and accurate analysis of bacterial RNA-seq data, and that it can aid with elucidation of bacterial transcriptomes.

Original language: English (US)
Pages (from-to): e140
Journal: Nucleic Acids Research
Volume: 41
Issue number: 14
DOI: https://doi.org/10.1093/nar/gkt444
State: Published - Aug 1 2013

Fingerprint

Bacterial RNA
RNA Sequence Analysis
Transcriptome
Neisseria gonorrhoeae
Gene Expression Profiling
Operon
Xenorhabdus
Escherichia coli
Bacterial Structures
Bacterial Genes
High-Throughput Nucleotide Sequencing
Salmonella enterica
Streptococcus pyogenes
Transcription Initiation Site
Software
Genome
RNA
Technology
Gene Expression

ASJC Scopus subject areas

  • Genetics

Cite this

McClure, R., Balasubramanian, D., Sun, Y., Bobrovskyy, M., Sumby, P., Genco, C. A., ... Tjaden, B. (2013). Computational analysis of bacterial RNA-Seq data. Nucleic Acids Research, 41(14), e140. https://doi.org/10.1093/nar/gkt444

Computational analysis of bacterial RNA-Seq data. / McClure, Ryan; Balasubramanian, Divya; Sun, Yan; Bobrovskyy, Maksym; Sumby, Paul; Genco, Caroline A.; Vanderpool, Carin K.; Tjaden, Brian.

In: Nucleic Acids Research, Vol. 41, No. 14, 01.08.2013, p. e140.

McClure, R, Balasubramanian, D, Sun, Y, Bobrovskyy, M, Sumby, P, Genco, CA, Vanderpool, CK & Tjaden, B 2013, 'Computational analysis of bacterial RNA-Seq data', Nucleic Acids Research, vol. 41, no. 14, p. e140. https://doi.org/10.1093/nar/gkt444

McClure R, Balasubramanian D, Sun Y, Bobrovskyy M, Sumby P, Genco CA et al. Computational analysis of bacterial RNA-Seq data. Nucleic Acids Research. 2013 Aug 1;41(14):e140. https://doi.org/10.1093/nar/gkt444

McClure, Ryan; Balasubramanian, Divya; Sun, Yan; Bobrovskyy, Maksym; Sumby, Paul; Genco, Caroline A.; Vanderpool, Carin K.; Tjaden, Brian. / Computational analysis of bacterial RNA-Seq data. In: Nucleic Acids Research. 2013; Vol. 41, No. 14. p. e140.
@article{b8508c8960c64ca7b2e66420dd0c5a4b,
title = "Computational analysis of bacterial RNA-Seq data",
abstract = "Recent advances in high-throughput RNA sequencing (RNA-seq) have enabled tremendous leaps forward in our understanding of bacterial transcriptomes. However, computational methods for analysis of bacterial transcriptome data have not kept pace with the large and growing data sets generated by RNA-seq technology. Here, we present new algorithms, specific to bacterial gene structures and transcriptomes, for analysis of RNA-seq data. The algorithms are implemented in an open source software system called Rockhopper that supports various stages of bacterial RNA-seq data analysis, including aligning sequencing reads to a genome, constructing transcriptome maps, quantifying transcript abundance, testing for differential gene expression, determining operon structures and visualizing results. We demonstrate the performance of Rockhopper using 2.1 billion sequenced reads from 75 RNA-seq experiments conducted with Escherichia coli, Neisseria gonorrhoeae, Salmonella enterica, Streptococcus pyogenes and Xenorhabdus nematophila. We find that the transcriptome maps generated by our algorithms are highly accurate when compared with focused experimental data from E. coli and N. gonorrhoeae, and we validate our system's ability to identify novel small RNAs, operons and transcription start sites. Our results suggest that Rockhopper can be used for efficient and accurate analysis of bacterial RNA-seq data, and that it can aid with elucidation of bacterial transcriptomes.",
author = "Ryan McClure and Divya Balasubramanian and Yan Sun and Maksym Bobrovskyy and Paul Sumby and Genco, {Caroline A.} and Vanderpool, {Carin K.} and Brian Tjaden",
year = "2013",
month = "8",
day = "1",
doi = "10.1093/nar/gkt444",
language = "English (US)",
volume = "41",
pages = "e140",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "14",

}

TY - JOUR

T1 - Computational analysis of bacterial RNA-Seq data

AU - McClure, Ryan

AU - Balasubramanian, Divya

AU - Sun, Yan

AU - Bobrovskyy, Maksym

AU - Sumby, Paul

AU - Genco, Caroline A.

AU - Vanderpool, Carin K.

AU - Tjaden, Brian

PY - 2013/8/1

Y1 - 2013/8/1

N2 - Recent advances in high-throughput RNA sequencing (RNA-seq) have enabled tremendous leaps forward in our understanding of bacterial transcriptomes. However, computational methods for analysis of bacterial transcriptome data have not kept pace with the large and growing data sets generated by RNA-seq technology. Here, we present new algorithms, specific to bacterial gene structures and transcriptomes, for analysis of RNA-seq data. The algorithms are implemented in an open source software system called Rockhopper that supports various stages of bacterial RNA-seq data analysis, including aligning sequencing reads to a genome, constructing transcriptome maps, quantifying transcript abundance, testing for differential gene expression, determining operon structures and visualizing results. We demonstrate the performance of Rockhopper using 2.1 billion sequenced reads from 75 RNA-seq experiments conducted with Escherichia coli, Neisseria gonorrhoeae, Salmonella enterica, Streptococcus pyogenes and Xenorhabdus nematophila. We find that the transcriptome maps generated by our algorithms are highly accurate when compared with focused experimental data from E. coli and N. gonorrhoeae, and we validate our system's ability to identify novel small RNAs, operons and transcription start sites. Our results suggest that Rockhopper can be used for efficient and accurate analysis of bacterial RNA-seq data, and that it can aid with elucidation of bacterial transcriptomes.

UR - http://www.scopus.com/inward/record.url?scp=84881504578&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84881504578&partnerID=8YFLogxK

U2 - 10.1093/nar/gkt444

DO - 10.1093/nar/gkt444

M3 - Article

C2 - 23716638

AN - SCOPUS:84881504578

VL - 41

SP - e140

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - 14

ER -