The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST)

Ross Overbeek, Robert Olson, Gordon D. Pusch, Gary J. Olsen, James J. Davis, Terry Disz, Robert A. Edwards, Svetlana Gerdes, Bruce Parrello, Maulik Shukla, Veronika Vonstein, Alice R. Wattam, Fangfang Xia, Rick Stevens

Research output: Contribution to journal › Article

Abstract

In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources.

Original language: English (US)
Pages (from-to): D206-D214
Journal: Nucleic acids research
Volume: 42
Issue number: D1
DOI: https://doi.org/10.1093/nar/gkt1226
State: Published - Jan 1 2014

Fingerprint

Microbial Genome
Genome
Technology
Databases
Proteins
Computational Biology
Genes

ASJC Scopus subject areas

  • Genetics

Cite this

Overbeek, R., Olson, R., Pusch, G. D., Olsen, G. J., Davis, J. J., Disz, T., ... Stevens, R. (2014). The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic acids research, 42(D1), D206-D214. https://doi.org/10.1093/nar/gkt1226

The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). / Overbeek, Ross; Olson, Robert; Pusch, Gordon D.; Olsen, Gary J.; Davis, James J.; Disz, Terry; Edwards, Robert A.; Gerdes, Svetlana; Parrello, Bruce; Shukla, Maulik; Vonstein, Veronika; Wattam, Alice R.; Xia, Fangfang; Stevens, Rick.

In: Nucleic acids research, Vol. 42, No. D1, 01.01.2014, p. D206-D214.

Overbeek, R, Olson, R, Pusch, GD, Olsen, GJ, Davis, JJ, Disz, T, Edwards, RA, Gerdes, S, Parrello, B, Shukla, M, Vonstein, V, Wattam, AR, Xia, F & Stevens, R 2014, 'The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST)', Nucleic acids research, vol. 42, no. D1, pp. D206-D214. https://doi.org/10.1093/nar/gkt1226
Overbeek, Ross ; Olson, Robert ; Pusch, Gordon D. ; Olsen, Gary J. ; Davis, James J. ; Disz, Terry ; Edwards, Robert A. ; Gerdes, Svetlana ; Parrello, Bruce ; Shukla, Maulik ; Vonstein, Veronika ; Wattam, Alice R. ; Xia, Fangfang ; Stevens, Rick. / The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). In: Nucleic acids research. 2014 ; Vol. 42, No. D1. pp. D206-D214.
@article{ed44ee9cec2046f4ad5bf1e954fd4e71,
title = "The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST)",
abstract = "In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources.",
author = "Ross Overbeek and Robert Olson and Pusch, {Gordon D.} and Olsen, {Gary J.} and Davis, {James J.} and Terry Disz and Edwards, {Robert A.} and Svetlana Gerdes and Bruce Parrello and Maulik Shukla and Veronika Vonstein and Wattam, {Alice R.} and Fangfang Xia and Rick Stevens",
year = "2014",
month = "1",
day = "1",
doi = "10.1093/nar/gkt1226",
language = "English (US)",
volume = "42",
pages = "D206--D214",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "D1",
}

TY - JOUR

T1 - The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST)

AU - Overbeek, Ross

AU - Olson, Robert

AU - Pusch, Gordon D.

AU - Olsen, Gary J.

AU - Davis, James J.

AU - Disz, Terry

AU - Edwards, Robert A.

AU - Gerdes, Svetlana

AU - Parrello, Bruce

AU - Shukla, Maulik

AU - Vonstein, Veronika

AU - Wattam, Alice R.

AU - Xia, Fangfang

AU - Stevens, Rick

PY - 2014/1/1

Y1 - 2014/1/1

AB - In 2004, the SEED (http://pubseed.theseed.org/) was created to provide consistent and accurate genome annotations across thousands of genomes and as a platform for discovering and developing de novo annotations. The SEED is a constantly updated integration of genomic data with a genome database, web front end, API and server scripts. It is used by many scientists for predicting gene functions and discovering new pathways. In addition to being a powerful database for bioinformatics research, the SEED also houses subsystems (collections of functionally related protein families) and their derived FIGfams (protein families), which represent the core of the RAST annotation engine (http://rast.nmpdr.org/). When a new genome is submitted to RAST, genes are called and their annotations are made by comparison to the FIGfam collection. If the genome is made public, it is then housed within the SEED and its proteins populate the FIGfam collection. This annotation cycle has proven to be a robust and scalable solution to the problem of annotating the exponentially increasing number of genomes. To date, >12 000 users worldwide have annotated >60 000 distinct genomes using RAST. Here we describe the interconnectedness of the SEED database and RAST, the RAST annotation pipeline and updates to both resources.

UR - http://www.scopus.com/inward/record.url?scp=84891804612&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84891804612&partnerID=8YFLogxK

U2 - 10.1093/nar/gkt1226

DO - 10.1093/nar/gkt1226

M3 - Article

C2 - 24293654

AN - SCOPUS:84891804612

VL - 42

SP - D206-D214

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - D1

ER -