TY - GEN
T1 - MotifNetwork
T2 - 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE
AU - Tilson, Jeffrey L.
AU - Rendon, Gloria
AU - Ger, Mao Feng
AU - Jakobsson, Eric
PY - 2007
Y1 - 2007
N2 - Traditionally, bioinformatics has been organized around the concepts of genes and gene products, typically proteins. Proteins are represented as sequences of amino acids and are compared with one another through alignment and amino acid similarity. However, proteins contain subsequences that define their activity and mode of regulation. These subsequences are referred to as "domains" and "motifs". For understanding many aspects of gene function, gene interaction, and gene and organism evolution, there is an advantage to focusing analysis on the domain/motif level rather than on the gene level. Such analysis is inherently computationally intensive because of the exponential growth of the protein databases and the combinatorial number of ways in which domains and motifs interact with each other. Here we report, by means of a biological example, on our efforts to build a user-friendly environment for facilitating such analysis. This environment, the MotifNetwork, is an integration effort to build a suite of biologically oriented and grid-enabled workflows for high-throughput domain analysis of protein sequences. Workflow orchestration and enactment are handled with Taverna [Oinn, 2004]. The supporting grid-enabling services used to wrap and invoke the computational applications are implemented with the Generic Service Toolkit (GST) [Kandaswamy, 2006]. The ultimate results of this environment are data products, organized as matrices, and visualization files suitable for quick analysis. Detailed descriptions of data products from a representative biological example are presented. Lastly, some preliminary performance data are presented, including use of the workflow to determine the domain architecture of all proteins in a complete genome (the honeybee). Extension to comprehensive analysis of SNPs in a genome is discussed.
The MotifNetwork workflow is or will soon be available online through the RENCI Science Gateway at http://www.tgbioportal.org/.
AB - Traditionally, bioinformatics has been organized around the concepts of genes and gene products, typically proteins. Proteins are represented as sequences of amino acids and are compared with one another through alignment and amino acid similarity. However, proteins contain subsequences that define their activity and mode of regulation. These subsequences are referred to as "domains" and "motifs". For understanding many aspects of gene function, gene interaction, and gene and organism evolution, there is an advantage to focusing analysis on the domain/motif level rather than on the gene level. Such analysis is inherently computationally intensive because of the exponential growth of the protein databases and the combinatorial number of ways in which domains and motifs interact with each other. Here we report, by means of a biological example, on our efforts to build a user-friendly environment for facilitating such analysis. This environment, the MotifNetwork, is an integration effort to build a suite of biologically oriented and grid-enabled workflows for high-throughput domain analysis of protein sequences. Workflow orchestration and enactment are handled with Taverna [Oinn, 2004]. The supporting grid-enabling services used to wrap and invoke the computational applications are implemented with the Generic Service Toolkit (GST) [Kandaswamy, 2006]. The ultimate results of this environment are data products, organized as matrices, and visualization files suitable for quick analysis. Detailed descriptions of data products from a representative biological example are presented. Lastly, some preliminary performance data are presented, including use of the workflow to determine the domain architecture of all proteins in a complete genome (the honeybee). Extension to comprehensive analysis of SNPs in a genome is discussed.
The MotifNetwork workflow is or will soon be available online through the RENCI Science Gateway at http://www.tgbioportal.org/.
UR - http://www.scopus.com/inward/record.url?scp=47649107467&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=47649107467&partnerID=8YFLogxK
U2 - 10.1109/BIBE.2007.4375625
DO - 10.1109/BIBE.2007.4375625
M3 - Conference contribution
AN - SCOPUS:47649107467
SN - 1424415098
SN - 9781424415090
T3 - Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE
SP - 620
EP - 627
BT - Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE
Y2 - 14 October 2007 through 17 October 2007
ER -