Background: Scientific literature is a source of the most reliable and comprehensive knowledge about molecular interaction networks. Formalization of this knowledge is necessary for computational analysis and is achieved by automatic fact extraction using various text-mining algorithms. Most of these techniques suffer from high false positive rates and redundancy of the extracted information. The extracted facts form a large network with no pathways defined. Results: We describe the methodology for automatic curation of Biological Association Networks (BANs) derived by a natural language processing technology called Medscan. The curated data is used for automatic pathway reconstruction. The algorithm for the reconstruction of signaling pathways is also described and validated by comparison with manually curated pathways and tissue- specific gene expression profiles. Conclusion: Biological Association Networks extracted by MedScan technology contain sufficient information for constructing thousands of mammalian signaling pathways for multiple tissues. The automatically curated MedScan data is adequate for automatic generation of good quality signaling networks. The automatically generated Regulome pathways and manually curated pathways used for their validation are available free in the ResNetCore database from Ariadne Genomics, Inc. [I]. The pathways can be viewed and analyzed through the use of a free demo version of PathwayStudio software. The Medscan technology is also available for evaluation using the free demo version of PathwayStudio software.
ASJC Scopus subject areas
- Structural Biology
- Molecular Biology
- Computer Science Applications
- Applied Mathematics