Abstract
The Synthetic Biology Knowledge System (SBKS) is an instance of the SynBioHub repository that contains text and data mined from papers published in ACS Synthetic Biology. This paper describes the SBKS curation framework being developed to construct the knowledge stored in this repository. The text mining pipeline automatically annotates the articles using natural language processing techniques to identify salient content such as key terms, relationships between terms, and main topics. The data mining pipeline automatically annotates the sequences extracted from the supplemental documents with the genetic parts used in them. Together, these two pipelines link genetic parts to the papers that describe the context in which they are used. Ultimately, SBKS will reduce the time synthetic biologists need to find the information required to complete their designs.
Original language | English (US) |
---|---|
Pages (from-to) | 2276-2285 |
Number of pages | 10 |
Journal | ACS Synthetic Biology |
Volume | 10 |
Issue number | 9 |
State | Accepted/In press - 2021 |
Keywords
- data mining
- SBOL
- sequence annotation
- SynBioHub
- text mining
- topic modeling
ASJC Scopus subject areas
- Biomedical Engineering
- Biochemistry, Genetics and Molecular Biology (miscellaneous)