Open Information Extraction with Meta-pattern Discovery in Biomedical Literature

Xuan Wang, Yu Zhang, Qi Li, Yinyin Chen, Jiawei Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Biomedical open information extraction (BioOpenIE) is a novel paradigm for automatically extracting structured information from unstructured text with little or no supervision. It does not require any pre-specified relation types but aims to extract all the relation tuples from the corpus. A major challenge for open information extraction (OpenIE) is that it produces a massive number of relation tuples expressed as raw surface names, which cannot be directly used for downstream applications. We propose a novel framework, CPIE (Clause+Pattern-guided Information Extraction), that incorporates clause extraction and meta-pattern discovery to extract structured relation tuples with little supervision. Compared with previous OpenIE methods, CPIE produces massive but more structured output that can be directly used for downstream applications. We first detect short clauses from input sentences. Then we extract quality textual patterns and perform synonymous pattern grouping to identify relation types. Finally, we obtain the corresponding relation tuples by matching each quality pattern in the text. Experiments show that CPIE achieves the highest precision in comparison with state-of-the-art OpenIE baselines, while preserving the distinctiveness and simplicity of the extracted relation tuples. CPIE shows great potential for effectively handling real-world biomedical literature with complicated sentence structures and rich information.
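The three steps named in the abstract (clause detection, pattern grouping, pattern matching) can be illustrated with a toy sketch. This is not the CPIE implementation: the entity dictionary, the two meta-patterns, the relation label `INHIBITS`, and the comma/semicolon clause splitter are all hypothetical stand-ins chosen for illustration, and the patterns are written by hand here rather than discovered from a corpus.

```python
import re

# Hypothetical entity dictionary (assumption: a real system would use
# a biomedical named-entity recognizer instead of a fixed lookup).
ENTITY_TYPES = {
    "aspirin": "CHEMICAL",
    "ibuprofen": "CHEMICAL",
    "COX-1": "GENE",
    "COX-2": "GENE",
}

# Hypothetical meta-patterns, already grouped by synonymy into one
# relation type. In the paper these are mined; here they are hand-written.
PATTERN_GROUPS = {
    "INHIBITS": [
        "$CHEMICAL inhibits $GENE",
        "$CHEMICAL is an inhibitor of $GENE",
    ],
}

_LOOKUP = {name.lower(): etype for name, etype in ENTITY_TYPES.items()}
_ENTITY_RE = re.compile(
    r"(?<!\w)("
    + "|".join(re.escape(n) for n in sorted(ENTITY_TYPES, key=len, reverse=True))
    + r")(?!\w)",
    re.IGNORECASE,
)


def split_clauses(sentence):
    """Naive clause detection: split on ';' and on ', and/while/whereas'."""
    parts = re.split(r";|,\s*(?:and|while|whereas)\s+", sentence.rstrip("."))
    return [p.strip() for p in parts if p.strip()]


def type_clause(clause):
    """Replace entity mentions with $TYPE placeholders.

    Returns the typed clause and the mentions in left-to-right order.
    """
    mentions = []

    def repl(match):
        mentions.append(match.group(1))
        return "$" + _LOOKUP[match.group(1).lower()]

    return _ENTITY_RE.sub(repl, clause), mentions


def extract_tuples(sentence):
    """Match each typed clause against the grouped meta-patterns."""
    tuples = []
    for clause in split_clauses(sentence):
        typed, mentions = type_clause(clause)
        for relation, patterns in PATTERN_GROUPS.items():
            # Exact match only; each pattern has exactly two placeholders.
            if typed in patterns and len(mentions) == 2:
                tuples.append((mentions[0], relation, mentions[1]))
    return tuples


sentence = "Aspirin inhibits COX-1, while ibuprofen is an inhibitor of COX-2."
for t in extract_tuples(sentence):
    print(t)
```

Because both surface patterns sit in the same group, the two clauses yield tuples with one shared relation label rather than two unrelated surface strings, which is the kind of structure the abstract contrasts with raw OpenIE output.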

Original language: English (US)
Title of host publication: ACM-BCB 2018 - Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
Publisher: Association for Computing Machinery
Pages: 291-300
Number of pages: 10
ISBN (Electronic): 9781450357944
State: Published - Aug 15 2018
Event: 9th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM-BCB 2018 - Washington, United States
Duration: Aug 29 2018 → Sep 1 2018

Publication series

Name: ACM-BCB 2018 - Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics

Other

Other: 9th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM-BCB 2018
Country/Territory: United States
City: Washington
Period: 8/29/18 → 9/1/18

Keywords

  • Biomedical information extraction
  • Open information extraction
  • Pattern mining
  • Text mining

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
  • Health Informatics
  • Biomedical Engineering
