Open Information Extraction with Global Structure Constraints

Qi Zhu, Xiang Ren, Jingbo Shang, Yu Zhang, Frank F. Xu, Jiawei Han

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


Extracting entities and their relations from text is an important task for understanding massive text corpora. Open information extraction (IE) systems mine relation tuples (i.e., entity arguments and a predicate string that describes their relation) from sentences. However, current open IE systems ignore the fact that global statistics in a large corpus can be collectively leveraged to identify high-quality sentence-level extractions. In this paper, we propose a novel open IE system, called ReMine, which integrates local context signals and global structural signals in a unified framework with distant supervision. The new system can be applied efficiently to different domains because it uses facts from external knowledge bases as supervision, and it can effectively score sentence-level tuple extractions based on corpus-level statistics. Specifically, we design a joint optimization problem that unifies (1) segmenting entity and relation phrases in individual sentences based on local context and (2) measuring the quality of sentence-level extractions with a translating-based objective. Experiments on real-world corpora from different domains demonstrate the effectiveness and robustness of ReMine compared with other open IE systems.
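
The abstract does not spell out the translating-based objective. As a rough illustration only, and assuming it resembles translation-style embedding scoring (as in TransE), where a predicate vector should carry the head entity vector close to the tail entity vector, the plausibility of an extracted tuple could be scored as sketched below. The function names, the toy vectors, and the margin loss are hypothetical and are not taken from ReMine.

```python
# Hypothetical sketch of a translation-style tuple score; not ReMine's actual code.

def translating_score(head, predicate, tail):
    """L1 distance |head + predicate - tail|; lower means more plausible."""
    return sum(abs(h + p - t) for h, p, t in zip(head, predicate, tail))


def margin_loss(positive_score, negative_score, margin=1.0):
    """Hinge loss that pushes a plausible tuple below a corrupted one by `margin`."""
    return max(0.0, margin + positive_score - negative_score)


# Toy 2-dimensional embeddings (assumed values, for illustration only).
head = [1.0, 0.0]
predicate = [0.5, 1.0]
good_tail = [1.5, 1.0]   # head + predicate lands exactly here
bad_tail = [0.0, 0.0]    # a corrupted tail argument

good = translating_score(head, predicate, good_tail)
bad = translating_score(head, predicate, bad_tail)
loss = margin_loss(good, bad)
```

In a setup like this, embeddings learned over the whole corpus would give consistent tuples low scores, which is one way corpus-level statistics could inform the ranking of sentence-level extractions.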

Original language: English (US)
Title of host publication: The Web Conference 2018 - Companion of the World Wide Web Conference, WWW 2018
Publisher: Association for Computing Machinery
Number of pages: 2
ISBN (Electronic): 9781450356404
State: Published - Apr 23 2018
Event: 27th International World Wide Web, WWW 2018 - Lyon, France
Duration: Apr 23 2018 to Apr 27 2018

Publication series

Name: The Web Conference 2018 - Companion of the World Wide Web Conference, WWW 2018

Conference: 27th International World Wide Web, WWW 2018


Keywords

  • open information extraction
  • weakly-supervised learning

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Software

