TY - GEN
T1 - Open Information Extraction with Global Structure Constraints
AU - Zhu, Qi
AU - Ren, Xiang
AU - Shang, Jingbo
AU - Zhang, Yu
AU - Xu, Frank F.
AU - Han, Jiawei
N1 - Publisher Copyright:
© 2018 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC BY 4.0 License.
PY - 2018/4/23
Y1 - 2018/4/23
N2 - Extracting entities and their relations from text is an important task for understanding massive text corpora. Open information extraction (IE) systems mine relation tuples (i.e., entity arguments and a predicate string to describe their relation) from sentences. However, current open IE systems ignore the fact that global statistics in a large corpus can be collectively leveraged to identify high-quality sentence-level extractions. In this paper, we propose a novel open IE system, called ReMine, which integrates local context signal and global structural signal in a unified framework with distant supervision. The new system can be efficiently applied to different domains as it uses facts from external knowledge bases as supervision; and can effectively score sentence-level tuple extractions based on corpus-level statistics. Specifically, we design a joint optimization problem to unify (1) segmenting entity/relation phrases in individual sentences based on local context; and (2) measuring the quality of sentence-level extractions with a translating-based objective. Experiments on real-world corpora from different domains demonstrate the effectiveness and robustness of ReMine when compared to other open IE systems.
KW - open information extraction
KW - weakly-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85056097680&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85056097680&partnerID=8YFLogxK
U2 - 10.1145/3184558.3186927
DO - 10.1145/3184558.3186927
M3 - Conference contribution
AN - SCOPUS:85056097680
T3 - The Web Conference 2018 - Companion of the World Wide Web Conference, WWW 2018
SP - 57
EP - 58
BT - The Web Conference 2018 - Companion of the World Wide Web Conference, WWW 2018
PB - Association for Computing Machinery
T2 - 27th International World Wide Web Conference, WWW 2018
Y2 - 23 April 2018 through 27 April 2018
ER -