TY - GEN
T1 - Weakly-supervised relation extraction by pattern-enhanced embedding learning
AU - Qu, Meng
AU - Ren, Xiang
AU - Zhang, Yu
AU - Han, Jiawei
N1 - Publisher Copyright:
© 2018 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC BY 4.0 License.
PY - 2018/4/10
Y1 - 2018/4/10
N2 - Extracting relations from text corpora is an important task with wide applications. However, it becomes particularly challenging when focusing on weakly-supervised relation extraction, that is, utilizing a few relation instances (i.e., a pair of entities and their relation) as seeds to extract from corpora more instances of the same relation. Existing distributional approaches leverage the corpus-level co-occurrence statistics of entities to predict their relations, and require a large number of labeled instances to learn effective relation classifiers. Alternatively, pattern-based approaches perform boostrapping or apply neural networks to model the local contexts, but still rely on a large number of labeled instances to build reliable models. In this paper, we study the integration of distributional and pattern-based methods in a weakly-supervised setting such that the two kinds of methods can provide complementary supervision for each other to build an effective, unified model. We propose a novel co-training framework with a distributional module and a pattern module. During training, the distributional module helps the pattern module discriminate between the informative patterns and other patterns, and the pattern module generates some highly-confident instances to improve the distributional module. The whole framework can be effectively optimized by iterating between improving the pattern module and updating the distributional module. We conduct experiments on two tasks: knowledge base completion with text corpora and corpus-level relation extraction. Experimental results prove the effectiveness of our framework over many competitive baselines.
AB - Extracting relations from text corpora is an important task with wide applications. However, it becomes particularly challenging when focusing on weakly-supervised relation extraction, that is, utilizing a few relation instances (i.e., a pair of entities and their relation) as seeds to extract from corpora more instances of the same relation. Existing distributional approaches leverage the corpus-level co-occurrence statistics of entities to predict their relations, and require a large number of labeled instances to learn effective relation classifiers. Alternatively, pattern-based approaches perform boostrapping or apply neural networks to model the local contexts, but still rely on a large number of labeled instances to build reliable models. In this paper, we study the integration of distributional and pattern-based methods in a weakly-supervised setting such that the two kinds of methods can provide complementary supervision for each other to build an effective, unified model. We propose a novel co-training framework with a distributional module and a pattern module. During training, the distributional module helps the pattern module discriminate between the informative patterns and other patterns, and the pattern module generates some highly-confident instances to improve the distributional module. The whole framework can be effectively optimized by iterating between improving the pattern module and updating the distributional module. We conduct experiments on two tasks: knowledge base completion with text corpora and corpus-level relation extraction. Experimental results prove the effectiveness of our framework over many competitive baselines.
KW - Co-training
KW - Relation extraction
UR - http://www.scopus.com/inward/record.url?scp=85061362291&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85061362291&partnerID=8YFLogxK
U2 - 10.1145/3178876.3186024
DO - 10.1145/3178876.3186024
M3 - Conference contribution
AN - SCOPUS:85061362291
T3 - The Web Conference 2018 - Proceedings of the World Wide Web Conference, WWW 2018
SP - 1257
EP - 1266
BT - The Web Conference 2018 - Proceedings of the World Wide Web Conference, WWW 2018
PB - Association for Computing Machinery
T2 - 27th International World Wide Web, WWW 2018
Y2 - 23 April 2018 through 27 April 2018
ER -