TY - GEN
T1 - Corpus-based Open-Domain Event Type Induction
AU - Shen, Jiaming
AU - Zhang, Yunyi
AU - Ji, Heng
AU - Han, Jiawei
N1 - Funding Information:
Research was supported in part by US DARPA KAIROS Program No. FA8750-19-2-1004, SocialSim Program No. W911NF-17-C-0099, and INCAS Program No. HR001121C0165, NSF IIS-19-56151, IIS-17-41317, and IIS 17-04532, and the Molecule Maker Lab Institute: An AI Research Institutes program supported by NSF under Award No. 2019897. Any opinions, findings, and conclusions or recommendations expressed herein are those of the authors and do not necessarily represent the views, either expressed or implied, of DARPA or the U.S. Government. We want to thank Martha Palmer and Ghazaleh Kazeminejad for the help on VerbNet and OntoNotes Sense Groupings. We also would like to thank Sha Li, Yu Meng, Lifu Huang for insightful discussions and anonymous reviewers for valuable feedback.
Publisher Copyright:
© 2021 Association for Computational Linguistics
PY - 2021
Y1 - 2021
N2 - Traditional event extraction methods require predefined event types and their corresponding annotations to learn event extractors. These prerequisites are often hard to be satisfied in real-world applications. This work presents a corpus-based open-domain event type induction method that automatically discovers a set of event types from a given corpus. As events of the same type could be expressed in multiple ways, we propose to represent each event type as a cluster of ⟨predicate sense, object head⟩ pairs. Specifically, our method (1) selects salient predicates and object heads, (2) disambiguates predicate senses using only a verb sense dictionary, and (3) obtains event types by jointly embedding and clustering ⟨predicate sense, object head⟩ pairs in a latent spherical space. Our experiments, on three datasets from different domains, show our method can discover salient and high-quality event types, according to both automatic and human evaluations.
AB - Traditional event extraction methods require predefined event types and their corresponding annotations to learn event extractors. These prerequisites are often hard to be satisfied in real-world applications. This work presents a corpus-based open-domain event type induction method that automatically discovers a set of event types from a given corpus. As events of the same type could be expressed in multiple ways, we propose to represent each event type as a cluster of ⟨predicate sense, object head⟩ pairs. Specifically, our method (1) selects salient predicates and object heads, (2) disambiguates predicate senses using only a verb sense dictionary, and (3) obtains event types by jointly embedding and clustering ⟨predicate sense, object head⟩ pairs in a latent spherical space. Our experiments, on three datasets from different domains, show our method can discover salient and high-quality event types, according to both automatic and human evaluations.
UR - http://www.scopus.com/inward/record.url?scp=85122719521&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85122719521&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85122719521
T3 - EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings
SP - 5427
EP - 5440
BT - EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings
PB - Association for Computational Linguistics (ACL)
T2 - 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021
Y2 - 7 November 2021 through 11 November 2021
ER -