Domain-independent novel event discovery and semi-automatic event annotation

Hao Li, Xiang Li, Heng Ji, Yuval Marton

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Information Extraction (IE) is becoming increasingly useful, but discovering and annotating novel events, event arguments, and event types is costly. We exploit both monolingual texts and bilingual sentence-aligned parallel texts to cluster event triggers and discover novel event types. We then generate event argument annotations semi-automatically, framing the task as sentence ranking and semantic role labeling. Experiments on three corpora -- ACE, OntoNotes, and a collection of scientific literature -- demonstrate that our domain-independent methods can significantly speed up the entire event discovery and annotation process while maintaining high quality.

Original language: English (US)
Title of host publication: PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation
Pages: 233-242
Number of pages: 10
State: Published - Dec 1 2010
Externally published: Yes
Event: 24th Pacific Asia Conference on Language, Information and Computation, PACLIC 24 - Sendai, Japan
Duration: Nov 4 2010 to Nov 7 2010

Publication series

Name: PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation

Other

Other: 24th Pacific Asia Conference on Language, Information and Computation, PACLIC 24
Country: Japan
City: Sendai
Period: 11/4/10 to 11/7/10

Keywords

  • Domain-independent
  • Information extraction
  • Novel event discovery
  • Semantic role labeling

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science (miscellaneous)

Cite this

Li, H., Li, X., Ji, H., & Marton, Y. (2010). Domain-independent novel event discovery and semi-automatic event annotation. In PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation (pp. 233-242). (PACLIC 24 - Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation).