RESIN: A Dockerized Schema-Guided Cross-document Cross-lingual Cross-media Information Extraction and Event Tracking System

Haoyang Wen, Ying Lin, Tuan Manh Lai, Xiaoman Pan, Sha Li, Xudong Lin, Ben Zhou, Manling Li, Haoyu Wang, Hongming Zhang, Xiaodong Yu, Alexander Dong, Zhenhailong Wang, Yi Ren Fung, Piyush Mishra, Qing Lyu, Dídac Surís, Brian Chen, Susan Windisch Brown, Martha PalmerChris Callison-Burch, Carl Vondrick, Jiawei Han, Dan Roth, Shih Fu Chang, Heng Ji

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

We present a new information extraction system that can automatically construct temporal event graphs from a collection of news documents from multiple sources, multiple languages (English and Spanish for our experiment), and multiple data modalities (speech, text, image and video). The system advances state-of-the-art from two aspects: (1) extending from sentence-level event extraction to cross-document cross-lingual cross-media event extraction, coreference resolution and temporal event tracking; (2) using human curated event schema library to match and enhance the extraction output. We have made the dockerlized system publicly available for research purpose at GitHub, with a demo video.

Original languageEnglish (US)
Title of host publicationNAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics
Subtitle of host publicationHuman Language Technologies, Demonstrations
PublisherAssociation for Computational Linguistics (ACL)
Pages133-143
Number of pages11
ISBN (Electronic)9781954085480
StatePublished - 2021
Event2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021 - Virtual, Online
Duration: Jun 6 2021Jun 11 2021

Publication series

NameNAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Demonstrations

Conference

Conference2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021
CityVirtual, Online
Period6/6/216/11/21

ASJC Scopus subject areas

  • Information Systems
  • Software
  • Computer Networks and Communications
  • Hardware and Architecture

Fingerprint

Dive into the research topics of 'RESIN: A Dockerized Schema-Guided Cross-document Cross-lingual Cross-media Information Extraction and Event Tracking System'. Together they form a unique fingerprint.

Cite this