Reliability of Methods for Extracting Collaboration Networks from Crisis-related Situational Reports and Tweets

Ly Dinh, Sumeet Kulkarni, Pingjing Yang, Jana Diesner

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Assessing the effectiveness of crisis response is key to improving preparedness and adapting policies. One method for response evaluation is reviewing actual response activities and interactions. Response reports are often available in the form of natural language text data. Analyzing a large number of such reports requires automated or semi-automated solutions. To improve the trustworthiness of methods for this purpose, we empirically validate the reliability of three relation extraction methods that we used to construct interorganizational collaboration networks by comparing them against human-annotated ground truth (crisis-specific situational reports and tweets). For entity extraction, we find that using a combination of two off-the-shelf methods (FlairNLP and SpaCy) is optimal for situational reports data and one method (SpaCy) for tweets data. For relation extraction, we find that a heuristics-based model that we built by leveraging word co-occurrence and deep and shallow syntax as features and training it on domain-specific text data outperforms two state-of-the-art relation extraction models (Stanford OpenIE and OneIE) that were pre-trained on general domain data. We also find that situational reports, on average, contain fewer entities and relations than tweets, but the extracted networks are more closely related to collaboration activities mentioned in the ground truth. As it is widely known that general domain tools might need adjustment to perform accurately in specific domains, we did not expect the tested off-the-shelf tools to perform highly accurately. Rather, our aim is to identify what accuracy one could reasonably expect when leveraging available resources as-is for domain-specific work (in this case, crisis informatics), what errors (in terms of false positives and false negatives) to expect, and how to account for them.
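
As an illustration of the kind of comparison the abstract describes, the sketch below scores an extracted collaboration network against human-annotated ground truth by counting true positives, false positives, and false negatives over undirected organization pairs. It is a minimal sketch under stated assumptions, not the authors' evaluation code: the function names (normalize_edge, evaluate_edges), the case-insensitive exact-match rule, and the organization names in the usage example are all hypothetical.

```python
def normalize_edge(a, b):
    """Represent a collaboration tie as an undirected, case-insensitive pair.

    Assumption: two mentions refer to the same organization only if their
    names match exactly after trimming and lowercasing.
    """
    return frozenset((a.strip().lower(), b.strip().lower()))


def evaluate_edges(extracted, ground_truth):
    """Compare extracted edges with annotated edges.

    Both arguments are iterables of (organization, organization) tuples.
    Returns counts of true/false positives and false negatives together
    with precision, recall, and F1.
    """
    pred = {normalize_edge(a, b) for a, b in extracted}
    gold = {normalize_edge(a, b) for a, b in ground_truth}

    tp = len(pred & gold)   # edges found by the method and by annotators
    fp = len(pred - gold)   # edges found by the method only
    fn = len(gold - pred)   # annotated edges the method missed

    precision = tp / (tp + fp) if pred else 0.0
    recall = tp / (tp + fn) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) > 0 else 0.0)

    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical annotated and extracted ties, for illustration only.
    gold_edges = [("Red Cross", "FEMA"),
                  ("FEMA", "National Guard"),
                  ("Red Cross", "Salvation Army")]
    extracted_edges = [("FEMA", "Red Cross"),
                       ("FEMA", "Coast Guard")]
    print(evaluate_edges(extracted_edges, gold_edges))
```

Treating edges as unordered pairs means that ("FEMA", "Red Cross") and ("Red Cross", "FEMA") count as the same tie. A directed or typed notion of collaboration would need a different edge representation.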

Original language: English (US)
Title of host publication: ISCRAM 2022 - Proceedings
Subtitle of host publication: Information Systems for Crisis Response and Management Asia Pacific Conference 2022
Editors: Thomas J. Huggins, Vincent Lemiale
Publisher: Information Systems for Crisis Response and Management, ISCRAM
Pages: 181-195
Number of pages: 15
ISBN (Electronic): 9780473668457
State: Published - 2022
Event: 2nd Information Systems for Crisis Response and Management Asia Pacific Conference, ISCRAM 2022 - Virtual, Online, Australia
Duration: Nov 7 2022 → Nov 9 2022

Publication series

Name: Proceedings of the International ISCRAM Conference
Volume: 2022-November
ISSN (Electronic): 2411-3387

Conference

Conference: 2nd Information Systems for Crisis Response and Management Asia Pacific Conference, ISCRAM 2022
Country/Territory: Australia
City: Virtual, Online
Period: 11/7/22 → 11/9/22

Keywords

  • Collaboration networks
  • interorganizational collaboration
  • natural language processing
  • situational awareness

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Information Systems and Management
  • Electrical and Electronic Engineering
