TY - GEN
T1 - Reliability of Methods for Extracting Collaboration Networks from Crisis-related Situational Reports and Tweets
AU - Dinh, Ly
AU - Kulkarni, Sumeet
AU - Yang, Pingjing
AU - Diesner, Jana
N1 - Publisher Copyright: © 2022 Information Systems for Crisis Response and Management, ISCRAM. All rights reserved.
PY - 2022
Y1 - 2022
N2 - Assessing the effectiveness of crisis response is key to improving preparedness and adapting policies. One method for response evaluation is reviewing actual response activities and interactions. Response reports are often available in the form of natural language text data. Analyzing a large number of such reports requires automated or semi-automated solutions. To improve the trustworthiness of methods for this purpose, we empirically validate the reliability of three relation extraction methods that we used to construct interorganizational collaboration networks by comparing them against human-annotated ground truth (crisis-specific situational reports and tweets). For entity extraction, we find that using a combination of two off-the-shelf methods (FlairNLP and SpaCy) is optimal for situational reports data and one method (SpaCy) for tweets data. For relation extraction, we find that a heuristics-based model that we built by leveraging word co-occurrence and deep and shallow syntax as features and training it on domain-specific text data outperforms two state-of-the-art relation extraction models (Stanford OpenIE and OneIE) that were pre-trained on general-domain data. We also find that situational reports, on average, contain fewer entities and relations than tweets, but the extracted networks are more closely related to collaboration activities mentioned in the ground truth. As it is widely known that general-domain tools might need adjustment to perform accurately in specific domains, we did not expect the tested off-the-shelf tools to perform highly accurately. Rather, our point is to identify what accuracy one could reasonably expect when leveraging available resources as-is for domain-specific work (in this case, crisis informatics), what errors (in terms of false positives and false negatives) to expect, and how to account for them.
AB - Assessing the effectiveness of crisis response is key to improving preparedness and adapting policies. One method for response evaluation is reviewing actual response activities and interactions. Response reports are often available in the form of natural language text data. Analyzing a large number of such reports requires automated or semi-automated solutions. To improve the trustworthiness of methods for this purpose, we empirically validate the reliability of three relation extraction methods that we used to construct interorganizational collaboration networks by comparing them against human-annotated ground truth (crisis-specific situational reports and tweets). For entity extraction, we find that using a combination of two off-the-shelf methods (FlairNLP and SpaCy) is optimal for situational reports data and one method (SpaCy) for tweets data. For relation extraction, we find that a heuristics-based model that we built by leveraging word co-occurrence and deep and shallow syntax as features and training it on domain-specific text data outperforms two state-of-the-art relation extraction models (Stanford OpenIE and OneIE) that were pre-trained on general-domain data. We also find that situational reports, on average, contain fewer entities and relations than tweets, but the extracted networks are more closely related to collaboration activities mentioned in the ground truth. As it is widely known that general-domain tools might need adjustment to perform accurately in specific domains, we did not expect the tested off-the-shelf tools to perform highly accurately. Rather, our point is to identify what accuracy one could reasonably expect when leveraging available resources as-is for domain-specific work (in this case, crisis informatics), what errors (in terms of false positives and false negatives) to expect, and how to account for them.
KW - Collaboration networks
KW - interorganizational collaboration
KW - natural language processing
KW - situational awareness
UR - http://www.scopus.com/inward/record.url?scp=85165942707&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85165942707&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85165942707
T3 - Proceedings of the International ISCRAM Conference
SP - 181
EP - 195
BT - ISCRAM 2022 - Proceedings
A2 - Huggins, Thomas J.
A2 - Lemiale, Vincent
PB - Information Systems for Crisis Response and Management, ISCRAM
T2 - 2nd Information Systems for Crisis Response and Management Asia Pacific Conference, ISCRAM 2022
Y2 - 7 November 2022 through 9 November 2022
ER -