TY - GEN
T1 - Finding Keystone Citations for Constructing Validity Chains among Research Papers
AU - Fu, Yuanxi
AU - Schneider, Jodi
AU - Blake, Catherine
N1 - Publisher Copyright: © 2021 ACM.
PY - 2021/4/19
Y1 - 2021/4/19
N2 - New discoveries in science are often built upon previous knowledge. Ideally, such dependency information should be made explicit in a scientific knowledge graph. The Keystone Framework was proposed for tracking validity dependencies among papers. A keystone citation indicates that the validity of a given paper depends on a previously published paper it cites. In this paper, we propose and evaluate a strategy that repurposes rhetorical category classifiers for the novel application of extracting keystone citations that relate to research methods. Five binary rhetorical category classifiers were constructed to identify Background, Objective, Methods, Results, and Conclusions sentences in biomedical papers. The resulting classifiers were used to test the strategy against two datasets. The initial strategy assumed that only citations contained in Methods sentences were methods keystone citations, but our analysis revealed that citations contained in sentences classified as either Methods or Results were highly likely to be methods keystone citations. Future work will focus on fine-tuning the rhetorical category classifiers, experimenting with multiclass classifiers, evaluating the revised strategy with more data, and constructing a larger gold-standard dataset of citation context sentences for model training.
AB - New discoveries in science are often built upon previous knowledge. Ideally, such dependency information should be made explicit in a scientific knowledge graph. The Keystone Framework was proposed for tracking validity dependencies among papers. A keystone citation indicates that the validity of a given paper depends on a previously published paper it cites. In this paper, we propose and evaluate a strategy that repurposes rhetorical category classifiers for the novel application of extracting keystone citations that relate to research methods. Five binary rhetorical category classifiers were constructed to identify Background, Objective, Methods, Results, and Conclusions sentences in biomedical papers. The resulting classifiers were used to test the strategy against two datasets. The initial strategy assumed that only citations contained in Methods sentences were methods keystone citations, but our analysis revealed that citations contained in sentences classified as either Methods or Results were highly likely to be methods keystone citations. Future work will focus on fine-tuning the rhetorical category classifiers, experimenting with multiclass classifiers, evaluating the revised strategy with more data, and constructing a larger gold-standard dataset of citation context sentences for model training.
KW - Knowledge dependency
KW - argumentation
KW - citation context classification
KW - methods
KW - validity
UR - http://www.scopus.com/inward/record.url?scp=85107684051&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85107684051&partnerID=8YFLogxK
U2 - 10.1145/3442442.3451368
DO - 10.1145/3442442.3451368
M3 - Conference contribution
AN - SCOPUS:85107684051
T3 - The Web Conference 2021 - Companion of the World Wide Web Conference, WWW 2021
SP - 451
EP - 455
BT - The Web Conference 2021 - Companion of the World Wide Web Conference, WWW 2021
PB - Association for Computing Machinery
T2 - 30th World Wide Web Conference, WWW 2021
Y2 - 19 April 2021 through 23 April 2021
ER -