TY - JOUR
T1 - Liberating host–virus knowledge from biological dark data
AU - Upham, Nathan S.
AU - Poelen, Jorrit H.
AU - Paul, Deborah
AU - Groom, Quentin J.
AU - Simmons, Nancy B.
AU - Vanhove, Maarten P.M.
AU - Bertolino, Sandro
AU - Reeder, Dee Ann M.
AU - Bastos-Silveira, Cristiane
AU - Sen, Atriya
AU - Sterner, Beckett
AU - Franz, Nico M.
AU - Guidoti, Marcus
AU - Penev, Lyubomir
AU - Agosti, Donat
N1 - Funding Information:
NSU is supported by the Biodiversity Knowledge Integration Center at Arizona State University and the US National Institutes of Health (grant number 1R01AI151144-01A1). DMR is supported by the US National Science Foundation (RAPID 2032774) and the US National Institutes of Health (grant number 1R01AI151144-01A1). NMF and BS are supported by the Arizona State University President's Special Initiative Funds. QJG is supported by the SYNTHESYS+ Research and Innovation action grant Horizon 2020-EU.1.4.1.2 823827. DA is supported by the Arcadia Fund. DP is supported by the US National Science Foundation Advancing Digitization of Biodiversity Collections Program grant number DBI-1547229. MPMV is supported by the Special Research Fund of Hasselt University (grant number BOF20TT06). JHP is supported by the US National Science Foundation award Collaborative Research: Digitization TCN: Digitizing collections to trace parasite-host associations and predict the spread of vector-borne disease (award numbers DBI:1901932 and DBI:1901926). NSU, DMR, BS, AS, JHP, and DA are supported by the US National Institutes of Health (1R21AI164268-01). We thank Ana Casino, Dimitris Koureas, and Wouter Addink for organising the Consortium of European Taxonomic Facilities-Distributed System of Scientific Collections COVID-19 Taskforce that resulted in this Viewpoint, Esther Florsheim for valuable conversations, and BioRender, Pipat Soisook, and Emily Damstra for access to images or illustrations.
Publisher Copyright:
© 2021 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license.
PY - 2021/10
Y1 - 2021/10
N2 - Connecting basic data about bats and other potential hosts of SARS-CoV-2 with their ecological context is crucial to the understanding of the emergence and spread of the virus. However, when lockdowns in many countries started in March, 2020, the world's bat experts were locked out of their research laboratories, which in turn impeded access to large volumes of offline ecological and taxonomic data. Pandemic lockdowns have brought to attention the long-standing problem of so-called biological dark data: data that are published, but disconnected from digital knowledge resources and thus unavailable for high-throughput analysis. Knowledge of host-to-virus ecological interactions will be biased until this challenge is addressed. In this Viewpoint, we outline two viable solutions: first, in the short term, to interconnect published data about host organisms, viruses, and other pathogens; and second, to shift the publishing framework beyond unstructured text (the so-called PDF prison) to labelled networks of digital knowledge. As the indexing system for biodiversity data, biological taxonomy is foundational to both solutions. Building digitally connected knowledge graphs of host–pathogen interactions will establish the agility needed to quickly identify reservoir hosts of novel zoonoses, allow for more robust predictions of emergence, and thereby strengthen human and planetary health systems.
UR - http://www.scopus.com/inward/record.url?scp=85118286048&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85118286048&partnerID=8YFLogxK
U2 - 10.1016/S2542-5196(21)00196-0
DO - 10.1016/S2542-5196(21)00196-0
M3 - Review article
C2 - 34562356
AN - SCOPUS:85118286048
SN - 2542-5196
VL - 5
SP - e746
EP - e750
JO - The Lancet Planetary Health
JF - The Lancet Planetary Health
IS - 10
ER -