TY - JOUR
T1 - The disambiguation of people names in biological collections
AU - Groom, Quentin
AU - Bräuchler, Christian
AU - Cubey, Robert W.N.
AU - Dillen, Mathias
AU - Huybrechts, Pieter
AU - Kearney, Nicole
AU - Klazenga, Niels
AU - Leachman, Siobhan
AU - Paul, Deborah L.
AU - Rogers, Heather
AU - Santos, Joaquim
AU - Shorthouse, David Peter
AU - Vaughan, Alison
AU - von Mering, Sabine
AU - Haston, Elspeth M.
N1 - Publisher Copyright:
© 2022, Biodiversity Data Journal. All Rights Reserved.
PY - 2022
Y1 - 2022
N2 - Scientific collections have been built by people. For hundreds of years, people have collected, studied, identified, preserved, documented and curated collection specimens. Understanding who those people are is of interest to historians, but much more can be made of these data by other stakeholders once they have been linked to the people’s identities and their biographies. Knowing who people are helps us attribute work correctly, validate data and understand the scientific contribution of people and institutions. We can evaluate the work they have done, the interests they have, the places they have worked and what they have created from the specimens they have collected. The problem is that all we know about most of the people associated with collections are their names written on specimens. Disambiguating these people is the challenge that this paper addresses. Disambiguation of people often proves difficult in isolation and can result in staff or researchers independently trying to determine the identity of specific individuals over and over again. By sharing biographical data and building an open, collectively maintained dataset with shared knowledge, expertise and resources, it is possible to collectively deduce the identities of individuals, aggregate biographical information for each person, reduce duplication of effort and share the information locally and globally. The authors of this paper aspire to disambiguate all person names efficiently and fully in all their variations across the entirety of the biological sciences, starting with collections. 
Towards that vision, this paper has three key aims: to improve the linking, validation, enhancement and valorisation of person-related information within and between collections, databases and publications; to suggest good practice for identifying people involved in biological collections; and to promote coordination amongst all stakeholders, including individuals, natural history collections, institutions, learned societies, government agencies and data aggregators.
KW - Attribution
KW - Authority file
KW - Biography
KW - Linked open data
KW - Wikidata
UR - http://www.scopus.com/inward/record.url?scp=85141308043&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85141308043&partnerID=8YFLogxK
U2 - 10.3897/BDJ.10.e86089
DO - 10.3897/BDJ.10.e86089
M3 - Article
C2 - 36761559
AN - SCOPUS:85141308043
SN - 1314-2828
VL - 10
JO - Biodiversity Data Journal
JF - Biodiversity Data Journal
M1 - e86089
ER -