@article{d89e6d2399764bd2af50a0e924213afb,
title = "Who bears the burden of long-lived molecular biology databases?",
abstract = "In the early 1990s the life sciences quickly adopted online databases to facilitate widespread dissemination and use of scientific data. Starting in 1991, the journal Nucleic Acids Research published an annual Database Issue dedicated to articles describing molecular biology databases. Analysis of these articles reveals a set of long-lived databases which have remained available for more than 15 years. Given the pervasive challenge of sustaining community resources, these databases provide an opportunity to examine what factors contribute to persistence by addressing two questions: 1) which organizations fund these long-lived databases? and 2) which organizations maintain these long-lived databases? Funding and operating organizations for 67 databases were determined through review of Database Issue articles. The results reveal a diverse set of contributing organizations with financial and operational support spread across six categories: academic, consortium/collective, government, industry, philanthropic, and society/association. The majority of databases reported support from more than one funding organization, of which government organizations were the most common source of funds. Operational responsibilities were more distributed, with academic organizations serving as the most common hosts. Although there is evidence of diversification overall, the most acknowledged funding and operating organizations contribute to disproportionately large percentages of the long-lived databases investigated here.",
keywords = "Data sharing, Molecular biology, Online databases, Research infrastructure, Sustainability",
author = "Imker, {Heidi J.}",
note = "Funding Information: To examine the diversity of funding sources, the 89 individual funding organizations were mapped to categories. The results revealed acknowledgement of 29 government, 19 industry, 13 academic, 12 philanthropic, 9 society/association, and 7 consortium/collective organizations. The funding codes were aggregated for each database, and since 28 unique combinations resulted (supplemental Table 2), codes were condensed where multiple funders from the same category were reported. For example, the AAIndex: Amino Acid Index Database included funding from the Ministry of Education Culture Sports Science and Technology of Japan (G), Japan Science and Technology Agency (G), Kyoto University (A), and University of Tokyo (A), which was condensed to AG, representing the two different categories of funders reported. Analysis revealed that the majority of the databases reported a single funding type (n = 38; Figure 1, first panel). Dependence solely on government funding organizations was the most common category (n = 34), making up 54.0\% of the 63 databases with funding reported. Only 4 databases (6.3\%) did not attribute any funding to any government organizations. The other types of organizations tended to offer support in combination with other sources. The 13 unique academic organizations were attributed in 15 databases (23.8\%) yet only a single database relied solely on academic organizations. Similarly, the 12 unique philanthropies were acknowledged for 12 (19.0\%) of the databases, only 2 of which reported sole funding by philanthropies. Organizations from industry were the second most numerous category with 19 unique organizations reported, yet 11 of the 19 contributed to a single database, EMBL-EBI{\textquoteright}s IMGT/HLA. The remaining 8 industry organizations were mentioned for 7 other databases.
Finally, the 7 unique consortium/collective organizations supported 6 (9.5\%) databases, and the 9 unique society/association organizations were reported as funding 4 (6.3\%) of the databases. Funding Information: The author is grateful to Miho Funamori for advice on classification of Japanese organizations, and Ashley Hetrick and Chuck Cook for thoughtful review of the manuscript draft. The author is also grateful to Robert Olendorf and Hoa Luong from the Data Curation Network (http://datacurationnetwork.org) for careful curation of the dataset as well as independent verification that the code executed as expected. Funding Information: update article may have been published prior to widespread indexing of funder metadata circa 2009, the full text was manually reviewed to capture funding organizations. Following the heuristics of Grassano et al. (2016), organizations had to be explicitly named within the article to be included and were not inferred from other context. Organizations were only counted once per database, regardless of whether they were mentioned multiple times within statements (e.g., for distinct grants). While most funding organizations could be interpreted as providing direct financial support to the databases in the form of research grants, other indirect financial support was reported and is included here. This includes support provided through various types of fellowships or computational resources supplied through academic research centers. Exclusions included gratitude to specific people for support that can ostensibly be considered “moral.” Both the text within the NAR article and information found on the database websites were used to determine the organization that currently takes primary responsibility for hosting and operating the database. Specifically, author affiliations, URL domain names, website branding, and “about” pages were reviewed to identify operational homes.
Both the country of the operating organization (as ISO 3166-1 alpha-3) and the organization name were recorded. In most cases, this could be resolved to a single organization in a single country. Exceptions include two databases run by consortia with an international composition, and here a code of “INT” was used. Publisher Copyright: {\textcopyright} 2020 The Author(s).",
year = "2020",
month = mar,
day = "4",
doi = "10.5334/dsj-2020-008",
language = "English (US)",
volume = "19",
journal = "Data Science Journal",
issn = "1683-1470",
publisher = "Committee on Data for Science and Technology",
number = "1",
}