Identifying content and levels of representation in scientific data

Research output: Contribution to journal > Article > peer-review

Abstract

Heterogeneous digital data that has been produced by different communities with varying practices and assumptions, and that is organized according to different representation schemes, encodings, and file formats, presents substantial obstacles to efficient integration, analysis, and preservation. This is a particular impediment to data reuse and interdisciplinary science. An underlying problem is that we have no shared formal conceptual model of information representation that is both accurate and sufficiently detailed to accommodate the management and analysis of real-world digital data in varying formats. Developing such a model involves confronting extremely challenging foundational problems in information science. We present two complementary conceptual models for data representation, the Basic Representation Model and the Systematic Assertion Model. We show how these models work together to provide an analytical account of digitally encoded scientific data. These models will provide a better foundation for understanding and supporting a wide range of data curation activities, including format migration, data integration, data reuse, digital preservation strategies, and assessment of identity and scientific equivalence.

Original language: English (US)
Pages (from-to): 1-10
Number of pages: 10
Journal: Proceedings of the ASIST Annual Meeting
Volume: 49
Issue number: 1
State: Published - 2012

Keywords

  • Conceptual modeling
  • Data curation
  • Identity
  • Information organization
  • Representation
  • Scientific equivalence

ASJC Scopus subject areas

  • Information Systems
  • Library and Information Sciences
