Toward an intensional approach to transformation classification

Allen H. Renear, Xinrui Wang

Research output: Contribution to journal › Article › peer-review

Abstract

Generating one dataset from another is a fundamental activity in data science: data curators convert datasets to different file formats, create data subsets, generate metadata, integrate data from multiple sources, and so on; data analysts generate summaries and classifications, create visualizations, and derive data about one sort of thing from data about another sort of thing. Sometimes these transformations occur in independent single episodes, sometimes as part of an extended structured process or scientific workflow. Although such transformations have been studied from a variety of perspectives, there has been little effort to develop a general classification based on intrinsic (rather than functional) characteristics, apart from computational complexity. With this paper we hope to motivate a classification of transformations based on the relationships between the intensional features of the input and output datasets, that is, their propositional and conceptual content. Intensional entities are the fundamental components of scientific reasoning and explanation and consequently deserve a uniquely central role in the analysis of information work. We believe such a classification would be a valuable contribution to the data curation curriculum. This paper is an introduction to that project.

Original language: English (US)
Pages (from-to): 414-419
Number of pages: 6
Journal: Proceedings of the Association for Information Science and Technology
Volume: 55
Issue number: 1
State: Published - Jan 2018

Keywords

  • Data science
  • conceptual foundations
  • data analytics
  • data curation

ASJC Scopus subject areas

  • Computer Science(all)
  • Library and Information Sciences
