TY - JOUR
T1 - Toward an intensional approach to transformation classification
AU - Renear, Allen H.
AU - Wang, Xinrui
N1 - Publisher Copyright: Copyright © 2018 by Association for Information Science and Technology
PY - 2018/1
Y1 - 2018/1
AB - Generating one dataset from another is a fundamental activity in data science: data curators convert datasets to different file formats, create data subsets, generate metadata, integrate data from multiple sources, and so on; data analysts generate summaries and classifications, create visualizations, and derive data about one sort of thing from data about another sort of thing. Sometimes these transformations occur in independent single episodes, sometimes as part of an extended structured process or scientific workflow. Although such transformations have been studied from a variety of perspectives, there has been little effort to develop a general classification based on intrinsic (rather than functional) characteristics, apart from computational complexity. With this paper we hope to motivate a classification of transformations based on the relationships between the intensional features of the input and output datasets, that is, their propositional and conceptual content. Intensional entities are the fundamental components of scientific reasoning and explanation and consequently deserve a uniquely central role in the analysis of information work. We believe such a classification would be a valuable contribution to the data curation curriculum. This paper is an introduction to that project.
KW - Data science
KW - conceptual foundations
KW - data analytics
KW - data curation
UR - http://www.scopus.com/inward/record.url?scp=85064480022&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85064480022&partnerID=8YFLogxK
U2 - 10.1002/pra2.2018.14505501045
DO - 10.1002/pra2.2018.14505501045
M3 - Article
AN - SCOPUS:85064480022
SN - 2373-9231
VL - 55
SP - 414
EP - 419
JO - Proceedings of the Association for Information Science and Technology
JF - Proceedings of the Association for Information Science and Technology
IS - 1
ER -