Abstract
The "big data" movement promises to deliver better decisions in all aspects of our lives, from business to science, health, and government, by using computational techniques to identify patterns in large historical collections of data. Although a unified view from curation to analysis has been proposed, current research appears to have polarized into two separate groups: those curating large datasets and those developing computational methods to identify patterns in large datasets. The case study presented here demonstrates the enormous impact that parameter tuning can have on the resulting accuracy, precision, and recall of a computational model that is generated from data. It also illustrates the vastness of the parameter space that must be searched in order to produce optimal models and curated in order to avoid redundant experiments. This highlights the need for research that focuses on the gap between collection and analytics if we are to realize the potential of big data.
Original language | English (US) |
---|---|
Journal | Proceedings of the ASIST Annual Meeting |
Volume | 51 |
Issue number | 1 |
State | Published - 2014 |
Keywords
- Big data
- Data analytics
- Data curation
- Parameter space
ASJC Scopus subject areas
- Information Systems
- Library and Information Sciences