Abstract
A new generation of data processing systems, including web search, Google’s Knowledge Graph, IBM’s Watson, and several different recommendation systems, combines rich databases with software driven by machine learning. The spectacular successes of these trained systems have been among the most notable in all of computing and have generated excitement in health care, finance, energy, and general business. But building them can be challenging, even for computer scientists with PhD-level training. If these systems are to have a truly broad impact, building them must become easier. We explore one crucial pain point in the construction of trained systems: feature engineering. Given the sheer size of modern datasets, feature developers must (1) write code with few effective clues about how their code will interact with the data and (2) repeatedly endure long system waits even though their code typically changes little from run to run. We propose Brainwash, a vision for a feature engineering data system that could dramatically ease the Explore-Extract-Evaluate interaction loop that characterizes many trained system projects.
Original language | English (US) |
---|---|
State | Published - 2013 |
Externally published | Yes |
Event | 6th Biennial Conference on Innovative Data Systems Research, CIDR 2013 - Pacific Grove, United States; Duration: Jan 6 2013 → Jan 9 2013 |
Conference
Conference | 6th Biennial Conference on Innovative Data Systems Research, CIDR 2013 |
---|---|
Country/Territory | United States |
City | Pacific Grove |
Period | 1/6/13 → 1/9/13 |
ASJC Scopus subject areas
- Hardware and Architecture
- Information Systems and Management
- Artificial Intelligence
- Information Systems