TY - GEN
T1 - Towards a holistic integration of spreadsheets with databases
T2 - 34th IEEE International Conference on Data Engineering, ICDE 2018
AU - Bendre, Mangesh
AU - Venkataraman, Vipul
AU - Zhou, Xinyan
AU - Chang, Kevin
AU - Parameswaran, Aditya
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/10/24
Y1 - 2018/10/24
N2 - Spreadsheet software is the tool of choice for interactive ad-hoc data management, with adoption by billions of users. However, spreadsheets are not scalable, unlike database systems. On the other hand, database systems, while highly scalable, do not support interactivity as a first-class primitive. We are developing DataSpread, to holistically integrate spreadsheets as a front-end interface with databases as a back-end datastore, providing scalability to spreadsheets, and interactivity to databases, an integration we term presentational data management (PDM). In this paper, we make the first step towards this vision for relational databases: developing a storage engine for PDM, studying how to flexibly represent spreadsheet data within a relational database and how to support and maintain access by position. We first conduct an extensive survey of spreadsheet use to motivate our functional requirements for a storage engine for PDM. We develop a natural set of mechanisms for flexibly representing spreadsheet data and demonstrate that identifying the optimal representation is NP-Hard; however, we develop an efficient approach to identify the optimal representation from an important and intuitive subclass of representations. We extend our mechanisms with positional access mechanisms that don't suffer from cascading update issues, leading to constant time access and modification performance. We evaluate these representations on a workload of typical spreadsheets and spreadsheet operations, providing up to 50% reduction in storage, and up to 50% reduction in formula evaluation time.
AB - Spreadsheet software is the tool of choice for interactive ad-hoc data management, with adoption by billions of users. However, spreadsheets are not scalable, unlike database systems. On the other hand, database systems, while highly scalable, do not support interactivity as a first-class primitive. We are developing DataSpread, to holistically integrate spreadsheets as a front-end interface with databases as a back-end datastore, providing scalability to spreadsheets, and interactivity to databases, an integration we term presentational data management (PDM). In this paper, we make the first step towards this vision for relational databases: developing a storage engine for PDM, studying how to flexibly represent spreadsheet data within a relational database and how to support and maintain access by position. We first conduct an extensive survey of spreadsheet use to motivate our functional requirements for a storage engine for PDM. We develop a natural set of mechanisms for flexibly representing spreadsheet data and demonstrate that identifying the optimal representation is NP-Hard; however, we develop an efficient approach to identify the optimal representation from an important and intuitive subclass of representations. We extend our mechanisms with positional access mechanisms that don't suffer from cascading update issues, leading to constant time access and modification performance. We evaluate these representations on a workload of typical spreadsheets and spreadsheet operations, providing up to 50% reduction in storage, and up to 50% reduction in formula evaluation time.
KW - DATASPREAD
KW - PDM
KW - Spreadsheet
KW - database
KW - storage
UR - http://www.scopus.com/inward/record.url?scp=85057140122&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85057140122&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2018.00020
DO - 10.1109/ICDE.2018.00020
M3 - Conference contribution
AN - SCOPUS:85057140122
T3 - Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018
SP - 113
EP - 124
BT - Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 16 April 2018 through 19 April 2018
ER -