TY - GEN
T1 - Benchmarking Spreadsheet Systems
AU - Rahman, Sajjadur
AU - Mack, Kelly
AU - Bendre, Mangesh
AU - Zhang, Ruilin
AU - Karahalios, Karrie
AU - Parameswaran, Aditya G
N1 - Funding Information:
We thank the anonymous reviewers for their valuable feedback. We also thank Richard Lin for help in re-running experiments for Google Sheets. We acknowledge support from grants IIS-1652750 and IIS-1733878 awarded by the National Science Foundation, grant W911NF-18-1-0335 awarded by the Army, and funds from Adobe, Capital One, Facebook, Google, Siebel Energy Institute, and the Toyota Research Institute. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies and organizations.
Publisher Copyright:
© 2020 Association for Computing Machinery.
PY - 2020/6/14
Y1 - 2020/6/14
N2 - Spreadsheet systems are used for storing and analyzing data across domains by programmers and non-programmers alike. While spreadsheet systems have continued to support increasingly large datasets, they are prone to hanging and freezing while performing computations even on much smaller ones. We present a benchmarking study that evaluates and compares the performance of three popular systems, Microsoft Excel, LibreOffice Calc, and Google Sheets, on a range of canonical spreadsheet computation operations. We find that spreadsheet systems lack interactivity for several operations, on datasets well below their advertised scalability limits. We further evaluate whether spreadsheet systems adopt database optimization techniques such as indexing, intelligent data layout, and incremental and shared computation, to efficiently execute computation operations. We outline several ways future spreadsheet systems can be redesigned to offer interactive response times on large datasets.
AB - Spreadsheet systems are used for storing and analyzing data across domains by programmers and non-programmers alike. While spreadsheet systems have continued to support increasingly large datasets, they are prone to hanging and freezing while performing computations even on much smaller ones. We present a benchmarking study that evaluates and compares the performance of three popular systems, Microsoft Excel, LibreOffice Calc, and Google Sheets, on a range of canonical spreadsheet computation operations. We find that spreadsheet systems lack interactivity for several operations, on datasets well below their advertised scalability limits. We further evaluate whether spreadsheet systems adopt database optimization techniques such as indexing, intelligent data layout, and incremental and shared computation, to efficiently execute computation operations. We outline several ways future spreadsheet systems can be redesigned to offer interactive response times on large datasets.
KW - scalability
KW - spreadsheet systems
KW - use cases
UR - http://www.scopus.com/inward/record.url?scp=85086242551&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85086242551&partnerID=8YFLogxK
U2 - 10.1145/3318464.3389782
DO - 10.1145/3318464.3389782
M3 - Conference contribution
AN - SCOPUS:85086242551
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 1589
EP - 1599
BT - SIGMOD 2020 - Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
PB - Association for Computing Machinery
T2 - 2020 ACM SIGMOD International Conference on Management of Data, SIGMOD 2020
Y2 - 14 June 2020 through 19 June 2020
ER -