TY - GEN
T1 - Anti-freeze for large and complex spreadsheets
T2 - 2019 International Conference on Management of Data, SIGMOD 2019
AU - Bendre, Mangesh
AU - Wattanawaroon, Tana
AU - Mack, Kelly
AU - Chang, Kevin
AU - Parameswaran, Aditya
N1 - We thank the anonymous reviewers for their valuable feedback. We acknowledge support from grants IIS-1513407, IIS-1633755, IIS-1652750, and IIS-1733878 awarded by the National Science Foundation, grant W911NF-18-1-0335 awarded by the Army, and funds from Adobe, Toyota Research Institute, and Google.
PY - 2019/6/25
Y1 - 2019/6/25
N2 - Spreadsheet systems enable users to store and analyze data in an intuitive and flexible interface. Yet the scale of data being analyzed often leads to spreadsheets hanging and freezing on small changes. We propose a new asynchronous formula computation framework: instead of freezing the interface, we return control to users quickly to ensure interactivity, while computing the formulae in the background. To ensure consistency, we indicate formulae being computed in the background via visual cues on the spreadsheet. Our asynchronous computation framework introduces two novel challenges: (a) How do we identify dependencies for a given change in a bounded time? (b) How do we schedule computation to maximize the number of spreadsheet cells available to the user over time? We bound the dependency identification time by compressing the formula dependency graph lossily, a problem we show to be NP-Hard. A compressed dependency table enables us to quickly identify the spreadsheet cells that need recomputation and indicate them as such to users. Finding an optimal computation schedule to maximize cell availability is also NP-Hard, and even merely obtaining a schedule can be expensive; we propose an on-the-fly scheduling technique to address this. We have incorporated asynchronous computation in DataSpread, a scalable spreadsheet system targeted at operating on arbitrarily large datasets on a spreadsheet frontend.
AB - Spreadsheet systems enable users to store and analyze data in an intuitive and flexible interface. Yet the scale of data being analyzed often leads to spreadsheets hanging and freezing on small changes. We propose a new asynchronous formula computation framework: instead of freezing the interface, we return control to users quickly to ensure interactivity, while computing the formulae in the background. To ensure consistency, we indicate formulae being computed in the background via visual cues on the spreadsheet. Our asynchronous computation framework introduces two novel challenges: (a) How do we identify dependencies for a given change in a bounded time? (b) How do we schedule computation to maximize the number of spreadsheet cells available to the user over time? We bound the dependency identification time by compressing the formula dependency graph lossily, a problem we show to be NP-Hard. A compressed dependency table enables us to quickly identify the spreadsheet cells that need recomputation and indicate them as such to users. Finding an optimal computation schedule to maximize cell availability is also NP-Hard, and even merely obtaining a schedule can be expensive; we propose an on-the-fly scheduling technique to address this. We have incorporated asynchronous computation in DataSpread, a scalable spreadsheet system targeted at operating on arbitrarily large datasets on a spreadsheet frontend.
UR - http://www.scopus.com/inward/record.url?scp=85069494820&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85069494820&partnerID=8YFLogxK
U2 - 10.1145/3299869.3319876
DO - 10.1145/3299869.3319876
M3 - Conference contribution
AN - SCOPUS:85069494820
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 1277
EP - 1294
BT - SIGMOD 2019 - Proceedings of the 2019 International Conference on Management of Data
PB - Association for Computing Machinery
Y2 - 30 June 2019 through 5 July 2019
ER -