Anti-freeze for large and complex spreadsheets: Asynchronous formula computation

Mangesh Bendre, Tana Wattanawaroon, Kelly Mack, Kevin Chen-Chuan Chang, Aditya G Parameswaran

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

Spreadsheet systems enable users to store and analyze data in an intuitive and flexible interface. Yet the scale of data being analyzed often leads to spreadsheets hanging and freezing on small changes. We propose a new asynchronous formula computation framework: instead of freezing the interface we return control to users quickly to ensure interactivity, while computing the formulae in the background. To ensure consistency, we indicate formulae being computed in the background via visual cues on the spreadsheet. Our asynchronous computation framework introduces two novel challenges: (a) How do we identify dependencies for a given change in a bounded time? (b) How do we schedule computation to maximize the number of spreadsheet cells available to the user over time? We bound the dependency identification time by compressing the formula dependency graph lossily, a problem we show to be NP-Hard. A compressed dependency table enables us to quickly identify the spreadsheet cells that need recomputation and indicate them as such to users. Finding an optimal computation schedule to maximize cell availability is also NP-Hard, and even merely obtaining a schedule can be expensive; we propose an on-the-fly scheduling technique to address this. We have incorporated asynchronous computation in DataSpread, a scalable spreadsheet system targeted at operating on arbitrarily large datasets on a spreadsheet frontend.
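The idea of a lossily compressed dependency table can be illustrated with a small sketch. This is not the algorithm from the paper: it assumes dependents are stored as rectangular cell ranges and uses a simple greedy heuristic that merges the two ranges whose bounding box adds the fewest extra cells until a size budget is met. The names `Rect`, `compress`, and `is_dirty` are hypothetical. The point it shows is that the compressed table may flag cells that do not truly depend on the change (false positives) but never misses one that does.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Rect:
    """A rectangular range of cells, inclusive on all sides (hypothetical type)."""
    top: int
    left: int
    bottom: int
    right: int

    def area(self) -> int:
        return (self.bottom - self.top + 1) * (self.right - self.left + 1)

    def union(self, other: "Rect") -> "Rect":
        # Smallest rectangle covering both ranges.
        return Rect(min(self.top, other.top), min(self.left, other.left),
                    max(self.bottom, other.bottom), max(self.right, other.right))

    def contains(self, row: int, col: int) -> bool:
        return self.top <= row <= self.bottom and self.left <= col <= self.right


def compress(dependents: list[Rect], budget: int) -> list[Rect]:
    """Greedily merge ranges until at most `budget` remain (illustrative heuristic)."""
    if budget < 1:
        raise ValueError("budget must be at least 1")
    rects = list(dependents)
    while len(rects) > budget:
        # Pick the pair whose bounding box adds the fewest cells.
        i, j = min(
            combinations(range(len(rects)), 2),
            key=lambda p: rects[p[0]].union(rects[p[1]]).area()
            - rects[p[0]].area() - rects[p[1]].area(),
        )
        merged = rects[i].union(rects[j])
        rects = [r for k, r in enumerate(rects) if k not in (i, j)] + [merged]
    return rects


def is_dirty(compressed: list[Rect], row: int, col: int) -> bool:
    """A cell is flagged for recomputation if any compressed range covers it."""
    return any(r.contains(row, col) for r in compressed)


# Three dependent cells, compressed to a table of at most two ranges.
deps = [Rect(0, 0, 0, 0), Rect(0, 2, 0, 2), Rect(10, 10, 10, 10)]
table = compress(deps, budget=2)
```

Because lookup cost depends only on the number of ranges kept, fixing the budget bounds the time needed to decide which cells to mark as pending, at the price of occasionally marking a cell that did not need recomputation.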

Original language: English (US)
Title of host publication: SIGMOD 2019 - Proceedings of the 2019 International Conference on Management of Data
Publisher: Association for Computing Machinery
Pages: 1277-1294
Number of pages: 18
ISBN (Electronic): 9781450356435
DOI: https://doi.org/10.1145/3299869.3319876
State: Published - Jun 25 2019
Event: 2019 International Conference on Management of Data, SIGMOD 2019 - Amsterdam, Netherlands
Duration: Jun 30 2019 to Jul 5 2019

Publication series

Name: Proceedings of the ACM SIGMOD International Conference on Management of Data
ISSN (Print): 0730-8078

Conference

Conference: 2019 International Conference on Management of Data, SIGMOD 2019
Country: Netherlands
City: Amsterdam
Period: 6/30/19 to 7/5/19

Fingerprint

Spreadsheets
Freezing
Scheduling
Availability

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Bendre, M., Wattanawaroon, T., Mack, K., Chang, K. C-C., & Parameswaran, A. G. (2019). Anti-freeze for large and complex spreadsheets: Asynchronous formula computation. In SIGMOD 2019 - Proceedings of the 2019 International Conference on Management of Data (pp. 1277-1294). (Proceedings of the ACM SIGMOD International Conference on Management of Data). Association for Computing Machinery. https://doi.org/10.1145/3299869.3319876

@inproceedings{ed7458a54cc3462dbd02653e47873e39,
title = "Anti-freeze for large and complex spreadsheets: Asynchronous formula computation",
author = "Mangesh Bendre and Tana Wattanawaroon and Kelly Mack and Chang, {Kevin Chen-Chuan} and Parameswaran, {Aditya G}",
year = "2019",
month = "6",
day = "25",
doi = "10.1145/3299869.3319876",
language = "English (US)",
series = "Proceedings of the ACM SIGMOD International Conference on Management of Data",
publisher = "Association for Computing Machinery",
pages = "1277--1294",
booktitle = "SIGMOD 2019 - Proceedings of the 2019 International Conference on Management of Data",

}

TY - GEN

T1 - Anti-freeze for large and complex spreadsheets

T2 - Asynchronous formula computation

AU - Bendre, Mangesh

AU - Wattanawaroon, Tana

AU - Mack, Kelly

AU - Chang, Kevin Chen-Chuan

AU - Parameswaran, Aditya G

PY - 2019/6/25

Y1 - 2019/6/25

UR - http://www.scopus.com/inward/record.url?scp=85069494820&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85069494820&partnerID=8YFLogxK

U2 - 10.1145/3299869.3319876

DO - 10.1145/3299869.3319876

M3 - Conference contribution

AN - SCOPUS:85069494820

T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data

SP - 1277

EP - 1294

BT - SIGMOD 2019 - Proceedings of the 2019 International Conference on Management of Data

PB - Association for Computing Machinery

ER -