TY - GEN
T1 - From WiscKey to Bourbon: A Learned Index for Log-Structured Merge Trees
AU - Dai, Yifan
AU - Xu, Yien
AU - Ganesan, Aishwarya
AU - Alagappan, Ramnatthan
AU - Kroth, Brian
AU - Arpaci-Dusseau, Andrea C.
AU - Arpaci-Dusseau, Remzi H.
N1 - Funding Information:
We thank Alexandra Fedorova (our shepherd) and the anonymous reviewers of OSDI'20 for their insightful comments and suggestions. We thank the members of ADSL for their excellent feedback. We also thank CloudLab [41] for providing a great environment to run our experiments and reproduce our results during artifact evaluation. This material was supported by funding from NSF grants CNS-1421033, CNS-1763810 and CNS-1838733, Intel, Microsoft, Seagate, and VMware. Aishwarya Ganesan is supported by a Facebook fellowship. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and may not reflect the views of NSF or any other institutions.
Publisher Copyright:
© 2020 Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020. All rights reserved.
PY - 2020
Y1 - 2020
N2 - We introduce BOURBON, a log-structured merge (LSM) tree that utilizes machine learning to provide fast lookups. We base the design and implementation of BOURBON on empirically-grounded principles that we derive through careful analysis of LSM design. BOURBON employs greedy piecewise linear regression to learn key distributions, enabling fast lookup with minimal computation, and applies a cost-benefit strategy to decide when learning will be worthwhile. Through a series of experiments on both synthetic and real-world datasets, we show that BOURBON improves lookup performance by 1.23×-1.78× as compared to state-of-the-art production LSMs.
AB - We introduce BOURBON, a log-structured merge (LSM) tree that utilizes machine learning to provide fast lookups. We base the design and implementation of BOURBON on empirically-grounded principles that we derive through careful analysis of LSM design. BOURBON employs greedy piecewise linear regression to learn key distributions, enabling fast lookup with minimal computation, and applies a cost-benefit strategy to decide when learning will be worthwhile. Through a series of experiments on both synthetic and real-world datasets, we show that BOURBON improves lookup performance by 1.23×-1.78× as compared to state-of-the-art production LSMs.
UR - http://www.scopus.com/inward/record.url?scp=85096772684&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85096772684&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85096772684
SP - 155
EP - 171
BT - Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2020
PB - USENIX Association
T2 - 14th USENIX Symposium on Operating Systems Design and Implementation,OSDI 2020
Y2 - 4 November 2020 through 6 November 2020
ER -