Abstract
Distributed key-value stores power the backend of high-performance web services and cloud computing applications. Key-value stores such as Cassandra rely heavily on counters to track occurrences of various kinds of events. However, modern implementations of counters do not provide exactly-once semantics. E.g., a client may request a counter increment, time out waiting for a response, and create a duplicate request resulting in a double increment at the server. In this paper, we address this problem by presenting, analyzing, and evaluating a novel server-side data structure called the forgetful bloom filter (FBF). Like a traditional Bloom filter, an FBF is a compact representation of a set of elements (e.g., client requests). However, an FBF: (i) can forget older elements (e.g., requests that are too old to apply), and (ii) adapts itself to meet a desired false positive rate under a varying workload. We present experimental results from a prototype implementation of FBFs and an implementation of FBFs in the Cassandra key-value store. Our results show that the FBF is highly accurate in maintaining correct counter values.
Original language | English (US) |
---|---|
Pages (from-to) | 879-892 |
Number of pages | 14 |
Journal | Cluster Computing |
Volume | 19 |
Issue number | 2 |
DOIs | |
State | Published - Jun 1 2016 |
Keywords
- Bloom filters
- Idempotent distributed operations
- NoSQL
ASJC Scopus subject areas
- Software
- Computer Networks and Communications