Distributed key-value stores power the backend of high-performance web services and cloud computing applications. Key-value stores such as Cassandra rely heavily on counters to track occurrences of various kinds of events. However, modern implementations of counters do not provide exactly-once semantics. E.g., a client may request a counter increment, time out waiting for a response, and create a duplicate request resulting in a double increment at the server. In this paper, we address this problem by presenting, analyzing, and evaluating a novel server-side data structure called the forgetful bloom filter (FBF). Like a traditional Bloom filter, an FBF is a compact representation of a set of elements (e.g., client requests). However, an FBF: (i) can forget older elements (e.g., requests that are too old to apply), and (ii) adapts itself to meet a desired false positive rate under a varying workload. We present experimental results from a prototype implementation of FBFs and an implementation of FBFs in the Cassandra key-value store. Our results show that the FBF is highly accurate in maintaining correct counter values.
- Bloom filters
- Idempotent distributed operations
ASJC Scopus subject areas
- Computer Networks and Communications