🐦 Cuckoo Filter

Probabilistic membership with deletion support

Overview

A Cuckoo Filter is a space-efficient probabilistic data structure for approximate set membership testing — like a Bloom filter, but with a killer feature: it supports deletion. Based on cuckoo hashing, it stores compact fingerprints in a bucket array and uses the "kick-and-relocate" strategy when buckets are full.

Reference: "Cuckoo Filter: Practically Better Than Bloom" — Fan, Andersen, Kaminsky, Mitzenmacher (CoNEXT 2014).

Concepts Demonstrated

Cuckoo Filter vs Bloom Filter

FeatureBloom FilterCuckoo Filter
Membership test
Deletion
Space efficiencyGoodBetter (for FPR < 3%)
Lookup speedk hash lookups2 bucket lookups
Insert worst caseO(k)O(max_kicks)

API

val create    : int -> t          (* create filter with capacity *)
val insert    : t -> string -> bool (* insert item, returns success *)
val lookup    : t -> string -> bool (* check membership *)
val delete    : t -> string -> bool (* remove item, returns success *)
val size      : t -> int           (* current item count *)
val load_factor : t -> float       (* utilization ratio *)
val stats     : t -> string        (* formatted statistics *)

How It Works

  1. Compute the item's fingerprint (compact hash)
  2. Compute two candidate bucket indices: i1 = hash(item), i2 = i1 ⊕ hash(fingerprint)
  3. If either bucket has space, store the fingerprint there
  4. If both are full, evict a random entry and relocate it to its alternate bucket
  5. Repeat relocation up to max_kicks times

Run It

ocamlfind ocamlopt -package str -linkpkg cuckoo_filter.ml -o cuckoo_filter
./cuckoo_filter

Or simply: ocamlfind ocamlopt cuckoo_filter.ml -o cuckoo_filter && ./cuckoo_filter

Source

cuckoo_filter.ml