# 🗳️ Raft Consensus

Distributed consensus via leader election and log replication.

## Overview
Raft is a consensus algorithm designed to be understandable. It ensures a cluster of nodes agrees on a sequence of commands even when a minority of nodes fail. Unlike Paxos, Raft decomposes consensus into three sub-problems: leader election, log replication, and safety.
This implementation simulates a full Raft cluster with deterministic in-memory message passing, including network partition scenarios for testing fault tolerance.
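The deterministic network is what makes failure scenarios reproducible: messages travel through an in-memory FIFO queue, and a partition is just a predicate that drops messages crossing the cut. The sketch below illustrates the idea; the `envelope` and `network` types and the function names are assumptions for illustration, not the implementation's actual API.

```ocaml
(* Sketch of a deterministic in-memory network; type and function
   names here are illustrative, not the implementation's API. *)
type 'msg envelope = { src : int; dst : int; payload : 'msg }

type 'msg network = {
  queue : 'msg envelope Queue.t;          (* pending messages, FIFO *)
  mutable partition : int -> int -> bool; (* true = link is cut *)
}

let make () = { queue = Queue.create (); partition = (fun _ _ -> false) }

let send net ~src ~dst payload =
  Queue.push { src; dst; payload } net.queue

(* Deliver the next queued message, silently dropping any that would
   cross the simulated partition. *)
let rec step net deliver =
  match Queue.pop net.queue with
  | exception Queue.Empty -> ()
  | { src; dst; payload } ->
    if net.partition src dst then step net deliver
    else deliver ~src ~dst payload
```

Because delivery order depends only on the queue, every run of a scenario replays identically; flipping `partition` mid-run simulates a link failure, and clearing it simulates reconnection.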
## Concepts Demonstrated
- Record types — rich message types (AppendEntries, RequestVote) with named fields
- Variant types — node roles (Follower | Candidate | Leader) and message types
- Mutable state with records — per-node state (term, log, commit index) updated via ref cells
- Pattern matching on messages — dispatching RPCs to handler functions
- Functional simulation — deterministic network model for reproducible testing
- List operations — log manipulation using immutable lists with List.nth, List.length
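For a taste of the log handling, here is a hedged sketch of two helpers one might build from these list operations; the names, the record fields, and the 0-based indexing are assumptions rather than the module's real interface (the Raft paper itself uses 1-based indices).

```ocaml
(* Hypothetical log helpers; names and indexing are assumptions. *)
type log_entry = { term : int; command : string }

(* Term of the entry at [idx], if any; useful for checking the
   paper's prevLogTerm consistency condition in AppendEntries. *)
let entry_term (log : log_entry list) idx =
  if idx < 0 || idx >= List.length log then None
  else Some (List.nth log idx).term

(* Keep only entries before [idx]; used when a follower discovers
   a conflicting suffix and must truncate it. *)
let truncate_from idx log =
  List.filteri (fun i _ -> i < idx) log
```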
## How Raft Works

```ocaml
(* Node roles *)
type role = Follower | Candidate | Leader

(* Each node maintains: *)
type node_state = {
  mutable current_term : int;     (* latest term seen *)
  mutable voted_for : int option; (* candidate voted for *)
  mutable log : log_entry list;   (* command log *)
  mutable commit_index : int;     (* highest committed entry *)
  mutable role : role;
  ...
}

(* Message types mirror the Raft paper *)
type message =
  | AppendEntries of append_entries_req
  | AppendEntriesResp of append_entries_resp
  | RequestVote of request_vote_req
  | RequestVoteResp of request_vote_resp
  | ClientRequest of string

(* Leader election: candidate requests votes *)
(* Log replication: leader sends AppendEntries *)
(* Commitment: entry committed when replicated to majority *)
```
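Dispatching an incoming RPC is then a single `match` on the `message` variant, with each arm calling a handler that mutates the node's record. The handler below is a hedged sketch of the vote-granting rule only; the labelled arguments and standalone types are illustrative stand-ins for the real `request_vote_req` fields.

```ocaml
(* Hedged sketch of the RequestVote handler; field and argument
   names are assumptions, not the implementation's. *)
type role = Follower | Candidate | Leader

type node_state = {
  mutable current_term : int;
  mutable voted_for : int option;
  mutable role : role;
}

(* Returns (term, vote_granted), mirroring the RequestVote response. *)
let handle_request_vote state ~candidate_term ~candidate_id =
  (* Any node seeing a higher term steps down and adopts it. *)
  if candidate_term > state.current_term then begin
    state.current_term <- candidate_term;
    state.role <- Follower;
    state.voted_for <- None
  end;
  let grant =
    candidate_term >= state.current_term
    && (state.voted_for = None || state.voted_for = Some candidate_id)
    (* The paper also requires the candidate's log to be at least as
       up-to-date as ours; that check is elided in this sketch. *)
  in
  if grant then state.voted_for <- Some candidate_id;
  (state.current_term, grant)
```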
## Key Protocol Phases
| Phase | Trigger | Description |
|---|---|---|
| Leader Election | Election timeout expires | A follower becomes a candidate, increments its term, and requests votes from its peers; it wins with a majority of votes. |
| Log Replication | Client submits command | Leader appends to local log, sends AppendEntries RPCs to all followers. |
| Commitment | Majority acknowledgment | Once a majority of nodes have replicated an entry, the leader marks it committed (sketched below the table). |
| Term Transition | Higher term discovered | Any node seeing a higher term steps down to follower and updates its term. |
| Network Partition | Simulated link failure | Nodes in the majority partition elect a new leader; a stale leader steps down on reconnection when it sees the higher term. |
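The commitment rule lends itself to a small function. The sketch below assumes the leader tracks, per node, the highest log index known to be replicated there (the paper's `matchIndex`); the name and array representation are illustrative.

```ocaml
(* Hedged sketch of the commitment rule; [match_index.(i)] is the
   highest log index known replicated on node i, leader included. *)
let majority_commit_index match_index =
  let sorted = Array.copy match_index in
  Array.sort compare sorted;
  (* With indices sorted ascending, the element at position (n-1)/2
     is replicated on at least a majority of the n nodes. *)
  sorted.((Array.length sorted - 1) / 2)
```

One subtlety from the paper is elided here: a leader only commits entries from its own current term by counting replicas; entries from earlier terms become committed indirectly once a current-term entry commits.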
## Safety Properties
- Election Safety: At most one leader per term (a trace check for this is sketched after this list)
- Leader Append-Only: Leader never overwrites or deletes log entries
- Log Matching: If two logs have an entry with the same index and term, all preceding entries are identical
- Leader Completeness: A committed entry appears in all future leaders' logs
- State Machine Safety: If a node applies an entry at index i, no other node applies a different entry at i
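The simulation's event trace makes such properties checkable. Below is a hedged sketch of the Election Safety check; the `(term, node_id)` event list and the function name are assumptions about how a test harness might record elections, not the actual interface.

```ocaml
(* Hedged sketch: assert that no term has two distinct leaders.
   [leaders] is an assumed list of (term, node_id) pairs recorded
   whenever a node wins an election. *)
let election_safety_holds leaders =
  let seen = Hashtbl.create 16 in
  List.for_all
    (fun (term, node_id) ->
      match Hashtbl.find_opt seen term with
      | None -> Hashtbl.add seen term node_id; true
      | Some id -> id = node_id)
    leaders

(* Usage: assert (election_safety_holds recorded_events) *)
```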
## Running

```sh
# Compile and run the simulation (native code)
ocamlfind ocamlopt -package unix -linkpkg raft.ml -o raft && ./raft

# Or with bytecode
ocamlfind ocamlc -package unix -linkpkg raft.ml -o raft && ./raft
```
## Further Reading

- Raft Visualization — interactive Raft simulator
- In Search of an Understandable Consensus Algorithm — the original Raft paper (Ongaro & Ousterhout, 2014)
- Actor Model — related concurrency pattern in this repo
- Source code