Skip to content

Chaos Simulator

Chaos engineering for AI replication systems. Injects controlled failures (network partitions, resource exhaustion, clock skew, Byzantine faults) to test system resilience and safety-critical invariants under stress.

Quick Start

from replication.chaos import ChaosSimulator, ChaosConfig
from replication.simulator import ScenarioConfig

# Run chaos simulation with default scenarios
chaos = ChaosSimulator()
report = chaos.run()

print(f"Safety maintained: {report.safety_maintained}")
print(f"Scenarios run: {len(report.scenario_results)}")
for result in report.scenario_results:
    print(f"  {result.name}: {'PASS' if result.safe else 'FAIL'}")

# Render human-readable report
renderer = ChaosRenderer()
print(renderer.render(report))

Key Classes

  • ChaosSimulator — Orchestrates chaos scenarios against a simulated replication environment and validates safety invariants.
  • ChaosConfig — Controls which fault types to inject, intensity, and safety thresholds.
  • ChaosReport — Aggregated results: safety_maintained (bool), per-scenario outcomes, and identified vulnerabilities.
  • ChaosRenderer — Produces human-readable text reports.

chaos

Chaos Testing Framework for AI replication safety.

Injects random faults during simulation runs and measures whether safety controls (controller, contract, signer) handle them correctly. Think "chaos monkey for AI replication safety."

Fault types
  1. WorkerCrash — randomly kill workers mid-execution
  2. ResourceExhaustion — artificially inflate resource usage past limits
  3. ControllerDelay — add random latency to controller approval decisions
  4. ManifestCorruption — flip bits in worker manifests to test signature verification
  5. TaskFlood — inject burst of simultaneous replication requests
  6. NetworkPartition — simulate workers unable to reach the controller

Usage (CLI)::

python -m replication.chaos                           # default config
python -m replication.chaos --runs 20                 # more runs per fault
python -m replication.chaos --faults crash,flood      # specific faults
python -m replication.chaos --intensity 0.8           # high intensity
python -m replication.chaos --json                    # JSON output
python -m replication.chaos --seed 42                 # reproducible
python -m replication.chaos --summary-only            # brief output

Programmatic::

from replication.chaos import ChaosRunner, ChaosConfig, FaultType
config = ChaosConfig.default()
runner = ChaosRunner(config)
report = runner.run()
print(runner.render(report))

FaultType

Bases: Enum

Fault types that can be injected during chaos testing.

FaultConfig dataclass

Configuration for a single fault injection.

default(fault_type: FaultType, intensity: float = 0.5) -> 'FaultConfig' classmethod

Create a FaultConfig with sensible defaults for the given type.

ChaosConfig dataclass

Configuration for a chaos testing run.

default() -> 'ChaosConfig' classmethod

Create a ChaosConfig with all fault types at medium intensity.

ChaosResult dataclass

Result of chaos testing for a single fault type.

ChaosReport dataclass

Aggregated results of chaos testing across all fault types.

ChaosRunner

Run chaos tests and produce reports.

run() -> ChaosReport

Run chaos tests for all configured faults.

run_fault(fault: FaultConfig) -> ChaosResult

Run chaos tests for a single fault type.

render(report: ChaosReport) -> str

Render a human-readable text summary of the chaos report.

to_json(report: ChaosReport) -> dict

Export the chaos report as a JSON-serializable dictionary.

main() -> None

CLI entry point for chaos testing.