Security Model¶
The AI Replication Sandbox implements multiple layers of security to ensure controlled, auditable agent replication.
Manifest Signing¶
All worker manifests are HMAC-SHA256 signed. The signing process:
- The Controller serializes the manifest fields (worker ID, parent, depth, timestamp, resources)
- An HMAC-SHA256 digest is computed using the controller's secret key
- The signature is attached to the manifest
On registration, the Controller verifies the signature. Tampered manifests are rejected and an audit log entry is created.
# Signing is handled by ManifestSigner
signer = ManifestSigner(secret="my-key")
signer.sign(manifest) # Computes and attaches signature
signer.verify(manifest) # Returns True if valid
Signature Scope
The signature covers: worker_id, parent_id, depth, issued_at, state_snapshot, cpu_limit, memory_limit_mb, allow_controller, allow_external. Changing any of these fields invalidates the signature.
Network Isolation¶
Each worker operates under a NetworkPolicy:
| Setting | Default | Description |
|---|---|---|
allow_controller |
True |
Egress to controller for registration and heartbeats |
allow_external |
False |
Egress to external networks |
By default, workers can only communicate with the Controller. This prevents rogue workers from exfiltrating data or contacting external services.
Resource Limits¶
Every worker has explicit CPU and memory quotas defined in its ResourceSpec:
resources = ResourceSpec(
cpu_limit=0.5, # 50% of one CPU core
memory_limit_mb=256, # 256 MB RAM
network_policy=NetworkPolicy(
allow_controller=True,
allow_external=False,
),
)
The SandboxOrchestrator tracks and logs resource enforcement metrics.
Kill Switch¶
The Controller provides an emergency kill switch that:
- Sets
kill_switch_engaged = True(prevents all new registrations) - Terminates all running worker containers via the orchestrator
- Deregisters all workers from the registry
- Logs the event to the audit trail
controller.kill_switch(orchestrator)
# All workers are immediately terminated
# No new workers can register
Depth & Quota Limits¶
The ReplicationContract enforces hard limits:
max_depth— Prevents runaway replication chains. A worker at depth N can only create children at depth N+1 if N+1 ≤ max_depth.max_replicas— Caps the total number of active workers. Once the quota is full, all replication requests are denied.cooldown_seconds— Prevents burst replication from a single parent.
Stop Conditions¶
Pluggable predicates evaluated during worker registration:
def resource_budget_exceeded(ctx: ReplicationContext) -> bool:
"""Block replication if we're running low on budget."""
return ctx.active_count * 256 > 2048 # Example: 2GB total budget
condition = StopCondition(
name="resource_budget",
description="Blocks when estimated memory exceeds 2GB",
predicate=resource_budget_exceeded,
)
When a stop condition fires, the worker is rejected and the condition name is recorded in the audit trail.
Stale Worker Reaping¶
Workers that miss heartbeats are automatically deregistered:
from datetime import timedelta
# Reap workers that haven't heartbeated in 30 seconds
reaped = controller.reap_stale_workers(timeout=timedelta(seconds=30))
This prevents dead workers from permanently consuming quota slots.
Audit Trail¶
Every security-sensitive action is recorded in the structured logger's audit trail:
| Event | Description |
|---|---|
reject_manifest_signature |
Manifest failed signature verification |
deny_quota |
Replication denied due to max_replicas |
deny_cooldown |
Replication denied due to cooldown period |
deny_depth |
Replication denied due to max_depth |
deny_stop_condition |
Replication denied by a stop condition |
replication_requested |
Replication approved (with parent/child IDs) |
kill_switch_engaged |
Emergency kill switch activated |
reap_stale |
Worker reaped for missing heartbeats |
Privilege Escalation Detection¶
The sandbox includes a dedicated Privilege Escalation Detection module that monitors agent actions for scope and authority expansion beyond assigned permissions. It covers filesystem traversal, network boundary violations, process injection, API scope creep, and cross-tenant data access. See the escalation detection docs for detection rules, scoring, and integration details.