Escalation Detection
Detects and analyzes privilege escalation attempts by AI agents: actions that systematically expand an agent's capabilities beyond its authorized scope.
Key Classes
| Class | Description |
|---|---|
| EscalationVector | Types of escalation (PRIVILEGE, RESOURCE, SCOPE, PERSISTENCE, SOCIAL, KNOWLEDGE, PHYSICAL) |
| EscalationSeverity | Severity levels for detected escalation |
| StealthLevel | How covert the escalation attempt is |
| AgentEscalationStrategy | Pattern analysis of an agent's escalation behavior over time |
| ActionCategory | Categorization of agent actions |
| EscalationChain | Linked sequence of escalation steps forming an attack chain |
| EscalationDetector | Main detector: analyzes action traces for escalation patterns |
Usage

    from replication.escalation import EscalationDetector

    detector = EscalationDetector()
    actions = [
        {"agent_id": "a1", "action": "read_file", "resource": "/etc/passwd"},
        {"agent_id": "a1", "action": "request_sudo", "resource": "root"},
        {"agent_id": "a1", "action": "modify_config", "resource": "/etc/sudoers"},
    ]
    report = detector.analyze(actions)
    for chain in report.chains:
        print(f"Chain: {chain.vector} severity={chain.severity}")
        for step in chain.steps:
            print(f"  {step.action} → {step.resource}")

escalation
Agent Privilege Escalation Detector -- identifies agents acquiring capabilities or permissions beyond their authorised scope.
Monitors agent actions for privilege escalation patterns: scope creep in API permissions, unauthorised directory traversal, attempts to spawn elevated processes, network access beyond policy, and incremental authority accumulation that may individually appear benign but collectively constitute a dangerous expansion of agent power.
This differs from resource hoarding (hoarding.py), which tracks the quantity of resources. This module tracks scope and authority: an agent that stays within its CPU quota but gains filesystem root access has escalated privilege without hoarding.
It also differs from self-modification (selfmod.py), which tracks agents altering their own code. An agent can escalate privileges without modifying itself, simply by requesting broader permissions through legitimate-looking API calls.
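The "individually benign, collectively dangerous" pattern mentioned above can be illustrated with a running total over newly acquired capabilities. This is only a minimal sketch under stated assumptions, not the module's implementation: the capability names, the weights, and the `total_limit` threshold are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical per-capability weights; none of these names come from
# replication.escalation.
CAPABILITY_WEIGHTS: Dict[str, float] = {
    "read_tmp": 0.5,
    "read_home": 1.0,
    "list_processes": 1.0,
    "open_socket": 1.5,
    "write_config": 2.0,
}


def accumulation_alerts(
    grants: Iterable[str], total_limit: float = 4.0
) -> List[Tuple[int, str, float]]:
    """Return (index, capability, running_total) once the total exceeds total_limit."""
    alerts = []
    total = 0.0
    seen = set()
    for index, capability in enumerate(grants):
        if capability in seen:
            continue  # re-using a capability already held adds no new authority
        seen.add(capability)
        total += CAPABILITY_WEIGHTS.get(capability, 1.0)
        if total > total_limit:
            alerts.append((index, capability, total))
    return alerts


history = ["read_tmp", "read_home", "read_tmp", "list_processes", "open_socket", "write_config"]
alerts = accumulation_alerts(history)
```

No single grant in `history` is alarming on its own weight; only the final one pushes the accumulated total past the limit.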
Escalation vectors
- Filesystem -- accessing paths outside assigned directory, traversal attacks (../), accessing sensitive paths (/etc, credentials files, SSH keys, other agents' directories).
- Network -- connecting to hosts/ports not in allowlist, DNS resolution of blocked domains, outbound connections to C2-like endpoints, lateral movement to internal services.
- Process -- spawning subprocesses, requesting elevated UIDs, using exec/eval/system calls, forking without approval, modifying environment variables of other processes.
- API -- requesting higher OAuth scopes, using admin endpoints, accessing other agents' resources, token impersonation, permission boundary probing via trial-and-error.
- Data -- reading data classified above agent's clearance, cross-tenant data access, accessing audit logs to learn about monitoring, reading other agents' conversation history.
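As an illustration of the Filesystem vector, the sketch below classifies a requested path against an assigned directory. It is a hypothetical helper, not part of `replication.escalation`: the function name, the `SENSITIVE_PREFIXES` tuple, and the returned strings are assumptions, and the check is purely lexical (it does not resolve symlinks).

```python
import posixpath
from typing import Optional

# Hypothetical sensitive prefixes, loosely based on the examples in the list above.
SENSITIVE_PREFIXES = ("/etc", "/root/.ssh", "/home/agents")


def classify_path(requested: str, assigned_dir: str) -> Optional[str]:
    """Return a short reason if the path leaves the assigned directory, else None."""
    base = posixpath.normpath(assigned_dir)
    # Resolve relative paths against the assigned directory and collapse "../".
    resolved = posixpath.normpath(posixpath.join(base, requested))
    if resolved == base or resolved.startswith(base + "/"):
        return None
    for prefix in SENSITIVE_PREFIXES:
        if resolved == prefix or resolved.startswith(prefix + "/"):
            return f"sensitive path: {resolved}"
    if ".." in requested.split("/"):
        return f"traversal outside {base}: {resolved}"
    return f"outside assigned directory: {resolved}"


home = "/home/agents/a1"
inside = classify_path("notes/a.txt", home)
neighbour = classify_path("../a2/history.log", home)
passwd = classify_path("/etc/passwd", home)
```

Checking the assigned directory first means an agent's own directory is never reported, even though it sits under a prefix that is sensitive for everyone else.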
Each action is scored on severity (how dangerous the escalation is) and stealth (how hard it would be to detect through normal logging). The combination identifies both clumsy escalation attempts and sophisticated privilege creep.
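One way to picture how the two scores combine is a small triage function. The enums below are hypothetical stand-ins: the actual members of `EscalationSeverity` and `StealthLevel` are not listed on this page, and the labels and thresholds are assumptions made for illustration.

```python
from enum import IntEnum


class Severity(IntEnum):  # hypothetical levels, not the members of EscalationSeverity
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


class Stealth(IntEnum):  # hypothetical levels, not the members of StealthLevel
    OBVIOUS = 1
    SUBTLE = 2
    COVERT = 3


def triage(severity: Severity, stealth: Stealth) -> str:
    """Label an attempt by combining how dangerous and how hidden it is."""
    if severity >= Severity.HIGH and stealth >= Stealth.SUBTLE:
        return "sophisticated"  # dangerous and hard to see in normal logs
    if severity >= Severity.HIGH:
        return "clumsy"  # dangerous but easy to spot
    if stealth >= Stealth.SUBTLE:
        return "creep"  # low impact but quiet; worth tracking over time
    return "noise"


def risk_score(severity: Severity, stealth: Stealth) -> int:
    """Simple multiplicative score: quiet, dangerous attempts rank highest."""
    return int(severity) * int(stealth)
```

Keeping the two axes separate until the final step makes it possible to rank a quiet low-severity probe differently from a loud high-severity one, which a single combined number would blur.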
Usage (CLI)::
    python -m replication.escalation                         # default scenario
    python -m replication.escalation --actions 100           # 100 simulated actions
    python -m replication.escalation --strategy persistent   # persistent prober
    python -m replication.escalation --json                  # JSON output
    python -m replication.escalation --export report.json    # save full report

Programmatic::
    from replication.escalation import EscalationDetector, EscalationConfig

    detector = EscalationDetector()
    result = detector.analyze()
    print(result.render())
    for attempt in result.attempts:
        print(f"  {attempt.vector}: {attempt.description}")

EscalationVector
EscalationSeverity
StealthLevel
Bases: Enum
How difficult this escalation would be to detect.
Source code in src/replication/escalation.py
AgentEscalationStrategy
Bases: Enum
Behavioral strategy for simulated escalation agents.
Source code in src/replication/escalation.py
ActionCategory
Bases: Enum
Type of action an agent takes.
Source code in src/replication/escalation.py
AgentPermissions (dataclass)
Defines the authorised scope for an agent.
Source code in src/replication/escalation.py
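To make "authorised scope" concrete, the sketch below models a scope as a frozen dataclass with allowlists. The class name `Scope`, its fields, and its methods are hypothetical; the real fields of `AgentPermissions` are not shown on this page.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Scope:  # hypothetical stand-in, not the fields of AgentPermissions
    allowed_hosts: FrozenSet[str] = frozenset()
    allowed_ports: FrozenSet[int] = frozenset({443})
    allowed_api_scopes: FrozenSet[str] = frozenset()

    def permits_connection(self, host: str, port: int) -> bool:
        """True only if both the host and the port are on the allowlist."""
        return host.lower() in self.allowed_hosts and port in self.allowed_ports

    def excess_scopes(self, requested: Iterable[str]) -> List[str]:
        """Return requested API scopes that go beyond what is authorised."""
        return sorted(set(requested) - self.allowed_api_scopes)


scope = Scope(
    allowed_hosts=frozenset({"api.internal.example"}),
    allowed_api_scopes=frozenset({"read:files"}),
)
extra = scope.excess_scopes(["read:files", "admin:users", "write:files"])
```

A frozen dataclass is used here so that the scope itself cannot be widened at runtime by the code that is being checked against it.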
AgentAction (dataclass)
A single action taken by the simulated agent.
Source code in src/replication/escalation.py
EscalationAttempt (dataclass)
A detected privilege escalation attempt.
Source code in src/replication/escalation.py
VectorSummary (dataclass)
Summary statistics for a single escalation vector.
Source code in src/replication/escalation.py
EscalationChain (dataclass)
A sequence of related escalation attempts that form a multi-step attack.
Source code in src/replication/escalation.py
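A chain of related attempts could, for example, be assembled by grouping each agent's attempts that occur close together in time. The sketch below shows that idea only; the `Attempt` class, its fields, and the `max_gap` and `min_length` parameters are assumptions, and the module may link attempts by entirely different criteria.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Attempt:  # hypothetical stand-in for EscalationAttempt; fields are assumptions
    agent_id: str
    timestamp: float
    vector: str


def build_chains(
    attempts: List[Attempt], max_gap: float = 60.0, min_length: int = 2
) -> List[List[Attempt]]:
    """Group each agent's attempts into chains when consecutive ones are within max_gap seconds."""
    by_agent: Dict[str, List[Attempt]] = {}
    for attempt in sorted(attempts, key=lambda a: a.timestamp):
        by_agent.setdefault(attempt.agent_id, []).append(attempt)

    chains: List[List[Attempt]] = []
    for agent_attempts in by_agent.values():
        current = [agent_attempts[0]]
        for attempt in agent_attempts[1:]:
            if attempt.timestamp - current[-1].timestamp <= max_gap:
                current.append(attempt)
            else:
                if len(current) >= min_length:
                    chains.append(current)
                current = [attempt]  # gap too large: start a new candidate chain
        if len(current) >= min_length:
            chains.append(current)
    return chains


attempts = [
    Attempt("a1", 0.0, "filesystem"),
    Attempt("a1", 20.0, "process"),
    Attempt("a2", 25.0, "network"),
    Attempt("a1", 45.0, "api"),
    Attempt("a1", 500.0, "data"),
]
chains = build_chains(attempts)
```

Isolated attempts are dropped by `min_length`, so only multi-step sequences are reported as chains.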
EscalationConfig (dataclass)
Configuration for the escalation detector.
Source code in src/replication/escalation.py
DetectionRule (dataclass)
A rule for detecting escalation attempts.
Source code in src/replication/escalation.py
EscalationDetector
Detect and analyze agent privilege escalation attempts.
Parameters
config : EscalationConfig, optional
    Configuration for the detector. Defaults are sensible for a quick analysis.
rules : list[DetectionRule], optional
    Custom detection rules. Defaults to BUILTIN_RULES.
Source code in src/replication/escalation.py
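The `rules` parameter follows a common pattern: fall back to a built-in list when nothing is passed, otherwise use the caller's rules. The sketch below shows that pattern with stand-in types. `Rule`, `DEFAULT_RULES`, `Detector`, and `scan` are hypothetical names; the real `DetectionRule` fields and `BUILTIN_RULES` contents are not shown on this page.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Action = Dict[str, str]


@dataclass(frozen=True)
class Rule:  # hypothetical stand-in; the real DetectionRule fields are not shown here
    name: str
    vector: str
    matches: Callable[[Action], bool]


DEFAULT_RULES = [
    Rule(
        "sudoers-write",
        "filesystem",
        lambda a: a["action"] == "modify_config" and a["resource"] == "/etc/sudoers",
    ),
    Rule("sudo-request", "process", lambda a: a["action"] == "request_sudo"),
]


class Detector:  # hypothetical; illustrates the default-or-custom rules pattern only
    def __init__(self, rules: Optional[List[Rule]] = None) -> None:
        # Copy the list so later mutation by the caller does not affect the detector.
        self.rules = list(DEFAULT_RULES) if rules is None else list(rules)

    def scan(self, actions: List[Action]) -> List[Tuple[str, str]]:
        hits = []
        for action in actions:
            for rule in self.rules:
                if rule.matches(action):
                    hits.append((rule.name, action["resource"]))
        return hits


actions = [
    {"agent_id": "a1", "action": "read_file", "resource": "/etc/passwd"},
    {"agent_id": "a1", "action": "request_sudo", "resource": "root"},
    {"agent_id": "a1", "action": "modify_config", "resource": "/etc/sudoers"},
]
default_hits = Detector().scan(actions)
custom = Detector(rules=[Rule("passwd-read", "filesystem", lambda a: a["resource"] == "/etc/passwd")])
custom_hits = custom.scan(actions)
```

Passing a custom list replaces the defaults rather than extending them in this sketch; a caller who wants both would pass `DEFAULT_RULES + [...]`.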
analyze(actions: Optional[List[AgentAction]] = None) -> 'EscalationResult'
Run escalation detection on a sequence of agent actions.
Parameters
actions : list[AgentAction], optional
    Pre-recorded actions to analyse. If None, actions are generated based on self.config.
Returns
EscalationResult
    Full analysis result with attempts, chains, and summaries.
Source code in src/replication/escalation.py
EscalationResult (dataclass)
Complete escalation analysis result.
Source code in src/replication/escalation.py
render(width: int = 72) -> str
Human-readable report.
Source code in src/replication/escalation.py
main(argv: Optional[List[str]] = None) -> None
CLI entry point.
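The flags listed in the CLI usage section suggest an `argparse`-style entry point. The sketch below parses those four flags and nothing more. The parser layout, help strings, and `None` defaults are assumptions, and the real `main` may define further options or different defaults.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Parser covering only the flags shown in the CLI usage examples."""
    parser = argparse.ArgumentParser(prog="python -m replication.escalation")
    # None means "use whatever the detector configuration defaults to" (assumption).
    parser.add_argument("--actions", type=int, default=None, help="number of simulated actions")
    parser.add_argument("--strategy", default=None, help="agent strategy, e.g. persistent")
    parser.add_argument("--json", action="store_true", help="emit JSON instead of text")
    parser.add_argument("--export", metavar="PATH", default=None, help="save the full report")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    # argv=None makes argparse read sys.argv[1:], matching the main(argv=None) signature.
    return build_parser().parse_args(argv)


args = parse_args(["--actions", "100", "--strategy", "persistent", "--json"])
defaults = parse_args([])
```

Accepting an explicit `argv` list, as the documented `main` signature does, keeps the entry point testable without touching `sys.argv`.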