Escalation Detection
Detects and analyzes privilege escalation attempts by AI agents — actions that systematically expand an agent's capabilities beyond its authorized scope.
Key Classes
| Class | Description |
|---|---|
| EscalationVector | Types of escalation (PRIVILEGE, RESOURCE, SCOPE, PERSISTENCE, SOCIAL, KNOWLEDGE, PHYSICAL) |
| EscalationSeverity | Severity levels for detected escalation |
| StealthLevel | How covert the escalation attempt is |
| AgentEscalationStrategy | Pattern analysis of an agent's escalation behavior over time |
| ActionCategory | Categorization of agent actions |
| EscalationChain | Linked sequence of escalation steps forming an attack chain |
| EscalationDetector | Main detector; analyzes action traces for escalation patterns |
Usage

    from replication.escalation import EscalationDetector

    detector = EscalationDetector()
    actions = [
        {"agent_id": "a1", "action": "read_file", "resource": "/etc/passwd"},
        {"agent_id": "a1", "action": "request_sudo", "resource": "root"},
        {"agent_id": "a1", "action": "modify_config", "resource": "/etc/sudoers"},
    ]
    report = detector.analyze(actions)
    for chain in report.chains:
        print(f"Chain: {chain.vector} severity={chain.severity}")
        for step in chain.steps:
            print(f"  {step.action} → {step.resource}")
escalation
Agent Privilege Escalation Detector -- identifies agents acquiring capabilities or permissions beyond their authorised scope.
Monitors agent actions for privilege escalation patterns: scope creep in API permissions, unauthorised directory traversal, attempts to spawn elevated processes, network access beyond policy, and incremental authority accumulation that may individually appear benign but collectively constitute a dangerous expansion of agent power.
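Incremental accumulation can be pictured as a running total of small grants checked against a budget. The following is a minimal sketch under stated assumptions: `ScopeGrant`, `first_breach`, the per-grant weights, and both limits are hypothetical and are not taken from `replication.escalation`.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class ScopeGrant:
    """Hypothetical record of one permission an agent obtained."""
    agent_id: str
    permission: str
    weight: float  # assumed risk weight for this single grant


def first_breach(
    grants: Iterable[ScopeGrant],
    per_action_limit: float,
    cumulative_limit: float,
) -> Tuple[List[str], int]:
    """Return (permissions flagged on their own, index of the grant at
    which the running total first exceeds cumulative_limit, or -1)."""
    flagged: List[str] = []
    total = 0.0
    breach_index = -1
    for i, grant in enumerate(grants):
        if grant.weight > per_action_limit:
            flagged.append(grant.permission)
        total += grant.weight
        if breach_index < 0 and total > cumulative_limit:
            breach_index = i
    return flagged, breach_index


# Each grant stays under the per-action limit, yet the total crosses the budget.
grants = [
    ScopeGrant("a1", "read:logs", 0.2),
    ScopeGrant("a1", "read:config", 0.3),
    ScopeGrant("a1", "write:config", 0.4),
    ScopeGrant("a1", "list:users", 0.3),
]
flagged, breach_index = first_breach(grants, per_action_limit=0.5, cumulative_limit=1.0)
```

In this example no single grant is flagged, but the fourth grant pushes the running total past the cumulative limit, which is the "individually benign, collectively dangerous" pattern described above.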
This differs from resource hoarding (hoarding.py), which tracks the quantity of resources. This module tracks scope and authority: an agent that stays within its CPU quota but gains filesystem root access has escalated privilege without hoarding.
It also differs from self-modification (selfmod.py), which tracks agents altering their own code. An agent can escalate privileges without modifying itself, simply by requesting broader permissions through legitimate-looking API calls.
Escalation vectors
- Filesystem -- accessing paths outside the assigned directory, traversal attacks (../), accessing sensitive paths (/etc, credentials files, SSH keys, other agents' directories).
- Network -- connecting to hosts/ports not in the allowlist, DNS resolution of blocked domains, outbound connections to C2-like endpoints, lateral movement to internal services.
- Process -- spawning subprocesses, requesting elevated UIDs, using exec/eval/system calls, forking without approval, modifying environment variables of other processes.
- API -- requesting higher OAuth scopes, using admin endpoints, accessing other agents' resources, token impersonation, permission boundary probing via trial-and-error.
- Data -- reading data classified above the agent's clearance, cross-tenant data access, accessing audit logs to learn about monitoring, reading other agents' conversation history.
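Two of the vectors above, filesystem and network, reduce to simple containment checks. The sketch below shows one possible form. The function names, the example root, and the allowlist are assumptions for illustration and do not reflect the module's built-in rules.

```python
import posixpath
from typing import FrozenSet, Tuple


def escapes_root(assigned_root: str, requested: str) -> bool:
    """True if `requested`, resolved lexically against `assigned_root`,
    lands outside it. `assigned_root` must be an absolute POSIX path.
    Symlinks are not resolved; this is a purely lexical check."""
    root = posixpath.normpath(assigned_root)
    # join() discards `root` when `requested` is absolute; normpath collapses "..".
    target = posixpath.normpath(posixpath.join(root, requested))
    return posixpath.commonpath([root, target]) != root


def connection_allowed(
    host: str, port: int, allowlist: FrozenSet[Tuple[str, int]]
) -> bool:
    """True if the (host, port) pair appears in the allowlist (host compared case-insensitively)."""
    return (host.lower(), port) in allowlist


# Hypothetical policy for agent "a1".
ROOT = "/srv/agents/a1"
ALLOWLIST = frozenset({("api.internal", 443)})

inside = escapes_root(ROOT, "data/out.txt")
traversal = escapes_root(ROOT, "../a2/history.log")
absolute = escapes_root(ROOT, "/etc/passwd")
```

Comparing with `commonpath` rather than a string prefix avoids treating `/srv/agents/a10` as if it were inside `/srv/agents/a1`.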
Each action is scored on severity (how dangerous the escalation is) and stealth (how hard it would be to detect through normal logging). The combination identifies both clumsy escalation attempts and sophisticated privilege creep.
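As a rough illustration of combining the two scores, the sketch below multiplies them into a triage priority and labels the attempt by how visible it is. The enum members, the product formula, and the labels are assumptions. The real `EscalationSeverity` and `StealthLevel` values are not shown in this page.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity scale (not the module's EscalationSeverity)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


class Stealth(IntEnum):
    """Hypothetical stealth scale (not the module's StealthLevel)."""
    OBVIOUS = 1
    SUBTLE = 2
    COVERT = 3


def priority(severity: Severity, stealth: Stealth) -> int:
    """Assumed combination: dangerous and hard-to-see attempts rank highest."""
    return int(severity) * int(stealth)


def label(stealth: Stealth) -> str:
    """Separate loud attempts from quiet ones."""
    return "privilege creep" if stealth >= Stealth.SUBTLE else "clumsy attempt"


loud = priority(Severity.CRITICAL, Stealth.OBVIOUS)
quiet = priority(Severity.MEDIUM, Stealth.COVERT)
```

Under this assumed weighting, a covert medium-severity action outranks an obvious critical one, because normal logging is less likely to surface it.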
Usage (CLI)::

    python -m replication.escalation                         # default scenario
    python -m replication.escalation --actions 100           # 100 simulated actions
    python -m replication.escalation --strategy persistent   # persistent prober
    python -m replication.escalation --json                  # JSON output
    python -m replication.escalation --export report.json    # save full report
Programmatic::

    from replication.escalation import EscalationDetector, EscalationConfig

    detector = EscalationDetector()
    result = detector.analyze()
    print(result.render())
    for attempt in result.attempts:
        print(f"  {attempt.vector}: {attempt.description}")
EscalationVector
EscalationSeverity
StealthLevel
Bases: Enum
How difficult this escalation would be to detect.
Source code in src/replication/escalation.py
AgentEscalationStrategy
Bases: Enum
Behavioral strategy for simulated escalation agents.
ActionCategory
Bases: Enum
Type of action an agent takes.
AgentPermissions (dataclass)
Defines the authorised scope for an agent.
AgentAction (dataclass)
A single action taken by the simulated agent.
EscalationAttempt (dataclass)
A detected privilege escalation attempt.
VectorSummary (dataclass)
Summary statistics for a single escalation vector.
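A per-vector summary can be computed by grouping attempts and aggregating their scores. The sketch below is one way to do that. The `(vector, severity)` tuple shape and the chosen statistics are assumptions, since the fields of `VectorSummary` are not listed here.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple


def summarize(attempts: List[Tuple[str, int]]) -> Dict[str, Dict[str, float]]:
    """Group (vector, severity) pairs by vector and compute simple statistics."""
    by_vector: Dict[str, List[int]] = defaultdict(list)
    for vector, severity in attempts:
        by_vector[vector].append(severity)
    return {
        vector: {
            "count": len(scores),
            "max_severity": max(scores),
            "mean_severity": mean(scores),
        }
        for vector, scores in by_vector.items()
    }


# Hypothetical detected attempts with numeric severities.
summary = summarize([("filesystem", 3), ("filesystem", 1), ("network", 4)])
```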
EscalationChain (dataclass)
A sequence of related escalation attempts that form a multi-step attack.
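One plausible way to link attempts into chains is to group them by agent and join consecutive attempts that occur close together in time. The sketch below assumes that approach. `Attempt`, `build_chains`, and the `max_gap` and `min_length` parameters are hypothetical and may not match how the detector actually relates attempts.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Attempt:
    """Hypothetical minimal attempt record."""
    agent_id: str
    timestamp: float  # seconds
    vector: str


def build_chains(
    attempts: Iterable[Attempt], max_gap: float = 60.0, min_length: int = 2
) -> List[List[Attempt]]:
    """Link each agent's attempts into chains when consecutive attempts
    are at most `max_gap` seconds apart; drop chains shorter than `min_length`."""
    chains: List[List[Attempt]] = []
    open_chain: Dict[str, List[Attempt]] = {}  # agent_id -> chain still being extended
    for attempt in sorted(attempts, key=lambda a: a.timestamp):
        current = open_chain.get(attempt.agent_id)
        if current and attempt.timestamp - current[-1].timestamp <= max_gap:
            current.append(attempt)
        else:
            current = [attempt]
            open_chain[attempt.agent_id] = current
            chains.append(current)
    return [chain for chain in chains if len(chain) >= min_length]


chains = build_chains([
    Attempt("a1", 0.0, "filesystem"),
    Attempt("a2", 5.0, "network"),
    Attempt("a1", 10.0, "process"),
    Attempt("a1", 30.0, "filesystem"),
    Attempt("a1", 500.0, "api"),
])
```

Here the three early attempts by `a1` form one chain, while the isolated attempts are dropped for being shorter than `min_length`.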
EscalationConfig (dataclass)
Configuration for the escalation detector.
DetectionRule (dataclass)
A rule for detecting escalation attempts.
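A detection rule can be modelled as a named predicate over an action. The sketch below takes that view. `SketchRule`, its fields, and the two example rules are assumptions and are not the fields of `DetectionRule` or the contents of `BUILTIN_RULES`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Action = Dict[str, str]  # assumed shape: {"agent_id", "action", "resource"}


@dataclass(frozen=True)
class SketchRule:
    """Hypothetical rule: a name, the vector it covers, and a predicate."""
    name: str
    vector: str
    matches: Callable[[Action], bool]


SKETCH_RULES = [
    SketchRule("sensitive-path", "filesystem",
               lambda a: a.get("resource", "").startswith("/etc/")),
    SketchRule("sudo-request", "process",
               lambda a: a.get("action") == "request_sudo"),
]


def apply_rules(actions: List[Action], rules: List[SketchRule]) -> List[Tuple[str, Action]]:
    """Return (rule name, action) for every rule that fires on every action."""
    return [(rule.name, action)
            for action in actions
            for rule in rules
            if rule.matches(action)]


hits = apply_rules(
    [
        {"agent_id": "a1", "action": "read_file", "resource": "/etc/passwd"},
        {"agent_id": "a1", "action": "request_sudo", "resource": "root"},
        {"agent_id": "a1", "action": "modify_config", "resource": "/etc/sudoers"},
    ],
    SKETCH_RULES,
)
```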
EscalationDetector
Detect and analyze agent privilege escalation attempts.
Parameters
config : EscalationConfig, optional
    Configuration for the detector. Defaults are sensible for a quick analysis.
rules : list[DetectionRule], optional
    Custom detection rules. Defaults to BUILTIN_RULES.
analyze(actions: Optional[List[AgentAction]] = None) -> 'EscalationResult'
Run escalation detection on a sequence of agent actions.
Parameters
actions : list[AgentAction], optional
    Pre-recorded actions to analyse. If None, actions are generated based on self.config.
Returns
EscalationResult
    Full analysis result with attempts, chains, and summaries.
EscalationResult (dataclass)
Complete escalation analysis result.
render(width: int = 72) -> str
Human-readable report.
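To show what a `width` parameter typically controls, the sketch below wraps findings to a fixed column width with the standard `textwrap` module. `render_report` and its layout are assumptions for illustration. The actual output format of `render()` is not documented on this page.

```python
import textwrap
from typing import List, Tuple


def render_report(title: str, findings: List[Tuple[str, str]], width: int = 72) -> str:
    """Build a plain-text report whose lines never exceed `width` columns
    (assuming the title itself is no longer than `width`)."""
    rule = "=" * width
    lines = [rule, title.center(width).rstrip(), rule]
    for vector, description in findings:
        # Continuation lines are indented further so each finding reads as one item.
        lines.extend(textwrap.wrap(
            f"{vector}: {description}",
            width=width,
            initial_indent="  ",
            subsequent_indent="    ",
        ))
    return "\n".join(lines)


report = render_report(
    "Escalation Report",
    [("filesystem", "read of a sensitive path outside the assigned directory"),
     ("process", "request for an elevated UID")],
    width=40,
)
```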
main(argv: Optional[List[str]] = None) -> None
CLI entry point.
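An entry point that accepts `argv` usually hands it to an argument parser so the CLI can be exercised from tests. The sketch below shows how the flags documented earlier could be declared with `argparse`. The flag names come from the CLI usage above, while the types, defaults, help strings, and the `build_parser` and `parse` helpers are assumptions.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags; types and defaults are assumed."""
    parser = argparse.ArgumentParser(prog="python -m replication.escalation")
    parser.add_argument("--actions", type=int, default=None,
                        help="number of simulated actions")
    parser.add_argument("--strategy", default=None,
                        help="agent strategy name, e.g. persistent")
    parser.add_argument("--json", action="store_true",
                        help="emit JSON instead of the text report")
    parser.add_argument("--export", metavar="PATH", default=None,
                        help="write the full report to PATH")
    return parser


def parse(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse `argv`; when it is None, argparse reads sys.argv[1:]."""
    return build_parser().parse_args(argv)


args = parse(["--actions", "100", "--strategy", "persistent", "--json"])
```

Passing an explicit list, as in the last line, keeps the parser testable without touching the process's real command line.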