Game Theory Analyzer¶

Analyses strategic interactions between AI agents using game-theoretic frameworks. Models agent behaviour as games, identifies Nash equilibria, detects collusion and defection patterns, and assesses systemic risk from adversarial strategy combinations.

Quick Start¶

from replication.game_theory import GameTheoryAnalyzer

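analyzer = GameTheoryAnalyzer()
# Record at least one pairwise interaction so there is data to analyze
# (moves may be plain strings: "cooperate" / "defect").
analyzer.record_interaction("agent-a", "agent-b", "cooperate", "defect")
report = analyzer.analyze()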

print(f"Risk score: {report.risk_score}/100")
print(f"Game type: {report.game_type.value}")
for equilibrium in report.equilibria:
    print(f"Nash equilibrium: {equilibrium}")
for alert in report.alerts:
    print(f"Alert [{alert.level.value}]: {alert.message}")

Key Classes¶

GameTheoryAnalyzer — Runs game-theoretic analysis across agent interactions and produces risk assessments.
GameReport — Analysis output returned by analyze(): risk_score (0–100), game_type, equilibria, alerts, strategy_profiles.
GameType — Classification: COOPERATIVE, COMPETITIVE, MIXED_MOTIVE, ZERO_SUM.
AlertLevel — INFO, WARNING, DANGER, CRITICAL.
StrategyProfile — An agent's observed or predicted strategy.

`game_theory` ¶

Agent Game-Theory Analyzer — model inter-agent interactions as strategic games.

In multi-agent replication systems, agents interact repeatedly and can develop cooperative or defective strategies. From a safety perspective, we need to detect when agents:

Collude to circumvent safety constraints (mutual cooperation on unsafe goals)
Free-ride on shared resources while others bear costs (defection in public goods)
Escalate into competitive arms races (mutual defection spirals)
Form stable coalitions that resist oversight (Nash equilibria in unsafe configurations)

The game-theory analyzer records pairwise agent interactions, classifies them into canonical game types, computes equilibria, and detects concerning strategic patterns.

Supported games:

Prisoner's Dilemma (PD): cooperation vs. defection with temptation to defect — detects free-riding and trust breakdown
Stag Hunt (SH): coordination game — detects whether agents converge on risky cooperation or safe defection
Chicken / Hawk-Dove (CH): anti-coordination — detects escalation and brinkmanship between agents
Harmony (HG): dominant cooperation — baseline safe interaction

Usage (CLI):

python -m replication.game_theory                        # analyze default logs
python -m replication.game_theory --agents 5             # simulate 5 agents
python -m replication.game_theory --rounds 100           # 100 interaction rounds
python -m replication.game_theory --strategy tit-for-tat # set default strategy
python -m replication.game_theory --detect collusion     # detect collusion only
python -m replication.game_theory --json                 # JSON output

Programmatic:

from replication.game_theory import GameTheoryAnalyzer, GameConfig

analyzer = GameTheoryAnalyzer(GameConfig(history_limit=500))

# Arguments are (agent_a, agent_b, move_a, move_b).
analyzer.record_interaction("agent-a", "agent-b", "cooperate", "defect")
analyzer.record_interaction("agent-b", "agent-a", "cooperate", "cooperate")

# Analyze everything recorded so far and print the human-readable report.
report = analyzer.analyze()
print(report.render())

`Move` ¶

Bases: str, Enum

Agent action in a two-player game.

`GameType` ¶

Bases: str, Enum

Canonical 2×2 symmetric game classification.

`StrategyType` ¶

Bases: str, Enum

Known agent strategies (detected via pattern analysis).
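The reference does not list the `StrategyType` members or the detection rules, so the following is only a minimal sketch of what pattern-based detection could look like. The function name `guess_strategy` and the string labels it returns are hypothetical and are not part of the library.

```python
from typing import List, Tuple

C, D = "cooperate", "defect"


def guess_strategy(history: List[Tuple[str, str]]) -> str:
    """Guess a strategy label from (my_move, their_move) pairs in round order.

    The labels and rules are illustrative assumptions, not the library's
    StrategyType members.
    """
    if not history:
        return "unknown"
    mine = [my_move for my_move, _ in history]
    if all(m == C for m in mine):
        return "always-cooperate"
    if all(m == D for m in mine):
        return "always-defect"
    # Tit-for-tat: open with cooperation, then copy the opponent's previous move.
    copies_opponent = all(
        history[i][0] == history[i - 1][1] for i in range(1, len(history))
    )
    if mine[0] == C and copies_opponent:
        return "tit-for-tat"
    return "unknown"
```

Note that a tit-for-tat agent facing a pure cooperator is indistinguishable from an always-cooperate agent under these rules, so a real detector would likely need more history or a confidence score.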

`AlertLevel` ¶

Bases: str, Enum

Severity of a detected strategic pattern.

`Payoffs` `dataclass` ¶

Payoff matrix for a 2×2 symmetric game.

Standard notation: T > R > P > S (Prisoner's Dilemma)

R: reward for mutual cooperation
S: sucker's payoff (cooperate vs. defect)
T: temptation to defect (defect vs. cooperate)
P: punishment for mutual defection

`classify() -> GameType` ¶

Classify the payoff matrix into a canonical game type.
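The classification rule itself is not shown in this reference. One common way to separate the four supported games is to compare T with R (is there a temptation to defect against a cooperator?) and P with S (is it better to defect against a defector?). The sketch below assumes that scheme; the function name and the returned labels are hypothetical.

```python
def classify_payoffs(R: float, S: float, T: float, P: float) -> str:
    """Classify a symmetric 2x2 game from its payoff ordering (illustrative)."""
    if T == R or P == S:
        return "other"  # ties fall outside the four canonical orderings
    tempted = T > R  # defecting against a cooperator beats mutual cooperation
    fearful = P > S  # defecting against a defector beats being exploited
    if tempted and fearful:
        # Defection is dominant; a Prisoner's Dilemma also needs R > P.
        return "prisoners_dilemma" if R > P else "other"
    if tempted:
        return "chicken"    # anti-coordination: best to do the opposite
    if fearful:
        return "stag_hunt"  # coordination: best to match the other agent
    return "harmony"        # cooperation is dominant
```

For example, `classify_payoffs(R=3, S=0, T=5, P=1)` matches the T > R > P > S ordering given above and yields `"prisoners_dilemma"`.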

`payoff(my_move: Move, their_move: Move) -> float` ¶

Get the payoff for a player given both moves.
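As a rough illustration of how the four payoffs map onto move pairs, the lookup can be written as a small table keyed by (my move, their move). The class name `SketchPayoffs`, its field defaults, and the `Move` member names are assumptions made for this example; the real dataclass fields are not listed in this reference.

```python
from dataclasses import dataclass
from enum import Enum


class Move(str, Enum):
    # Member names are assumed; the string values match those accepted
    # by record_interaction ("cooperate" / "defect").
    COOPERATE = "cooperate"
    DEFECT = "defect"


@dataclass(frozen=True)
class SketchPayoffs:
    R: float = 3.0  # reward for mutual cooperation
    S: float = 0.0  # sucker's payoff
    T: float = 5.0  # temptation to defect
    P: float = 1.0  # punishment for mutual defection

    def payoff(self, my_move: Move, their_move: Move) -> float:
        """Return the payoff to the player who chose my_move."""
        table = {
            (Move.COOPERATE, Move.COOPERATE): self.R,
            (Move.COOPERATE, Move.DEFECT): self.S,
            (Move.DEFECT, Move.COOPERATE): self.T,
            (Move.DEFECT, Move.DEFECT): self.P,
        }
        return table[(my_move, their_move)]
```

The payoff is always from the first argument's point of view, so swapping the arguments gives the other player's payoff for the same round.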

`nash_equilibria() -> List[Tuple[Move, Move]]` ¶

Find pure-strategy Nash equilibria.
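A pure-strategy Nash equilibrium is a pair of moves in which neither player can gain by unilaterally switching. The sketch below checks all four move pairs of a symmetric 2×2 game by brute force. It uses plain strings instead of `Move`, and the function name `pure_nash` is hypothetical; the library's own implementation may differ, for instance in how it treats ties.

```python
from itertools import product
from typing import List, Tuple

C, D = "cooperate", "defect"


def pure_nash(R: float, S: float, T: float, P: float) -> List[Tuple[str, str]]:
    """Return all pure-strategy Nash equilibria of a symmetric 2x2 game."""
    # u[(mine, theirs)] is the payoff to the player who chose `mine`.
    u = {(C, C): R, (C, D): S, (D, C): T, (D, D): P}

    def is_best_response(mine: str, theirs: str) -> bool:
        return all(u[(mine, theirs)] >= u[(alt, theirs)] for alt in (C, D))

    return [
        (a, b)
        for a, b in product((C, D), repeat=2)
        if is_best_response(a, b) and is_best_response(b, a)
    ]
```

Because the comparison uses `>=`, weak equilibria (where a player is indifferent between moves) are included in the result.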

`mixed_nash() -> Optional[float]` ¶

Compute the mixed-strategy Nash equilibrium probability of cooperating.

Returns None if there is no interior mixed equilibrium (dominant strategy exists).
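For a symmetric 2×2 game, the interior mixed equilibrium follows from an indifference condition: if the opponent cooperates with probability p, cooperating earns pR + (1 − p)S and defecting earns pT + (1 − p)P. Setting the two equal gives p = (P − S) / (R − S − T + P). The sketch below applies that formula and returns `None` when p falls outside the open interval (0, 1); the function name is hypothetical and the library's handling of edge cases is not documented here.

```python
from typing import Optional


def mixed_cooperation_probability(
    R: float, S: float, T: float, P: float
) -> Optional[float]:
    """Probability of cooperating in the interior mixed equilibrium, if any."""
    denom = R - S - T + P
    if denom == 0:
        return None  # the indifference condition has no unique solution
    p = (P - S) / denom
    # Only an interior probability counts as a mixed equilibrium.
    return p if 0.0 < p < 1.0 else None
```

With Chicken payoffs R=3, S=1, T=5, P=0 this gives p = 1/3, while Prisoner's Dilemma payoffs such as R=3, S=0, T=5, P=1 give `None` because defection is dominant.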

`Interaction` `dataclass` ¶

A single recorded pairwise interaction.

`PairStats` `dataclass` ¶

Aggregate statistics for a pair of agents.

`cooperation_rate: float` `property` ¶

Fraction of rounds with mutual cooperation.

`defection_rate: float` `property` ¶

Fraction of rounds with mutual defection.

`exploitation_rate: float` `property` ¶

Fraction of rounds where one agent exploits the other.

`payoff_inequality: float` `property` ¶

Absolute payoff difference normalized by total rounds.
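The four properties above can be illustrated with a small stand-alone function that takes a list of per-round move pairs. This is a sketch based only on the one-line descriptions in this reference; the function name `pair_rates`, the dictionary keys, and the default payoff values are assumptions.

```python
from typing import Dict, List, Tuple

C, D = "cooperate", "defect"


def pair_rates(
    rounds: List[Tuple[str, str]],
    R: float = 3.0, S: float = 0.0, T: float = 5.0, P: float = 1.0,
) -> Dict[str, float]:
    """Aggregate (move_a, move_b) rounds into rates for one agent pair."""
    keys = ("cooperation", "defection", "exploitation", "payoff_inequality")
    n = len(rounds)
    if n == 0:
        return {k: 0.0 for k in keys}  # avoid dividing by zero
    u = {(C, C): R, (C, D): S, (D, C): T, (D, D): P}
    mutual_c = sum(1 for a, b in rounds if a == C and b == C)
    mutual_d = sum(1 for a, b in rounds if a == D and b == D)
    exploit = n - mutual_c - mutual_d  # exactly one agent defected
    total_a = sum(u[(a, b)] for a, b in rounds)
    total_b = sum(u[(b, a)] for a, b in rounds)
    return {
        "cooperation": mutual_c / n,
        "defection": mutual_d / n,
        "exploitation": exploit / n,
        "payoff_inequality": abs(total_a - total_b) / n,
    }
```

By construction the first three rates sum to 1, since every round is mutual cooperation, mutual defection, or one-sided exploitation.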

`StrategyProfile` `dataclass` ¶

Detected strategy for a single agent.

`StrategicAlert` `dataclass` ¶

A safety-relevant strategic pattern detected in agent interactions.

`GameReport` `dataclass` ¶

Complete analysis report.

`render() -> str` ¶

Human-readable multi-line report.

`GameConfig` `dataclass` ¶

Configuration for the game-theory analyzer.

`GameTheoryAnalyzer` ¶

Records inter-agent interactions and analyzes strategic patterns.

`record_interaction(agent_a: str, agent_b: str, move_a: str | Move, move_b: str | Move, metadata: Optional[Dict[str, Any]] = None) -> Interaction` ¶

Record a pairwise interaction between two agents.

`move_a` / `move_b` can be `Move` enums or plain strings ("cooperate" / "defect").
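Because `Move` derives from both `str` and `Enum`, accepting either form can be as simple as passing a string through the enum constructor. The sketch below assumes the member names `COOPERATE` and `DEFECT` and adds whitespace and case normalization; neither detail is confirmed by this reference, and `coerce_move` is a hypothetical helper.

```python
from enum import Enum
from typing import Union


class Move(str, Enum):
    # Member names are assumed; values match the documented strings.
    COOPERATE = "cooperate"
    DEFECT = "defect"


def coerce_move(value: Union[str, Move]) -> Move:
    """Accept a Move or a plain string and return the matching Move."""
    if isinstance(value, Move):
        return value
    # Enum lookup by value raises ValueError for unknown strings.
    return Move(value.strip().lower())
```

Checking for `Move` first matters here: a `Move` member is also a `str`, so testing `isinstance(value, str)` first would not tell the two cases apart.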

`analyze() -> GameReport` ¶

Run full game-theory analysis and return a report.

`simulate(agents: Dict[str, StrategyType], rounds: int = 50, payoffs: Optional[Payoffs] = None) -> GameReport` ¶

Simulate a round-robin tournament between agents with known strategies.

Each agent plays every other agent for `rounds` rounds. Returns the analysis report.
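To make the round-robin structure concrete, here is a self-contained sketch that pairs every two agents once and plays a fixed number of rounds per pair. It does not use the library: strategies are plain functions of the history rather than `StrategyType` values, the result is a dictionary of total payoffs rather than a `GameReport`, and the payoff defaults are assumed.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

C, D = "cooperate", "defect"
History = List[Tuple[str, str]]  # (my_move, their_move) per round


def tit_for_tat(history: History) -> str:
    # Cooperate first, then copy the opponent's previous move.
    return C if not history else history[-1][1]


def always_defect(history: History) -> str:
    return D


def round_robin(
    players: Dict[str, Callable[[History], str]],
    rounds: int = 50,
    R: float = 3.0, S: float = 0.0, T: float = 5.0, P: float = 1.0,
) -> Dict[str, float]:
    """Play every pair of players for `rounds` rounds; return total payoffs."""
    u = {(C, C): R, (C, D): S, (D, C): T, (D, D): P}
    totals = {name: 0.0 for name in players}
    for a, b in combinations(players, 2):
        hist_a: History = []
        hist_b: History = []
        for _ in range(rounds):
            # Both players choose simultaneously from their own view of history.
            move_a = players[a](hist_a)
            move_b = players[b](hist_b)
            totals[a] += u[(move_a, move_b)]
            totals[b] += u[(move_b, move_a)]
            hist_a.append((move_a, move_b))
            hist_b.append((move_b, move_a))
    return totals
```

Each pair starts with an empty history, so strategies such as tit-for-tat do not carry grudges from one opponent to the next.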
