Session Replay

Step through agent sessions event-by-event for debugging, demos, and code reviews.

Module: agentlens.replayer
The replayer reconstructs event timing from a recorded session and supports speed control, filtering, breakpoints, callbacks, and multiple export formats.

Quick Start

from agentlens.replayer import SessionReplayer

replayer = SessionReplayer(session)
replayer.set_speed(2.0)  # 2x playback

for frame in replayer.play():
    print(frame.to_text())

# Print aggregate stats
print(replayer.stats.summary())

Core Concepts

ReplayFrame

Each frame wraps a single AgentEvent with replay metadata:

Field          Type        Description
index          int         Zero-based position in the replay
total          int         Total number of frames
event          AgentEvent  The underlying event
wall_delay_ms  float       Delay since previous frame (speed-adjusted)
elapsed_ms     float       Cumulative elapsed time (original timeline)
is_breakpoint  bool        Whether a breakpoint was triggered
annotations    list[str]   User-added notes for this event
progress       float       0.0–1.0 progress (property)
progress_pct   float       Percentage progress (property)

ReplayStats

Aggregate statistics computed during replay:

Field                 Description
total_events          Total events in the session
played_events         Events actually played (after filtering)
filtered_events       Events excluded by filters
breakpoints_hit       Number of breakpoints triggered
original_duration_ms  Span of original session timeline
replay_duration_ms    Total wall-clock delay at playback speed
event_type_counts     Count of each event type played
models_used           Set of model names encountered
tools_used            Set of tool names encountered

Speed Control

Adjust playback speed to slow down for analysis or speed up for overview:

# Constructor
replayer = SessionReplayer(session, speed=0.5)  # Half speed

# Or set later (chainable)
replayer.set_speed(4.0)  # 4x speed

Speed affects wall_delay_ms on each frame. At 2x speed, delays are halved. The elapsed_ms field always reflects the original timeline.
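
To make the relationship concrete, here is a minimal sketch of how the two timing fields could be derived from raw timestamps. The frame_timings helper and its millisecond-timestamp input are hypothetical and are not part of agentlens; the sketch only mirrors the behavior described above.

```python
def frame_timings(timestamps_ms, speed=1.0):
    """Return (wall_delay_ms, elapsed_ms) pairs for a list of event timestamps.

    Hypothetical helper for illustration; not the agentlens implementation.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    timings = []
    previous = None
    for ts in timestamps_ms:
        # Gap to the previous event on the original timeline (0 for the first event).
        gap = 0.0 if previous is None else float(ts - previous)
        # Elapsed time is measured on the original timeline, so speed is not applied.
        elapsed = float(ts - timestamps_ms[0])
        timings.append((gap / speed, elapsed))
        previous = ts
    return timings


timings = frame_timings([0, 1000, 1500], speed=2.0)
print(timings)  # [(0.0, 0.0), (500.0, 1000.0), (250.0, 1500.0)]
```

At 2x speed the 1000 ms gap becomes a 500 ms delay, while the elapsed value for that frame still reads 1000.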

Filtering

Focus on specific event types with include and exclude filters:

# Include only LLM calls and decisions
replayer.add_filter("llm_call", "decision")

# Exclude errors
replayer.exclude("error")

# Remove a specific filter
replayer.remove_filter("decision")

# Clear all filters
replayer.clear_filters()

Include filters (allowlist) and exclude filters (blocklist) can be combined. When an event type appears in both, exclude takes precedence.
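
The precedence rule can be sketched as a small predicate. This is an illustration only: passes_filters is a hypothetical function, and treating an empty include set as "allow everything" is an assumption, not something stated for agentlens.

```python
def passes_filters(event_type, include=frozenset(), exclude=frozenset()):
    """Decide whether an event type survives the filters.

    Assumed semantics: an empty include set allows every type,
    and exclude always wins over include.
    """
    if event_type in exclude:
        return False
    return not include or event_type in include


include = {"llm_call", "decision", "error"}
exclude = {"error"}
kept = [
    t for t in ["llm_call", "tool_call", "error", "decision"]
    if passes_filters(t, include, exclude)
]
print(kept)  # ['llm_call', 'decision']
```

Here "error" is dropped even though it is on the allowlist, and "tool_call" is dropped because it is not on the allowlist at all.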

Breakpoints

Add conditional breakpoints to pause at interesting events:

# Break on errors
replayer.add_breakpoint(lambda e: e.event_type == "error")

# Break on expensive calls
replayer.add_breakpoint(lambda e: e.tokens_out > 2000)

# Break on specific tools
replayer.add_breakpoint(lambda e: e.tool_call and e.tool_call.tool_name == "execute_code")

for frame in replayer.play():
    print(frame.to_text())
    if frame.is_breakpoint:
        input("Breakpoint hit — press Enter to continue")

# Clear all breakpoints
replayer.clear_breakpoints()
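
Breakpoint predicates read event fields that may be missing or None on some event types. Whether agentlens guards against a predicate that raises is not stated here, so the sketch below wraps predicates explicitly. The hits_breakpoint helper and the SimpleNamespace events are stand-ins for illustration, not agentlens objects.

```python
from types import SimpleNamespace


def hits_breakpoint(event, predicates):
    """Return True if any predicate matches; a predicate that raises counts as no match."""
    for predicate in predicates:
        try:
            if predicate(event):
                return True
        except (AttributeError, TypeError):
            # Field missing or None on this event: treat as "did not match".
            continue
    return False


predicates = [
    lambda e: e.event_type == "error",
    lambda e: e.tokens_out > 2000,
]

# Stand-in events; a real session would supply AgentEvent objects.
events = [
    SimpleNamespace(event_type="llm_call", tokens_out=2500),
    SimpleNamespace(event_type="tool_call", tokens_out=None),
    SimpleNamespace(event_type="error", tokens_out=0),
]
flags = [hits_breakpoint(e, predicates) for e in events]
print(flags)  # [True, False, True]
```

With a real replayer, the wrapped check could be registered as a single breakpoint: replayer.add_breakpoint(lambda e: hits_breakpoint(e, predicates)).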

Callbacks

Register functions that are called for every frame during replay:

# Log to a file
def log_frame(frame):
    with open("replay.log", "a") as f:
        f.write(frame.to_text() + "\n")

replayer.on_frame(log_frame)

# Track progress
replayer.on_frame(lambda f: print(f"Progress: {f.progress_pct:.1f}%"))
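
A callback can also carry state between frames. The sketch below uses a callable class to tally event types. EventTypeCounter is a hypothetical name, and the frames are simple stand-ins that expose only frame.event.event_type.

```python
from collections import Counter
from types import SimpleNamespace


class EventTypeCounter:
    """Callable that tallies event types across the frames it receives."""

    def __init__(self):
        self.counts = Counter()

    def __call__(self, frame):
        self.counts[frame.event.event_type] += 1


counter = EventTypeCounter()

# Stand-in frames; with a real replayer, register it via replayer.on_frame(counter).
for event_type in ["llm_call", "tool_call", "llm_call"]:
    counter(SimpleNamespace(event=SimpleNamespace(event_type=event_type)))

print(dict(counter.counts))  # {'llm_call': 2, 'tool_call': 1}
```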

Annotations

Add notes to specific events before replay:

# Annotate by event ID
replayer.annotate("evt-abc123", "This is where the bug starts")
replayer.annotate("evt-abc123", "Check the tool output here")
replayer.annotate("evt-def456", "Recovery point")

Annotations appear in frame output and exports.

Step-Through Debugging

Step through events one at a time instead of streaming:

replayer = SessionReplayer(session)

# Step forward one frame at a time
frame = replayer.step()
while frame is not None:
    print(frame.to_text())
    frame = replayer.step()

# Seek to a specific position
replayer.seek(10)
frame = replayer.step()  # Gets frame at index 10

# Reset to beginning
replayer.reset()
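
The step, seek, and reset semantics above can be modeled with a simple cursor over a list. StepCursor is a hypothetical stand-in, not agentlens code, and its bounds check in seek is an assumption of this sketch.

```python
class StepCursor:
    """Hypothetical model of the replayer's step position."""

    def __init__(self, frames):
        self._frames = list(frames)
        self._position = 0

    def step(self):
        """Return the frame at the current position and advance, or None at the end."""
        if self._position >= len(self._frames):
            return None
        frame = self._frames[self._position]
        self._position += 1
        return frame

    def seek(self, position):
        """Move the cursor; rejecting out-of-range positions is an assumption."""
        if not 0 <= position <= len(self._frames):
            raise IndexError(f"position {position} out of range")
        self._position = position
        return self  # returning self allows chaining

    def reset(self):
        return self.seek(0)


cursor = StepCursor(["frame-0", "frame-1", "frame-2"])
first = cursor.step()          # "frame-0"
last = cursor.seek(2).step()   # "frame-2"
done = cursor.step()           # None: past the final frame
again = cursor.reset().step()  # "frame-0"
```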

Range Replay

Replay a specific slice of the session:

# Replay events 5 through 15
for frame in replayer.play_range(start=5, end=15):
    print(frame.to_text())

Export Formats

JSON

json_str = replayer.to_json(indent=2)

# Contains: session_id, agent_name, speed, frames[], stats

Plain Text

text = replayer.to_text()
# Human-readable timeline with stats summary

Markdown

md = replayer.to_markdown()
# Formatted table with timeline + stats code block
# Great for pasting into PRs, docs, or team reports
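
The JSON export can be post-processed with the standard json module. In the sketch below, only the top-level keys come from the description above; the sample values and the shape of each frame entry are made up for illustration.

```python
import json

# Stand-in for the string returned by replayer.to_json().
# Top-level keys follow the docs; values and frame contents are invented.
json_str = json.dumps({
    "session_id": "sess-example",
    "agent_name": "example-agent",
    "speed": 2.0,
    "frames": [{"index": 0}, {"index": 1}],
    "stats": {"total_events": 2},
})

export = json.loads(json_str)

# Check that the documented top-level keys are present before relying on them.
expected_keys = {"session_id", "agent_name", "speed", "frames", "stats"}
missing = expected_keys - export.keys()
frame_count = len(export["frames"])
print(f"{export['session_id']}: {frame_count} frames, missing keys: {sorted(missing)}")
```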

Session Comparison

Compare two sessions to find structural differences:

diff = SessionReplayer.diff(session_a, session_b)

print(f"Events: {diff['event_count']['a']} vs {diff['event_count']['b']}")
print(f"Duration: {diff['duration_ms']['a']:.0f}ms vs {diff['duration_ms']['b']:.0f}ms")
print(f"Tokens in: {diff['tokens']['a']['in']} vs {diff['tokens']['b']['in']}")

# Event type distribution
for etype, counts in diff['event_types'].items():
    print(f"  {etype}: {counts['a']} vs {counts['b']}")

Useful for comparing agent runs before and after changes.
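
For a before/after comparison it can help to reduce the event-type distribution to the types that actually changed. The sketch below assumes only the diff['event_types'] shape used in the example above. event_type_deltas is a hypothetical helper, and the sample dictionary is invented.

```python
def event_type_deltas(diff):
    """Return (event_type, b - a) pairs for types whose counts differ, largest change first."""
    deltas = {
        etype: counts["b"] - counts["a"]
        for etype, counts in diff["event_types"].items()
    }
    changed = [(etype, delta) for etype, delta in deltas.items() if delta != 0]
    # Sort by size of change, then by name for a stable order.
    return sorted(changed, key=lambda item: (-abs(item[1]), item[0]))


# Invented sample in the same shape as diff['event_types'].
sample_diff = {
    "event_types": {
        "llm_call": {"a": 4, "b": 6},
        "tool_call": {"a": 3, "b": 3},
        "error": {"a": 0, "b": 1},
    }
}
changes = event_type_deltas(sample_diff)
print(changes)  # [('llm_call', 2), ('error', 1)]
```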

Method Chaining

Most configuration methods return self for fluent chaining:

frames = list(
    SessionReplayer(session)
    .set_speed(2.0)
    .add_filter("llm_call", "tool_call")
    .exclude("error")
    .add_breakpoint(lambda e: e.tokens_out > 1000)
    .annotate("evt-start", "Session begins")
    .on_frame(lambda f: None)  # No-op placeholder; pass a real logging callback here
    .play()
)

Use Cases

Scenario            Approach
Debugging           Add breakpoints on errors, step through with step()
Code review         Export as Markdown, paste into PR description
Demo                Use set_speed(0.5) for slow-motion walkthrough
Regression testing  Use diff() to compare before/after sessions
Cost analysis       Filter to llm_call, check token counts in stats
Incident review     Annotate key events, export JSON for the post-mortem

API Reference

SessionReplayer

Method                    Returns                Description
play()                    Iterator[ReplayFrame]  Stream all frames
play_range(start, end)    Iterator[ReplayFrame]  Stream a slice of frames
step()                    ReplayFrame | None     Advance one frame
seek(position)            self                   Set step position
reset()                   self                   Reset to beginning
set_speed(speed)          self                   Set playback speed multiplier
add_filter(*types)        self                   Include only these event types
exclude(*types)           self                   Exclude these event types
clear_filters()           self                   Remove all filters
add_breakpoint(fn)        self                   Add a conditional breakpoint
clear_breakpoints()       self                   Remove all breakpoints
on_frame(callback)        self                   Register a per-frame callback
annotate(event_id, note)  self                   Add a note to an event
to_json(indent)           str                    Export as JSON
to_text()                 str                    Export as plain text
to_markdown()             str                    Export as Markdown table
diff(a, b)                dict                   Compare two sessions (static)