# Explainability

*Understanding the "why" — not just the "what".*
## Overview
Explainability is AgentLens's core differentiator. While most observability tools show you what happened, AgentLens captures and presents why it happened.
Explanations are generated from two sources:
- **Decision traces** — the `reasoning` field you provide when calling `agentlens.track()`
- **Structural analysis** — the sequence of events, tool calls, and token usage patterns
## Capturing Reasoning

The most valuable explanations come from explicit reasoning. When tracking events, include the `reasoning` parameter:
```python
agentlens.track(
    event_type="llm_call",
    model="gpt-4",
    input_data={"prompt": "Should I search or use cached data?"},
    output_data={"response": "The data is from yesterday, search for fresh results."},
    tokens_in=40,
    tokens_out=15,
    reasoning="User asked about current weather. Cached data is >24h old, so I need fresh search results.",
)
```
This reasoning is stored as a `DecisionTrace` and included in explanations.
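The stored record has roughly the following shape. This is an illustrative sketch based on the fields shown on this page, not the actual definition in `agentlens.models`:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class DecisionTrace:
    """Assumed shape of a stored decision trace.

    Field names mirror the examples on this page; the real model
    in agentlens.models may differ.
    """
    reasoning: str                                    # why the agent acted
    alternatives_considered: List[str] = field(default_factory=list)
    confidence: Optional[float] = None                # e.g. 0.0-1.0, if provided

trace = DecisionTrace(reasoning="Cached data is >24h old, so I need fresh results.")
```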
## SDK-Side Explanations

The `agentlens.explain()` function generates explanations client-side from the in-memory session data:
```python
explanation = agentlens.explain()
print(explanation)
```
Output format (Markdown):
```markdown
## Session Explanation: research-agent-v2

**Session ID:** a1b2c3d4
**Started:** 2026-02-14T10:30:00+00:00
**Status:** active
**Total tokens:** 1550 in / 353 out

### Event Timeline:

1. [10:30:01.234] **llm_call** (model: gpt-4-turbo)
   💡 Reasoning: User asked a factual question. I need to search for up-to-date
   information rather than relying on training data.
   📊 Tokens: 45 in / 22 out
2. [10:30:01.890] **tool_call** → tool: web_search
3. [10:30:02.234] **tool_call** → tool: file_reader
4. [10:30:02.500] **llm_call** (model: gpt-4-turbo)
   💡 Reasoning: I have enough information from the web search and knowledge
   base to provide a comprehensive answer.
   📊 Tokens: 180 in / 95 out
```
## Server-Side Explanations

The backend's `GET /sessions/:id/explain` endpoint generates explanations from stored data:
```bash
curl http://localhost:3000/sessions/abc123/explain
```
The server-side explanation includes:

- **Session header:** agent name, duration, total tokens
- **Step-by-step narrative:** what happened at each event
- **Tool call details:** which tools were called and their inputs/outputs (truncated for readability)
- **Reasoning annotations:** the decision trace for each step
- **Summary statistics:** LLM calls, tool calls, errors
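The same endpoint can be called from Python with the standard library. This is a minimal sketch; `base_url` and the session id are placeholders, and the helper names are ours, not part of the SDK:

```python
import urllib.request

def explanation_url(base_url: str, session_id: str) -> str:
    # Build the URL for the backend's GET /sessions/:id/explain endpoint.
    return f"{base_url.rstrip('/')}/sessions/{session_id}/explain"

def fetch_explanation(base_url: str, session_id: str) -> str:
    # Returns the explanation (Markdown) as text.
    with urllib.request.urlopen(explanation_url(base_url, session_id)) as resp:
        return resp.read().decode("utf-8")

# e.g. print(fetch_explanation("http://localhost:3000", "abc123"))
```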
## Best Practices

### Write useful reasoning

**❌ Bad**

```python
reasoning="Made an LLM call"
```
This just restates the event type. Zero value.
**✅ Good**

```python
reasoning="User asked about current weather. Cached data is stale (>24h), so searching for fresh results."
```
This explains the decision and the context that influenced it.
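A cheap guardrail is to lint reasoning strings before sending them: flag ones that are very short or that merely restate the event type. The helper below is purely illustrative and not part of the SDK:

```python
def is_low_value_reasoning(event_type: str, reasoning: str) -> bool:
    """Heuristic check: True if the reasoning likely adds no value.

    Flags strings that are very short or that just restate the event
    type (e.g. "Made an LLM call" for event_type="llm_call").
    """
    text = reasoning.lower()
    restated = event_type.replace("_", " ").lower()
    return len(text.split()) < 6 or restated in text
```

For example, `is_low_value_reasoning("llm_call", "Made an LLM call")` returns `True`, while the "good" reasoning above passes.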
### Include alternatives

The `DecisionTrace` model supports an `alternatives_considered` field:
```python
from agentlens.models import DecisionTrace

trace = DecisionTrace(
    reasoning="Using GPT-4 for this complex reasoning task.",
    alternatives_considered=[
        "GPT-3.5-turbo (cheaper but less accurate for multi-step reasoning)",
        "Claude-3 Opus (good but higher latency)",
    ],
    confidence=0.85,
)
```
### Track errors with context
```python
agentlens.track(
    event_type="error",
    input_data={"attempted_action": "database query"},
    output_data={"error": "Connection timeout after 30s"},
    reasoning="The database might be under heavy load. Will retry with exponential backoff.",
)
```
## Future: LLM-Powered Explanations
The current explanation engine is rule-based. A planned enhancement is to optionally pass the event timeline through an LLM to generate more natural, contextual explanations. The rule-based engine will remain as the default for speed and cost.