# Trace Sampling & Rate Limiting

Control trace volume in production with composable sampling strategies.
## Overview
The `agentlens.sampling` module provides 7 composable sampling strategies:
| Strategy | Type | Best For |
|---|---|---|
| `ProbabilisticSampler` | Head-based | Simple fixed-rate sampling |
| `RateLimitSampler` | Head-based | Capping trace volume per time window |
| `PrioritySampler` | Head-based | Keeping errors/slow traces, sampling the rest |
| `TailSampler` | Tail-based | Deciding after the trace completes |
| `CompositeSampler` | Meta | Combining strategies with AND/OR logic |
| `AlwaysSampler` | Testing | Development (keep everything) |
| `NeverSampler` | Testing | Testing (drop everything) |
## Quick Start
```python
from agentlens.sampling import PrioritySampler, TraceContext

# Keep all errors and slow traces; sample 20% of the rest
sampler = PrioritySampler(
    error_always_keep=True,
    slow_threshold_ms=5000,
    fallback_rate=0.2,
)

# Build context from your trace data
ctx = TraceContext(
    trace_id="abc123",
    duration_ms=1200,
    has_error=False,
)

decision = sampler.should_sample(ctx)
if decision.sampled:
    send_trace_to_backend(trace)
else:
    # Optionally log the drop reason
    print(f"Dropped: {decision.reason.value}")
```
## TraceContext
Every `should_sample()` call takes a `TraceContext` that carries information about the trace:
| Field | Type | Default | Description |
|---|---|---|---|
| `trace_id` | `str` | `""` | Unique trace identifier (used for deterministic hashing) |
| `session_id` | `str` | `""` | Session this trace belongs to |
| `agent_id` | `str` | `""` | Agent that generated this trace |
| `duration_ms` | `float \| None` | `None` | Total trace duration (`None` if incomplete) |
| `has_error` | `bool` | `False` | Whether the trace contains errors |
| `error_type` | `str \| None` | `None` | Error class name (e.g. `"ValueError"`) |
| `span_count` | `int` | `0` | Number of spans in the trace |
| `event_count` | `int` | `0` | Number of events in the trace |
| `priority` | `int` | `0` | Priority level (higher = more important) |
| `tags` | `dict[str, str]` | `{}` | Key-value tags for tag-based sampling rules |
| `metadata` | `dict[str, Any]` | `{}` | Additional metadata |
## SamplingDecision
Every `should_sample()` call returns a `SamplingDecision`:
```python
decision = sampler.should_sample(ctx)

decision.sampled   # bool - whether to keep the trace
decision.reason    # SamplingReason enum - why
decision.strategy  # str - which sampler made the call
decision.rate      # float - effective sampling rate
decision.metadata  # dict - extra info (roll value, thresholds, etc.)

# SamplingDecision is truthy:
if decision:
    send_trace(trace)
```
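The truthiness shortcut works because the decision object defines `__bool__`. A minimal stand-in sketch (field names follow the attributes above; everything else is simplified and not the actual `SamplingDecision` implementation):

```python
from dataclasses import dataclass, field

@dataclass
class Decision:
    """Simplified stand-in for SamplingDecision."""
    sampled: bool
    reason: str = "probabilistic"
    metadata: dict = field(default_factory=dict)

    def __bool__(self) -> bool:
        # Lets callers write `if decision:` instead of `if decision.sampled:`
        return self.sampled

assert Decision(sampled=True)
assert not Decision(sampled=False)
```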
### SamplingReason values
| Reason | Meaning |
|---|---|
| `PROBABILISTIC` | Random coin flip (accepted or rejected) |
| `RATE_ALLOWED` | Under the rate limit cap |
| `RATE_LIMITED` | Exceeded the rate limit; trace dropped |
| `PRIORITY_ERROR` | Kept because the trace has errors |
| `PRIORITY_SLOW` | Kept because the trace exceeded the slow threshold |
| `PRIORITY_IMPORTANT` | Kept due to high priority or a matching tag |
| `PRIORITY_FALLBACK` | Non-priority trace, sampled at the fallback rate |
| `TAIL_ERROR` | Tail-based: kept for errors |
| `TAIL_SLOW` | Tail-based: kept for slow/complex traces |
| `TAIL_NORMAL` | Tail-based: normal trace, sampled at fallback |
| `COMPOSITE` | Result of composite AND/OR evaluation |
| `FORCED_KEEP` | `AlwaysSampler` forced keep |
| `FORCED_DROP` | `NeverSampler` forced drop |
## ProbabilisticSampler
Simple random sampling at a fixed rate. Supports deterministic hashing so the same `trace_id` always gets the same decision (useful for distributed systems where multiple services must agree on whether to sample a trace).
```python
from agentlens.sampling import ProbabilisticSampler

# Keep 10% of traces (deterministic by default)
sampler = ProbabilisticSampler(rate=0.1)

# Non-deterministic, with a seed for reproducibility
sampler = ProbabilisticSampler(rate=0.1, deterministic=False, seed=42)
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `rate` | `float` | `0.1` | Probability in [0.0, 1.0] of keeping a trace |
| `deterministic` | `bool` | `True` | Use `trace_id` hash for consistent decisions |
| `seed` | `int \| None` | `None` | Random seed for non-deterministic mode |
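Deterministic mode amounts to hashing the trace ID into [0, 1) and comparing against the rate. A conceptually equivalent sketch (the actual hash function agentlens uses internally is an implementation detail; SHA-256 here is an assumption for illustration):

```python
import hashlib

def deterministic_sample(trace_id: str, rate: float) -> bool:
    """Map trace_id to a stable value in [0, 1) and compare to rate.

    Any service running the same function reaches the same decision
    for the same trace_id -- no coordination required.
    """
    digest = hashlib.sha256(trace_id.encode()).digest()
    # Interpret the first 8 bytes as an unsigned integer, normalized to [0, 1).
    roll = int.from_bytes(digest[:8], "big") / 2**64
    return roll < rate

# The same trace_id always yields the same decision:
assert deterministic_sample("abc123", 0.5) == deterministic_sample("abc123", 0.5)
```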
## RateLimitSampler
Caps the number of traces kept per time window, implemented as a sliding window. Thread-safe.
```python
from agentlens.sampling import RateLimitSampler

# Max 100 traces per minute
sampler = RateLimitSampler(max_traces=100, window_seconds=60)

# Check current window usage
print(f"Current: {sampler.current_count()}/{sampler.max_traces}")
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `max_traces` | `int` | `100` | Maximum traces per window (>= 0) |
| `window_seconds` | `float` | `60.0` | Window duration in seconds (> 0) |
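The sliding-window mechanics can be sketched with a deque of admission timestamps: evict entries older than the window, then admit only while the count is under the cap. A simplified, single-threaded illustration (the real sampler adds locking and integrates with `should_sample()`):

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Sketch of a sliding-window cap: at most max_traces per window."""

    def __init__(self, max_traces=100, window_seconds=60.0):
        self.max_traces = max_traces
        self.window_seconds = window_seconds
        self._timestamps = deque()  # admission times, oldest first

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Evict timestamps that have slid out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_traces:
            self._timestamps.append(now)
            return True
        return False

limiter = SlidingWindowLimiter(max_traces=2, window_seconds=60.0)
assert limiter.allow(now=0.0) and limiter.allow(now=1.0)  # first two admitted
assert not limiter.allow(now=2.0)                         # third hits the cap
assert limiter.allow(now=61.0)                            # window slid; admitted again
```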
## PrioritySampler
The recommended production sampler. Evaluates priority rules in order and falls back to probabilistic sampling for "boring" traces:

- Error traces → always keep
- Slow traces (duration > threshold) → always keep
- High-priority traces (priority ≥ threshold) → always keep
- Tag-matched traces → always keep
- Everything else → sample at `fallback_rate`
```python
from agentlens.sampling import PrioritySampler

sampler = PrioritySampler(
    error_always_keep=True,    # never drop error traces
    slow_threshold_ms=3000,    # keep traces > 3s
    priority_threshold=5,      # keep priority >= 5
    fallback_rate=0.1,         # 10% of normal traces
    important_tags={"env": "production", "user": "vip"},
)
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `error_always_keep` | `bool` | `True` | Keep all traces with errors |
| `slow_threshold_ms` | `float \| None` | `5000.0` | Duration threshold (`None` = disabled) |
| `priority_threshold` | `int` | `5` | Minimum priority to always keep |
| `fallback_rate` | `float` | `0.1` | Sampling rate for non-priority traces |
| `important_tags` | `dict \| None` | `None` | Tag key-value pairs that force keep |
| `seed` | `int \| None` | `None` | Random seed for fallback sampling |
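The rule cascade can be pictured as an ordered chain of checks with a random fallback at the end. A simplified stand-in for the decision logic (not the real implementation; whether tag matching is any-match or all-match is an internal detail, and this sketch assumes any-match):

```python
import random

def priority_decide(has_error, duration_ms, priority, tags, *,
                    slow_threshold_ms=5000.0, priority_threshold=5,
                    important_tags=None, fallback_rate=0.1, rng=None):
    """Evaluate the priority rules in order; fall through to random sampling.

    Returns (keep, reason) mirroring the SamplingReason names above.
    """
    if has_error:
        return True, "priority_error"
    if duration_ms is not None and duration_ms > slow_threshold_ms:
        return True, "priority_slow"
    if priority >= priority_threshold:
        return True, "priority_important"
    if important_tags and any(tags.get(k) == v for k, v in important_tags.items()):
        return True, "priority_important"
    rng = rng or random.Random()
    return rng.random() < fallback_rate, "priority_fallback"

# Errors win over everything else; slow traces are kept unconditionally:
assert priority_decide(True, 100.0, 0, {}) == (True, "priority_error")
assert priority_decide(False, 9000.0, 0, {}) == (True, "priority_slow")
```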
## TailSampler
Decides after a trace completes, based on its outcome. This ensures error and slow traces are never missed, but requires buffering traces until completion.
```python
from agentlens.sampling import TailSampler

sampler = TailSampler(
    error_keep=True,
    slow_threshold_ms=5000,
    min_spans=20,         # keep complex traces with many spans
    fallback_rate=0.05,   # only 5% of normal traces
)
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `error_keep` | `bool` | `True` | Keep traces with errors |
| `slow_threshold_ms` | `float` | `5000.0` | Slow trace threshold |
| `min_spans` | `int \| None` | `None` | Keep traces with more spans than this |
| `fallback_rate` | `float` | `0.05` | Rate for normal completed traces |
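The buffering requirement means something upstream must hold spans until the trace is finished, then apply the tail decision and either export or discard the batch. A minimal sketch of that pattern (the `decide` and `export` callables are placeholders; in practice `decide` would wrap `TailSampler.should_sample()` on a completed `TraceContext`):

```python
class TailBuffer:
    """Buffer spans per trace; decide keep/drop only at trace completion."""

    def __init__(self, decide, export):
        self._spans = {}        # trace_id -> list of buffered spans
        self._decide = decide   # callable(spans) -> bool
        self._export = export   # callable(spans), e.g. send to backend

    def on_span(self, trace_id, span):
        self._spans.setdefault(trace_id, []).append(span)

    def on_trace_end(self, trace_id):
        spans = self._spans.pop(trace_id, [])  # release the buffer either way
        keep = self._decide(spans)
        if keep:
            self._export(spans)
        return keep

exported = []
buf = TailBuffer(decide=lambda spans: len(spans) >= 2, export=exported.extend)
buf.on_span("t1", "span-a")
buf.on_span("t1", "span-b")
assert buf.on_trace_end("t1")               # 2 spans -> kept
assert exported == ["span-a", "span-b"]
```

This is where the memory cost noted above comes from: every in-flight trace occupies buffer space until `on_trace_end` fires.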
## CompositeSampler
Chain multiple samplers with AND (`"all"`) or OR (`"any"`) logic to build complex policies.
```python
from agentlens.sampling import (
    CompositeSampler, PrioritySampler, RateLimitSampler
)

# Production: priority-based with a rate cap
# "Keep important traces, but never exceed 200/min total"
sampler = CompositeSampler(
    strategies=[
        PrioritySampler(error_always_keep=True, fallback_rate=0.3),
        RateLimitSampler(max_traces=200, window_seconds=60),
    ],
    mode="all",  # both must accept
)

# Lenient: keep if ANY strategy accepts
# "Keep errors OR anything under the rate limit"
sampler = CompositeSampler(
    strategies=[
        PrioritySampler(error_always_keep=True, fallback_rate=0.0),
        RateLimitSampler(max_traces=50, window_seconds=60),
    ],
    mode="any",  # either can accept
)
```
## Sampler Statistics
Every sampler tracks thread-safe statistics:
```python
stats = sampler.stats    # SamplerStats snapshot

stats.total_decisions    # total calls to should_sample()
stats.sampled_count      # traces kept
stats.dropped_count      # traces dropped
stats.forced_keep        # AlwaysSampler forced keeps
stats.forced_drop        # NeverSampler forced drops
stats.effective_rate     # sampled / total (float)
stats.to_dict()          # serializable dict for monitoring

# Reset counters
sampler.reset_stats()
```
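The `effective_rate` figure is simply kept/total. A small sketch of the counter bookkeeping, e.g. for feeding a metrics system (a simplified stand-in, not the real `SamplerStats`):

```python
from dataclasses import dataclass, asdict

@dataclass
class Stats:
    """Minimal stand-in for SamplerStats counter bookkeeping."""
    total_decisions: int = 0
    sampled_count: int = 0
    dropped_count: int = 0

    def record(self, sampled):
        self.total_decisions += 1
        if sampled:
            self.sampled_count += 1
        else:
            self.dropped_count += 1

    @property
    def effective_rate(self):
        # Guard against division by zero before any decisions are made.
        return self.sampled_count / self.total_decisions if self.total_decisions else 0.0

stats = Stats()
for kept in (True, True, False, True):
    stats.record(kept)
assert stats.effective_rate == 0.75
assert asdict(stats) == {"total_decisions": 4, "sampled_count": 3, "dropped_count": 1}
```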
## Production Patterns
### Development
```python
from agentlens.sampling import AlwaysSampler

sampler = AlwaysSampler()  # keep everything during development
```
### Staging
```python
from agentlens.sampling import PrioritySampler

sampler = PrioritySampler(
    error_always_keep=True,
    slow_threshold_ms=2000,
    fallback_rate=0.5,  # 50% of normal traces
)
```
### Production (Recommended)
```python
from agentlens.sampling import (
    CompositeSampler, PrioritySampler, RateLimitSampler
)

sampler = CompositeSampler(
    strategies=[
        PrioritySampler(
            error_always_keep=True,
            slow_threshold_ms=3000,
            fallback_rate=0.1,
            important_tags={"env": "production"},
        ),
        RateLimitSampler(max_traces=500, window_seconds=60),
    ],
    mode="all",
)
```
### High-Volume Production
```python
from agentlens.sampling import (
    CompositeSampler, TailSampler, RateLimitSampler
)

sampler = CompositeSampler(
    strategies=[
        TailSampler(
            error_keep=True,
            slow_threshold_ms=5000,
            min_spans=50,
            fallback_rate=0.02,  # only 2% of normal
        ),
        RateLimitSampler(max_traces=100, window_seconds=60),
    ],
    mode="all",
)
```
## Head-Based vs Tail-Based Sampling
| Aspect | Head-Based | Tail-Based |
|---|---|---|
| Decision point | Before trace starts | After trace completes |
| Latency impact | None | Requires buffering |
| Error coverage | May miss errors | Never misses errors |
| Memory usage | Low | Higher (buffers traces) |
| Best for | High throughput | Complete visibility |
Tip: Use `PrioritySampler` (head-based) for most cases. Switch to `TailSampler` only when you need guaranteed error capture and can afford the buffering overhead.