# Trace Sampling & Rate Limiting

Control trace volume in production with composable sampling strategies.
## Overview
The `agentlens.sampling` module provides 7 composable sampling strategies:
| Strategy | Type | Best For |
|---|---|---|
| `ProbabilisticSampler` | Head-based | Simple fixed-rate sampling |
| `RateLimitSampler` | Head-based | Capping trace volume per time window |
| `PrioritySampler` | Head-based | Keeping errors/slow traces, sampling the rest |
| `TailSampler` | Tail-based | Deciding after the trace completes |
| `CompositeSampler` | Meta | Combining strategies with AND/OR logic |
| `AlwaysSampler` | Testing | Development (keep everything) |
| `NeverSampler` | Testing | Testing (drop everything) |
## Quick Start
```python
from agentlens.sampling import PrioritySampler, TraceContext

# Keep all errors and slow traces; sample 20% of the rest
sampler = PrioritySampler(
    error_always_keep=True,
    slow_threshold_ms=5000,
    fallback_rate=0.2,
)

# Build context from your trace data
ctx = TraceContext(
    trace_id="abc123",
    duration_ms=1200,
    has_error=False,
)

decision = sampler.should_sample(ctx)
if decision.sampled:
    send_trace_to_backend(trace)
else:
    # Optionally log the drop reason
    print(f"Dropped: {decision.reason.value}")
```
## TraceContext
Every `should_sample()` call takes a `TraceContext` that carries information about the trace:
| Field | Type | Default | Description |
|---|---|---|---|
| `trace_id` | `str` | `""` | Unique trace identifier (used for deterministic hashing) |
| `session_id` | `str` | `""` | Session this trace belongs to |
| `agent_id` | `str` | `""` | Agent that generated this trace |
| `duration_ms` | `float \| None` | `None` | Total trace duration (`None` if incomplete) |
| `has_error` | `bool` | `False` | Whether the trace contains errors |
| `error_type` | `str \| None` | `None` | Error class name (e.g. `"ValueError"`) |
| `span_count` | `int` | `0` | Number of spans in the trace |
| `event_count` | `int` | `0` | Number of events in the trace |
| `priority` | `int` | `0` | Priority level (higher = more important) |
| `tags` | `dict[str, str]` | `{}` | Key-value tags for tag-based sampling rules |
| `metadata` | `dict[str, Any]` | `{}` | Additional metadata |
## SamplingDecision
Every `should_sample()` call returns a `SamplingDecision`:
```python
decision = sampler.should_sample(ctx)

decision.sampled   # bool - whether to keep the trace
decision.reason    # SamplingReason enum - why
decision.strategy  # str - which sampler made the call
decision.rate      # float - effective sampling rate
decision.metadata  # dict - extra info (roll value, thresholds, etc.)

# SamplingDecision is truthy:
if decision:
    send_trace(trace)
```
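The truthiness shortcut works because the decision object defines `__bool__`. A minimal stand-in sketch (field names follow the attributes above; everything else is simplified and not the actual `SamplingDecision` implementation):

```python
from dataclasses import dataclass, field

@dataclass
class Decision:
    """Simplified stand-in for SamplingDecision."""
    sampled: bool
    reason: str = "probabilistic"
    metadata: dict = field(default_factory=dict)

    def __bool__(self) -> bool:
        # Lets callers write `if decision:` instead of `if decision.sampled:`
        return self.sampled

assert Decision(sampled=True)
assert not Decision(sampled=False)
```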
### SamplingReason values
| Reason | Meaning |
|---|---|
| `PROBABILISTIC` | Random coin flip (accepted or rejected) |
| `RATE_ALLOWED` | Under the rate limit cap |
| `RATE_LIMITED` | Exceeded the rate limit; trace dropped |
| `PRIORITY_ERROR` | Kept because the trace has errors |
| `PRIORITY_SLOW` | Kept because the trace exceeded the slow threshold |
| `PRIORITY_IMPORTANT` | Kept due to high priority or a matching tag |
| `PRIORITY_FALLBACK` | Non-priority trace, sampled at the fallback rate |
| `TAIL_ERROR` | Tail-based: kept for errors |
| `TAIL_SLOW` | Tail-based: kept for slow/complex traces |
| `TAIL_NORMAL` | Tail-based: normal trace, sampled at fallback |
| `COMPOSITE` | Result of composite AND/OR evaluation |
| `FORCED_KEEP` | `AlwaysSampler` forced keep |
| `FORCED_DROP` | `NeverSampler` forced drop |
## ProbabilisticSampler
Simple random sampling at a fixed rate. Supports deterministic hashing so the same `trace_id` always gets the same decision (useful for distributed systems where multiple services must agree on whether to sample a trace).
```python
from agentlens.sampling import ProbabilisticSampler

# Keep 10% of traces (deterministic by default)
sampler = ProbabilisticSampler(rate=0.1)

# Non-deterministic, with a seed for reproducibility
sampler = ProbabilisticSampler(rate=0.1, deterministic=False, seed=42)
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `rate` | `float` | `0.1` | Probability in [0.0, 1.0] of keeping a trace |
| `deterministic` | `bool` | `True` | Use `trace_id` hash for consistent decisions |
| `seed` | `int \| None` | `None` | Random seed for non-deterministic mode |
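Deterministic mode amounts to hashing the trace ID into [0, 1) and comparing against the rate. A conceptually equivalent sketch (the actual hash function agentlens uses internally is an implementation detail; SHA-256 here is an assumption for illustration):

```python
import hashlib

def deterministic_sample(trace_id: str, rate: float) -> bool:
    """Map trace_id to a stable value in [0, 1) and compare to rate.

    Any service running the same function reaches the same decision
    for the same trace_id -- no coordination required.
    """
    digest = hashlib.sha256(trace_id.encode()).digest()
    # Interpret the first 8 bytes as an unsigned integer, normalized to [0, 1).
    roll = int.from_bytes(digest[:8], "big") / 2**64
    return roll < rate

# The same trace_id always yields the same decision:
assert deterministic_sample("abc123", 0.5) == deterministic_sample("abc123", 0.5)
```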
## RateLimitSampler
Caps the number of traces kept per time window, implemented as a sliding window. Thread-safe.
```python
from agentlens.sampling import RateLimitSampler

# Max 100 traces per minute
sampler = RateLimitSampler(max_traces=100, window_seconds=60)

# Check current window usage
print(f"Current: {sampler.current_count()}/{sampler.max_traces}")
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `max_traces` | `int` | `100` | Maximum traces per window (>= 0) |
| `window_seconds` | `float` | `60.0` | Window duration in seconds (> 0) |
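The sliding-window mechanics can be sketched with a deque of admission timestamps: evict entries older than the window, then admit only while the count is under the cap. A simplified, single-threaded illustration (the real sampler adds locking and integrates with `should_sample()`):

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Sketch of a sliding-window cap: at most max_traces per window."""

    def __init__(self, max_traces=100, window_seconds=60.0):
        self.max_traces = max_traces
        self.window_seconds = window_seconds
        self._timestamps = deque()  # admission times, oldest first

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Evict timestamps that have slid out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_traces:
            self._timestamps.append(now)
            return True
        return False

limiter = SlidingWindowLimiter(max_traces=2, window_seconds=60.0)
assert limiter.allow(now=0.0) and limiter.allow(now=1.0)  # first two admitted
assert not limiter.allow(now=2.0)                         # third hits the cap
assert limiter.allow(now=61.0)                            # window slid; admitted again
```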
## PrioritySampler
The recommended production sampler. Evaluates priority rules in order and falls back to probabilistic sampling for "boring" traces:

- Error traces → always keep
- Slow traces (duration > threshold) → always keep
- High-priority traces (priority ≥ threshold) → always keep
- Tag-matched traces → always keep
- Everything else → sample at `fallback_rate`
```python
from agentlens.sampling import PrioritySampler

sampler = PrioritySampler(
    error_always_keep=True,    # never drop error traces
    slow_threshold_ms=3000,    # keep traces > 3s
    priority_threshold=5,      # keep priority >= 5
    fallback_rate=0.1,         # 10% of normal traces
    important_tags={"env": "production", "user": "vip"},
)
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `error_always_keep` | `bool` | `True` | Keep all traces with errors |
| `slow_threshold_ms` | `float \| None` | `5000.0` | Duration threshold (`None` = disabled) |
| `priority_threshold` | `int` | `5` | Minimum priority to always keep |
| `fallback_rate` | `float` | `0.1` | Sampling rate for non-priority traces |
| `important_tags` | `dict \| None` | `None` | Tag key-value pairs that force keep |
| `seed` | `int \| None` | `None` | Random seed for fallback sampling |
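The rule cascade can be pictured as an ordered chain of checks with a random fallback at the end. A simplified stand-in for the decision logic (not the real implementation; whether tag matching is any-match or all-match is an internal detail, and this sketch assumes any-match):

```python
import random

def priority_decide(has_error, duration_ms, priority, tags, *,
                    slow_threshold_ms=5000.0, priority_threshold=5,
                    important_tags=None, fallback_rate=0.1, rng=None):
    """Evaluate the priority rules in order; fall through to random sampling.

    Returns (keep, reason) mirroring the SamplingReason names above.
    """
    if has_error:
        return True, "priority_error"
    if duration_ms is not None and duration_ms > slow_threshold_ms:
        return True, "priority_slow"
    if priority >= priority_threshold:
        return True, "priority_important"
    if important_tags and any(tags.get(k) == v for k, v in important_tags.items()):
        return True, "priority_important"
    rng = rng or random.Random()
    return rng.random() < fallback_rate, "priority_fallback"

# Errors win over everything else; slow traces are kept unconditionally:
assert priority_decide(True, 100.0, 0, {}) == (True, "priority_error")
assert priority_decide(False, 9000.0, 0, {}) == (True, "priority_slow")
```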
## TailSampler
Decides after a trace completes, based on its outcome. This ensures error and slow traces are never missed, but requires buffering traces until completion.
```python
from agentlens.sampling import TailSampler

sampler = TailSampler(
    error_keep=True,
    slow_threshold_ms=5000,
    min_spans=20,         # keep complex traces with many spans
    fallback_rate=0.05,   # only 5% of normal traces
)
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `error_keep` | `bool` | `True` | Keep traces with errors |
| `slow_threshold_ms` | `float` | `5000.0` | Slow trace threshold |
| `min_spans` | `int \| None` | `None` | Keep traces with more spans than this |
| `fallback_rate` | `float` | `0.05` | Rate for normal completed traces |
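The buffering requirement means something upstream must hold spans until the trace is finished, then apply the tail decision and either export or discard the batch. A minimal sketch of that pattern (the `decide` and `export` callables are placeholders; in practice `decide` would wrap `TailSampler.should_sample()` on a completed `TraceContext`):

```python
class TailBuffer:
    """Buffer spans per trace; decide keep/drop only at trace completion."""

    def __init__(self, decide, export):
        self._spans = {}        # trace_id -> list of buffered spans
        self._decide = decide   # callable(spans) -> bool
        self._export = export   # callable(spans), e.g. send to backend

    def on_span(self, trace_id, span):
        self._spans.setdefault(trace_id, []).append(span)

    def on_trace_end(self, trace_id):
        spans = self._spans.pop(trace_id, [])  # release the buffer either way
        keep = self._decide(spans)
        if keep:
            self._export(spans)
        return keep

exported = []
buf = TailBuffer(decide=lambda spans: len(spans) >= 2, export=exported.extend)
buf.on_span("t1", "span-a")
buf.on_span("t1", "span-b")
assert buf.on_trace_end("t1")               # 2 spans -> kept
assert exported == ["span-a", "span-b"]
```

This is where the memory cost noted above comes from: every in-flight trace occupies buffer space until `on_trace_end` fires.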
## CompositeSampler
Chain multiple samplers with AND (`"all"`) or OR (`"any"`) logic to build complex policies.
```python
from agentlens.sampling import (
    CompositeSampler, PrioritySampler, RateLimitSampler
)

# Production: priority-based with a rate cap
# "Keep important traces, but never exceed 200/min total"
sampler = CompositeSampler(
    strategies=[
        PrioritySampler(error_always_keep=True, fallback_rate=0.3),
        RateLimitSampler(max_traces=200, window_seconds=60),
    ],
    mode="all",  # both must accept
)

# Lenient: keep if ANY strategy accepts
# "Keep errors OR anything under the rate limit"
sampler = CompositeSampler(
    strategies=[
        PrioritySampler(error_always_keep=True, fallback_rate=0.0),
        RateLimitSampler(max_traces=50, window_seconds=60),
    ],
    mode="any",  # either can accept
)
```
## Sampler Statistics
Every sampler tracks thread-safe statistics:
```python
stats = sampler.stats    # SamplerStats snapshot

stats.total_decisions    # total calls to should_sample()
stats.sampled_count      # traces kept
stats.dropped_count      # traces dropped
stats.forced_keep        # AlwaysSampler forced keeps
stats.forced_drop        # NeverSampler forced drops
stats.effective_rate     # sampled / total (float)
stats.to_dict()          # serializable dict for monitoring

# Reset counters
sampler.reset_stats()
```
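The `effective_rate` figure is simply kept/total. A small sketch of the counter bookkeeping, e.g. for feeding a metrics system (a simplified stand-in, not the real `SamplerStats`):

```python
from dataclasses import dataclass, asdict

@dataclass
class Stats:
    """Minimal stand-in for SamplerStats counter bookkeeping."""
    total_decisions: int = 0
    sampled_count: int = 0
    dropped_count: int = 0

    def record(self, sampled):
        self.total_decisions += 1
        if sampled:
            self.sampled_count += 1
        else:
            self.dropped_count += 1

    @property
    def effective_rate(self):
        # Guard against division by zero before any decisions are made.
        return self.sampled_count / self.total_decisions if self.total_decisions else 0.0

stats = Stats()
for kept in (True, True, False, True):
    stats.record(kept)
assert stats.effective_rate == 0.75
assert asdict(stats) == {"total_decisions": 4, "sampled_count": 3, "dropped_count": 1}
```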
## Production Patterns
### Development
```python
from agentlens.sampling import AlwaysSampler

sampler = AlwaysSampler()  # keep everything during development
```
### Staging
```python
from agentlens.sampling import PrioritySampler

sampler = PrioritySampler(
    error_always_keep=True,
    slow_threshold_ms=2000,
    fallback_rate=0.5,  # 50% of normal traces
)
```
### Production (Recommended)
```python
from agentlens.sampling import (
    CompositeSampler, PrioritySampler, RateLimitSampler
)

sampler = CompositeSampler(
    strategies=[
        PrioritySampler(
            error_always_keep=True,
            slow_threshold_ms=3000,
            fallback_rate=0.1,
            important_tags={"env": "production"},
        ),
        RateLimitSampler(max_traces=500, window_seconds=60),
    ],
    mode="all",
)
```
### High-Volume Production
```python
from agentlens.sampling import (
    CompositeSampler, TailSampler, RateLimitSampler
)

sampler = CompositeSampler(
    strategies=[
        TailSampler(
            error_keep=True,
            slow_threshold_ms=5000,
            min_spans=50,
            fallback_rate=0.02,  # only 2% of normal
        ),
        RateLimitSampler(max_traces=100, window_seconds=60),
    ],
    mode="all",
)
```
## Head-Based vs Tail-Based Sampling
| Aspect | Head-Based | Tail-Based |
|---|---|---|
| Decision point | Before trace starts | After trace completes |
| Latency impact | None | Requires buffering |
| Error coverage | May miss errors | Never misses errors |
| Memory usage | Low | Higher (buffers traces) |
| Best for | High throughput | Complete visibility |
Tip: Use `PrioritySampler` (head-based) for most cases. Switch to `TailSampler` only when you need guaranteed error capture and can afford the buffering overhead.