Trace Sampling & Rate Limiting

Control trace volume in production with composable sampling strategies.

Why sample? In production, capturing every trace creates storage and performance overhead. Sampling lets you keep interesting traces (errors, slow operations) while dropping routine ones, reducing costs by 80-95% with minimal observability loss.

Overview

The agentlens.sampling module provides 7 composable sampling strategies:

| Strategy | Type | Best For |
|---|---|---|
| ProbabilisticSampler | Head-based | Simple fixed-rate sampling |
| RateLimitSampler | Head-based | Capping trace volume per time window |
| PrioritySampler | Head-based | Keeping errors/slow traces, sampling the rest |
| TailSampler | Tail-based | Deciding after the trace completes |
| CompositeSampler | Meta | Combining strategies with AND/OR logic |
| AlwaysSampler | Testing | Development (keep everything) |
| NeverSampler | Testing | Testing (drop everything) |

Quick Start

from agentlens.sampling import PrioritySampler, TraceContext

# Keep all errors and slow traces; sample 20% of the rest
sampler = PrioritySampler(
    error_always_keep=True,
    slow_threshold_ms=5000,
    fallback_rate=0.2,
)

# Build context from your trace data
ctx = TraceContext(
    trace_id="abc123",
    duration_ms=1200,
    has_error=False,
)

decision = sampler.should_sample(ctx)
if decision.sampled:
    send_trace_to_backend(trace)
else:
    # Optionally log the drop reason
    print(f"Dropped: {decision.reason.value}")

TraceContext

Every sampling decision takes a TraceContext that carries information about the trace:

| Field | Type | Default | Description |
|---|---|---|---|
| trace_id | str | "" | Unique trace identifier (used for deterministic hashing) |
| session_id | str | "" | Session this trace belongs to |
| agent_id | str | "" | Agent that generated this trace |
| duration_ms | float \| None | None | Total trace duration (None if incomplete) |
| has_error | bool | False | Whether the trace contains errors |
| error_type | str \| None | None | Error class name (e.g. "ValueError") |
| span_count | int | 0 | Number of spans in the trace |
| event_count | int | 0 | Number of events in the trace |
| priority | int | 0 | Priority level (higher = more important) |
| tags | dict[str, str] | {} | Key-value tags for tag-based sampling rules |
| metadata | dict[str, Any] | {} | Additional metadata |

SamplingDecision

Every should_sample() call returns a SamplingDecision:

decision = sampler.should_sample(ctx)

decision.sampled   # bool  - whether to keep the trace
decision.reason    # SamplingReason enum - why
decision.strategy  # str   - which sampler made the call
decision.rate      # float - effective sampling rate
decision.metadata  # dict  - extra info (roll value, thresholds, etc.)

# SamplingDecision is truthy:
if decision:
    send_trace(trace)

SamplingReason values

| Reason | Meaning |
|---|---|
| PROBABILISTIC | Random coin flip (accepted or rejected) |
| RATE_ALLOWED | Under the rate limit cap |
| RATE_LIMITED | Exceeded rate limit, trace dropped |
| PRIORITY_ERROR | Kept because trace has errors |
| PRIORITY_SLOW | Kept because trace exceeded slow threshold |
| PRIORITY_IMPORTANT | Kept due to high priority or matching tag |
| PRIORITY_FALLBACK | Non-priority trace, sampled at fallback rate |
| TAIL_ERROR | Tail-based: kept for errors |
| TAIL_SLOW | Tail-based: kept for slow/complex traces |
| TAIL_NORMAL | Tail-based: normal trace, sampled at fallback |
| COMPOSITE | Result of composite AND/OR evaluation |
| FORCED_KEEP | AlwaysSampler forced keep |
| FORCED_DROP | NeverSampler forced drop |

ProbabilisticSampler

Simple random sampling at a fixed rate. Supports deterministic hashing so the same trace_id always gets the same decision (useful for distributed systems where multiple services must agree on whether to sample a trace).

from agentlens.sampling import ProbabilisticSampler

# Keep 10% of traces (deterministic by default)
sampler = ProbabilisticSampler(rate=0.1)

# Non-deterministic with seed for reproducibility
sampler = ProbabilisticSampler(rate=0.1, deterministic=False, seed=42)

| Parameter | Type | Default | Description |
|---|---|---|---|
| rate | float | 0.1 | Probability in [0.0, 1.0] of keeping a trace |
| deterministic | bool | True | Use trace_id hash for consistent decisions |
| seed | int \| None | None | Random seed for non-deterministic mode |
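
The deterministic mode can be understood as hashing the trace_id to a stable value in [0, 1) and keeping the trace if that value falls under the rate. This standalone sketch illustrates the idea (it is not agentlens's actual implementation, which may use a different hash):

```python
import hashlib

def deterministic_sample(trace_id: str, rate: float) -> bool:
    """Map trace_id to a stable value in [0, 1); keep if under rate."""
    digest = hashlib.sha256(trace_id.encode()).digest()
    # Interpret the first 8 bytes as an unsigned integer, normalized to [0, 1).
    value = int.from_bytes(digest[:8], "big") / 2**64
    return value < rate
```

Because the hash depends only on the trace_id, every service in a distributed system reaches the same keep/drop decision for the same trace without any coordination.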

RateLimitSampler

Cap the number of traces kept per time window using a sliding window. Thread-safe.

from agentlens.sampling import RateLimitSampler

# Max 100 traces per minute
sampler = RateLimitSampler(max_traces=100, window_seconds=60)

# Check current window usage
print(f"Current: {sampler.current_count()}/{sampler.max_traces}")

| Parameter | Type | Default | Description |
|---|---|---|---|
| max_traces | int | 100 | Maximum traces per window (>= 0) |
| window_seconds | float | 60.0 | Window duration in seconds (> 0) |

PrioritySampler

The recommended production sampler. Evaluates priority rules in order and falls back to probabilistic sampling for "boring" traces:

  1. Error traces → always keep
  2. Slow traces (duration > threshold) → always keep
  3. High-priority traces (priority ≥ threshold) → always keep
  4. Tag-matched traces → always keep
  5. Everything else → sample at fallback_rate

from agentlens.sampling import PrioritySampler

sampler = PrioritySampler(
    error_always_keep=True,        # never drop error traces
    slow_threshold_ms=3000,        # keep traces > 3s
    priority_threshold=5,          # keep priority >= 5
    fallback_rate=0.1,             # 10% of normal traces
    important_tags={"env": "production", "user": "vip"},
)

| Parameter | Type | Default | Description |
|---|---|---|---|
| error_always_keep | bool | True | Keep all traces with errors |
| slow_threshold_ms | float \| None | 5000.0 | Duration threshold (None = disabled) |
| priority_threshold | int | 5 | Minimum priority to always keep |
| fallback_rate | float | 0.1 | Sampling rate for non-priority traces |
| important_tags | dict \| None | None | Tag key-value pairs that force keep |
| seed | int \| None | None | Random seed for fallback sampling |

TailSampler

Decides after a trace completes, based on its outcome. This ensures error and slow traces are never missed, but requires buffering traces until completion.

from agentlens.sampling import TailSampler

sampler = TailSampler(
    error_keep=True,
    slow_threshold_ms=5000,
    min_spans=20,          # keep complex traces with many spans
    fallback_rate=0.05,    # only 5% of normal traces
)

| Parameter | Type | Default | Description |
|---|---|---|---|
| error_keep | bool | True | Keep traces with errors |
| slow_threshold_ms | float | 5000.0 | Slow trace threshold |
| min_spans | int \| None | None | Keep traces with more spans than this |
| fallback_rate | float | 0.05 | Rate for normal completed traces |

CompositeSampler

Chain multiple samplers with AND ("all") or OR ("any") logic to build complex policies.

from agentlens.sampling import (
    CompositeSampler, PrioritySampler, RateLimitSampler
)

# Production: priority-based with rate cap
# "Keep important traces, but never exceed 200/min total"
sampler = CompositeSampler(
    strategies=[
        PrioritySampler(error_always_keep=True, fallback_rate=0.3),
        RateLimitSampler(max_traces=200, window_seconds=60),
    ],
    mode="all",  # both must accept
)

# Lenient: keep if ANY strategy accepts
# "Keep errors OR anything under rate limit"
sampler = CompositeSampler(
    strategies=[
        PrioritySampler(error_always_keep=True, fallback_rate=0.0),
        RateLimitSampler(max_traces=50, window_seconds=60),
    ],
    mode="any",  # either can accept
)
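
At its core, composite evaluation reduces to a boolean all/any over the child decisions. A minimal sketch of that reduction, using plain predicates as stand-ins for strategies (illustrative only, not the CompositeSampler internals):

```python
from typing import Callable

# Toy stand-in for a strategy's should_sample: context in, keep/drop out.
Sampler = Callable[[dict], bool]

def composite(strategies: list[Sampler], mode: str) -> Sampler:
    """Combine strategies: 'all' = every strategy must accept (AND),
    'any' = a single acceptance suffices (OR)."""
    if mode not in ("all", "any"):
        raise ValueError(f"unknown mode: {mode}")
    combine = all if mode == "all" else any
    return lambda ctx: combine(s(ctx) for s in strategies)
```

In "all" mode a stateful strategy such as a rate limiter acts as a hard cap on whatever the other strategies accept; in "any" mode it acts as a floor that keeps a baseline of traces even when the other strategies reject.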

Sampler Statistics

Every sampler tracks thread-safe statistics:

stats = sampler.stats  # SamplerStats snapshot

stats.total_decisions  # total calls to should_sample()
stats.sampled_count    # traces kept
stats.dropped_count    # traces dropped
stats.forced_keep      # AlwaysSampler forced keeps
stats.forced_drop      # NeverSampler forced drops
stats.effective_rate   # sampled / total (float)

stats.to_dict()  # serializable dict for monitoring

# Reset counters
sampler.reset_stats()
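
The effective rate is simply sampled_count / total_decisions. A minimal thread-safe counter illustrating the bookkeeping, including the divide-by-zero guard needed before any decisions have been recorded (a sketch, not agentlens's SamplerStats class):

```python
from threading import Lock

class StatsCounter:
    """Minimal thread-safe decision counter (illustrative)."""

    def __init__(self) -> None:
        self._lock = Lock()
        self.total_decisions = 0
        self.sampled_count = 0
        self.dropped_count = 0

    def record(self, sampled: bool) -> None:
        with self._lock:
            self.total_decisions += 1
            if sampled:
                self.sampled_count += 1
            else:
                self.dropped_count += 1

    @property
    def effective_rate(self) -> float:
        # Guard against division by zero before the first decision.
        return self.sampled_count / self.total_decisions if self.total_decisions else 0.0
```

Exporting these counters to your metrics backend lets you alert when the effective rate drifts far from the configured rate, which usually signals a traffic spike or a surge in error traces.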

Production Patterns

Development

from agentlens.sampling import AlwaysSampler
sampler = AlwaysSampler()  # keep everything during development

Staging

from agentlens.sampling import PrioritySampler
sampler = PrioritySampler(
    error_always_keep=True,
    slow_threshold_ms=2000,
    fallback_rate=0.5,  # 50% of normal traces
)

Production (Recommended)

from agentlens.sampling import (
    CompositeSampler, PrioritySampler, RateLimitSampler
)
sampler = CompositeSampler(
    strategies=[
        PrioritySampler(
            error_always_keep=True,
            slow_threshold_ms=3000,
            fallback_rate=0.1,
            important_tags={"env": "production"},
        ),
        RateLimitSampler(max_traces=500, window_seconds=60),
    ],
    mode="all",
)

High-Volume Production

from agentlens.sampling import (
    CompositeSampler, TailSampler, RateLimitSampler
)
sampler = CompositeSampler(
    strategies=[
        TailSampler(
            error_keep=True,
            slow_threshold_ms=5000,
            min_spans=50,
            fallback_rate=0.02,  # only 2% of normal
        ),
        RateLimitSampler(max_traces=100, window_seconds=60),
    ],
    mode="all",
)

Head-Based vs Tail-Based Sampling

| Aspect | Head-Based | Tail-Based |
|---|---|---|
| Decision point | Before trace starts | After trace completes |
| Latency impact | None | Requires buffering |
| Error coverage | May miss errors | Never misses errors |
| Memory usage | Low | Higher (buffers traces) |
| Best for | High throughput | Complete visibility |

Tip: Use PrioritySampler (head-based) for most cases. Switch to TailSampler only when you need guaranteed error capture and can afford the buffer overhead.