# Cost Optimization

Reduce LLM costs by 30–70% without meaningful quality degradation.

Module: `agentlens.cost_optimizer`
Analyzes agent event patterns and recommends cheaper model alternatives where task complexity doesn’t require expensive models.

## Quick Start

```python
from agentlens.cost_optimizer import CostOptimizer
from agentlens.models import AgentEvent

optimizer = CostOptimizer()

events = [
    AgentEvent(model="gpt-4o", tokens_in=500, tokens_out=100,
               event_type="llm_call"),
    AgentEvent(model="gpt-4-turbo", tokens_in=50, tokens_out=10,
               event_type="classification"),
]

report = optimizer.analyze(events)

print(f"Current cost:  ${report.current_cost_usd:.4f}")
print(f"Optimized:     ${report.optimized_cost_usd:.4f}")
print(f"Savings:       ${report.total_savings_usd:.4f} ({report.total_savings_pct}%)")
```

## How It Works

The optimizer follows a three-step pipeline for every analyzable event:

| Step | What Happens | Key Factors |
| --- | --- | --- |
| 1. Complexity Assessment | Scores each event from 0.0 (trivial) to 1.0 (critical) | Output ratio, token volume, tool calls, decision traces, event type |
| 2. Model Matching | Finds the cheapest model in the recommended tier that fits | Same-provider preference, context window compatibility |
| 3. Savings Validation | Filters out recommendations below the minimum savings threshold | Savings percentage, confidence level, aggressive mode |
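As a sketch, the step-3 filter reduces to a simple threshold check. The function name and logic below are illustrative, not the library's internals; only the `min_savings_pct` default comes from the documented configuration:

```python
def keep_recommendation(current_cost: float, optimized_cost: float,
                        min_savings_pct: float = 10.0) -> bool:
    """Step 3 in miniature: keep a recommendation only when switching
    to the cheaper model clears the minimum savings threshold."""
    if optimized_cost >= current_cost:
        return False  # not actually cheaper; nothing to recommend
    savings_pct = (current_cost - optimized_cost) / current_cost * 100
    return savings_pct >= min_savings_pct
```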

## Complexity Levels

The `ComplexityAnalyzer` maps each event to one of five levels, each associated with a recommended model tier:

| Level | Score Range | Recommended Tier | Typical Tasks |
| --- | --- | --- | --- |
| Trivial | < 0.15 | Economy | Formatting, simple extraction |
| Low | 0.15 – 0.30 | Economy | Classification, simple Q&A |
| Medium | 0.30 – 0.50 | Standard | Summarization, code review |
| High | 0.50 – 0.75 | Premium | Complex reasoning, code gen |
| Critical | ≥ 0.75 | Flagship | Deep research, multi-step planning |
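The mapping can be sketched as a plain threshold ladder. This is a hedged illustration of the documented ranges; how `ComplexityAnalyzer` handles edge cases internally may differ:

```python
def complexity_level(score: float) -> tuple:
    """Map a 0.0-1.0 complexity score to (level, recommended tier),
    following the ranges in the table above."""
    if score < 0.15:
        return ("trivial", "economy")
    if score < 0.30:
        return ("low", "economy")
    if score < 0.50:
        return ("medium", "standard")
    if score < 0.75:
        return ("high", "premium")
    return ("critical", "flagship")
```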

### Complexity Factors

Five weighted factors contribute to the complexity score:

```python
FACTOR_WEIGHTS = {
    "output_ratio":   0.25,   # High output → more generation work
    "token_volume":   0.20,   # Large prompts → more context needed
    "has_tool_call":  0.15,   # Tool usage signals agentic behavior
    "has_decision":   0.20,   # Decision traces → reasoning required
    "event_type":     0.20,   # Some types are inherently complex
}
```
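To make the weighting concrete, here is one way the factors could combine into a single score, assuming each factor has first been normalized to [0, 1]. The real normalization inside `ComplexityAnalyzer` may differ; only the weights come from the source:

```python
# Weights from the table above
FACTOR_WEIGHTS = {
    "output_ratio": 0.25,
    "token_volume": 0.20,
    "has_tool_call": 0.15,
    "has_decision": 0.20,
    "event_type": 0.20,
}

def weighted_score(factors: dict) -> float:
    """Weighted sum of normalized factor values, clamped to [0, 1].
    Missing factors count as 0.0."""
    score = sum(FACTOR_WEIGHTS[name] * factors.get(name, 0.0)
                for name in FACTOR_WEIGHTS)
    return min(max(score, 0.0), 1.0)
```

Since the weights sum to 1.0, an event that maxes out every factor scores exactly 1.0.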

## Model Registry

The built-in registry covers popular models across four tiers:

| Model | Tier | Input $/1M | Output $/1M | Context |
| --- | --- | --- | --- | --- |
| `gpt-4o-mini` | Economy | $0.15 | $0.60 | 128K |
| `gpt-3.5-turbo` | Economy | $0.50 | $1.50 | 16K |
| `claude-3-haiku` | Economy | $0.25 | $1.25 | 200K |
| `gpt-4o` | Standard | $2.50 | $10.00 | 128K |
| `claude-3-sonnet` | Standard | $3.00 | $15.00 | 200K |
| `claude-3.5-sonnet` | Standard | $3.00 | $15.00 | 200K |
| `gpt-4-turbo` | Premium | $10.00 | $30.00 | 128K |
| `gpt-4` | Premium | $30.00 | $60.00 | 8K |
| `claude-3-opus` | Flagship | $15.00 | $75.00 | 200K |
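Per-event cost follows directly from the registry's $/1M rates. A minimal sketch (the helper name is illustrative):

```python
def event_cost_usd(tokens_in: int, tokens_out: int,
                   input_per_1m: float, output_per_1m: float) -> float:
    """Cost of one call from per-million-token rates."""
    return (tokens_in / 1_000_000 * input_per_1m
            + tokens_out / 1_000_000 * output_per_1m)

# The gpt-4o call from the Quick Start: 500 tokens in, 100 tokens out
cost = event_cost_usd(500, 100, 2.50, 10.00)  # about $0.00225
```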

### Custom Models

Register your own models or override pricing:

```python
from agentlens.cost_optimizer import CostOptimizer, ModelInfo, ModelTier

# Via constructor
optimizer = CostOptimizer(custom_models={
    "my-fine-tuned": ModelInfo(
        name="my-fine-tuned",
        tier=ModelTier.ECONOMY,
        input_cost_per_1m=0.10,
        output_cost_per_1m=0.30,
        max_context=32_000,
        strengths=["classification", "extraction"],
    )
})

# Or after construction
optimizer.register_model("llama-3-70b", ModelInfo(
    name="llama-3-70b",
    tier=ModelTier.STANDARD,
    input_cost_per_1m=0.80,
    output_cost_per_1m=0.80,
    max_context=128_000,
))
```

## Optimization Report

The `analyze()` method returns an `OptimizationReport` with these key fields:

| Field | Type | Description |
| --- | --- | --- |
| `total_events` | `int` | Total events analyzed |
| `optimizable_events` | `int` | Events where a cheaper model is recommended |
| `current_cost_usd` | `float` | Total cost at current model selection |
| `optimized_cost_usd` | `float` | Projected cost after optimization |
| `total_savings_usd` | `float` | Dollar savings |
| `total_savings_pct` | `float` | Percentage reduction |
| `recommendations` | `list` | Per-event model change suggestions |
| `model_usage` | `dict` | Count of events per model |
| `tier_distribution` | `dict` | Count of events per tier |
| `migration_plan` | `list` | Phased rollout steps |
| `summary` | `str` | Human-readable summary |

```python
report = optimizer.analyze(events)

# Check if optimizations were found
if report.has_savings:
    print(report.summary)

    # Inspect individual recommendations
    for rec in report.recommendations:
        print(f"  {rec.current_model} → {rec.recommended_model}")
        print(f"  Saves ${rec.estimated_savings_usd:.4f} ({rec.savings_pct}%)")
        print(f"  Confidence: {rec.confidence.value}")
        print(f"  Risk: {rec.risk}")
        print()
```

## Migration Plan

The optimizer generates a phased migration plan grouped by confidence level:

| Phase | Confidence | Risk | Approach |
| --- | --- | --- | --- |
| 1 | High | Low | Quick wins: switch immediately with minimal risk |
| 2 | Medium | Medium | A/B test before full rollout |
| 3 | Low | High | Experimental: requires quality monitoring and rollback |

```python
for step in report.migration_plan:
    print(f"Phase {step.phase}: {step.description}")
    print(f"  Models to change: {step.models_to_change}")
    print(f"  Target: {step.target_model}")
    print(f"  Est. savings: {step.estimated_savings_pct}%")
```

## Quick Estimate

For a fast overview without full recommendations, use `quick_estimate()`:

```python
estimate = optimizer.quick_estimate(events)
print(f"Current cost:        ${estimate['current_cost']:.4f}")
print(f"Potential savings:   ${estimate['potential_savings']:.4f}")
print(f"Savings %:           {estimate['savings_pct']}%")
print(f"Overprovisioned:     {estimate['overprovisioned_count']}/{estimate['total_events']}")
```

## Single-Event Suggestion

Get a model recommendation for a single event:

```python
event = AgentEvent(model="gpt-4-turbo", tokens_in=50, tokens_out=10,
                   event_type="classification")

suggestion = optimizer.suggest_model(event)
if suggestion:
    print(f"Consider using {suggestion} instead of {event.model}")
else:
    print("Current model is appropriate for this task")
```

## Configuration

| Parameter | Default | Description |
| --- | --- | --- |
| `aggressive` | `False` | Include low-confidence recommendations (higher savings, higher risk) |
| `min_savings_pct` | `10.0` | Minimum savings percentage to include a recommendation |
| `custom_models` | `None` | Dict of additional or overridden model definitions |

```python
# Conservative (default) — only high/medium confidence
optimizer = CostOptimizer()

# Aggressive — include all recommendations
optimizer = CostOptimizer(aggressive=True, min_savings_pct=5.0)
```

## Session-Specific Analysis

Analyze events from a specific session:

```python
# Filter events by session ID
report = optimizer.analyze_session_events(all_events, session_id="sess-abc123")
print(f"Session cost: ${report.current_cost_usd:.4f}")
```

## Confidence Levels

Each recommendation carries a confidence assessment based on the complexity score and the tier gap between current and recommended models:

| Confidence | When Assigned | Action |
| --- | --- | --- |
| High | Low complexity + small tier gap (≤ 1) | Safe to switch immediately |
| Medium | Low complexity + larger gap, or medium complexity + small gap | A/B test recommended |
| Low | Higher complexity or large tier gaps | Only included in aggressive mode; monitor carefully |
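The assignment rule can be sketched as follows, where `tier_gap` counts the tiers between the current and recommended model. The thresholds below are assumptions inferred from the table, not the library's internals:

```python
def assign_confidence(score: float, tier_gap: int) -> str:
    """Illustrative confidence rule mirroring the table above."""
    low = score < 0.30      # "low complexity"
    medium = score < 0.50   # "medium complexity"
    if low and tier_gap <= 1:
        return "high"
    if low or (medium and tier_gap <= 1):
        return "medium"
    return "low"
```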

## Best Practices

1. **Start conservative.** Use default settings first. Only enable `aggressive=True` after validating quality.
2. **A/B test medium-confidence changes.** Run 10–20% of traffic through the recommended model and compare output quality.
3. **Update the model registry.** Pricing changes frequently. Register updated costs to get accurate savings estimates.
4. **Classify your event types.** The more specific the `event_type` (e.g., `"classification"` vs `"generic"`), the better the complexity assessment.
5. **Monitor after switching.** Use AgentLens’s evaluation and drift modules to detect quality regressions after model changes.
6. **Review periodically.** Run optimization analysis weekly or after major agent changes to catch new savings opportunities.
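For the A/B-testing practice, a deterministic hash-based split keeps each request pinned to the same arm across retries. This routing helper is not part of AgentLens; it is a minimal sketch with illustrative names:

```python
import hashlib

def pick_model(request_id: str, current: str, candidate: str,
               candidate_share: float = 0.15) -> str:
    """Route candidate_share of traffic to the recommended model,
    keyed on request_id so routing is stable per request."""
    bucket = int(hashlib.sha256(request_id.encode()).hexdigest(), 16) % 100
    return candidate if bucket < candidate_share * 100 else current
```

Keying on the request ID rather than random sampling makes results reproducible when comparing output quality across arms.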

## API Reference

### CostOptimizer

| Method | Returns | Description |
| --- | --- | --- |
| `analyze(events)` | `OptimizationReport` | Full analysis with recommendations and migration plan |
| `analyze_session_events(events, session_id)` | `OptimizationReport` | Analyze events filtered to a specific session |
| `quick_estimate(events)` | `dict` | Fast cost overview without per-event details |
| `suggest_model(event)` | `str \| None` | Single-event model recommendation |
| `register_model(name, info)` | `None` | Add or update a model in the registry |

### ComplexityAnalyzer

| Method | Returns | Description |
| --- | --- | --- |
| `assess(event)` | `ComplexityAssessment` | Score an event’s complexity (0.0–1.0) with level and reasoning |

### Data Classes

| Class | Key Fields |
| --- | --- |
| `ModelInfo` | `name`, `tier`, `input_cost_per_1m`, `output_cost_per_1m`, `max_context`, `strengths` |
| `ComplexityAssessment` | `level`, `score`, `factors`, `recommended_tier`, `reasoning` |
| `Recommendation` | `current_model`, `recommended_model`, `estimated_savings_usd`, `confidence`, `risk` |
| `MigrationStep` | `phase`, `description`, `models_to_change`, `target_model`, `estimated_savings_pct` |
| `OptimizationReport` | `total_events`, `recommendations`, `current_cost_usd`, `total_savings_pct`, `migration_plan` |

### Enums

| Enum | Values |
| --- | --- |
| `ModelTier` | `ECONOMY`, `STANDARD`, `PREMIUM`, `FLAGSHIP` |
| `ComplexityLevel` | `TRIVIAL`, `LOW`, `MEDIUM`, `HIGH`, `CRITICAL` |
| `Confidence` | `HIGH`, `MEDIUM`, `LOW` |