Integrate Narev with LangSmith for LLM Cost Optimization

Import production traces from LangSmith into Narev to test and validate model optimizations. Reduce LLM costs by up to 99% using real production data through systematic A/B testing.

LangSmith shows you what's happening. Narev shows you what to change. LangSmith captures every LLM interaction in production, giving you visibility into costs, latency, and performance. Narev uses those exact traces to test optimizations before you deploy them.

The Problem with Observability Alone

LangSmith is an excellent LLM observability platform—it gives you complete visibility into your production LLM usage. You can see exactly:

  • Which prompts are most expensive
  • Where latency bottlenecks occur
  • Which models you're using and how often
  • Total costs broken down by endpoint, user, or feature
  • Detailed traces of chains, agents, and tools

But observability alone doesn't cut costs: seeing the problem isn't the same as fixing it.

When LangSmith shows you're spending $10,000/month on GPT-4, you're left wondering:

  • Can I switch to a cheaper model without breaking quality?
  • Which of the 400+ available models would work for my specific use case?
  • Will GPT-4o Mini handle my prompts as well as GPT-4?
  • Should I adjust my prompts or change models?

The result? Teams have full observability but still overspend by 10-100x because they lack a systematic way to test alternatives.

How Narev + LangSmith Work Together

Narev and LangSmith are the perfect pairing for LLM optimization:

| Tool | Purpose | What It Tells You |
|---|---|---|
| LangSmith | Monitor production LLM usage | "You're spending $10K/month on GPT-4" |
| Narev | Test alternatives systematically | "Switch to GPT-4o Mini and save $9K/month" |

The workflow:

  1. Monitor production with LangSmith to identify optimization opportunities
  2. Import traces from LangSmith into Narev
  3. Test alternative models, prompts, and parameters with A/B experiments
  4. Deploy validated optimizations to production with confidence
  5. Verify improvements in LangSmith and repeat

Integration Guide

Step 1: Export Production Traces from LangSmith

Narev integrates directly with LangSmith to import your production traces. These traces become the test dataset for your experiments—ensuring you're testing against real-world usage patterns.

To connect LangSmith:

  1. In Narev, go to Import Traces

  2. Select LangSmith as your provider

  3. Enter your LangSmith project credentials:

    • Project Name: Your LangSmith project identifier
    • API Key: Your LangSmith API key (starts with lsv2_pt_...)

  4. Select your date range (default: last 7 days)

  5. Click Save Project to import traces

[Screenshot: Import traces from LangSmith interface]

Narev will import your prompts, model configurations, and usage patterns to create realistic test scenarios.
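Before importing, it can help to preview which runs fall inside the chosen date range. The sketch below is not how Narev performs the import; it is a minimal illustration that assumes the `langsmith` Python package is installed and that your API key is available in the `LANGSMITH_API_KEY` environment variable. The helper names `import_window` and `fetch_llm_runs` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def import_window(days: int = 7, now: Optional[datetime] = None) -> datetime:
    """Return the UTC start time for a window covering the last `days` days."""
    now = now or datetime.now(timezone.utc)
    return now - timedelta(days=days)


def fetch_llm_runs(project_name: str, days: int = 7, limit: int = 100):
    """Preview LLM runs from a LangSmith project.

    Assumption: the `langsmith` package is installed and LANGSMITH_API_KEY
    is set. The import is lazy so `import_window` works without the package.
    """
    from langsmith import Client

    client = Client()
    return list(
        client.list_runs(
            project_name=project_name,
            run_type="llm",
            start_time=import_window(days),
            limit=limit,
        )
    )
```

Calling `fetch_llm_runs("my-project")` (a placeholder project name) returns the most recent LLM runs from the last 7 days, matching the default date range above.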

Step 2: Identify Optimization Opportunities

Use LangSmith to spot areas where optimization would have the biggest impact:

💰 High-Cost Chains

Which chains or agents consume the most tokens? These are prime candidates for model switching.

⚡ Latency Bottlenecks

Where are users waiting? Test faster models to improve response times.

📊 High-Volume Runs

Which runs execute most frequently? Small optimizations here yield big savings.
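As a rough illustration of this triage, the sketch below ranks run names by total cost and also reports how often each one executes. It assumes each run has already been exported to a plain dict with hypothetical `name` and `total_cost` (USD) keys; the sample values are made up.

```python
from collections import defaultdict


def rank_by_cost(runs):
    """Group runs by name and return (name, total_cost, count), costliest first."""
    totals = defaultdict(lambda: {"cost": 0.0, "count": 0})
    for run in runs:
        entry = totals[run["name"]]
        entry["cost"] += run.get("total_cost") or 0.0  # treat missing cost as 0
        entry["count"] += 1
    rows = [(name, t["cost"], t["count"]) for name, t in totals.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical sample data, not real production figures.
sample_runs = [
    {"name": "support_agent", "total_cost": 0.04},
    {"name": "support_agent", "total_cost": 0.06},
    {"name": "summarizer", "total_cost": 0.01},
    {"name": "router", "total_cost": 0.001},
    {"name": "router", "total_cost": 0.001},
    {"name": "router", "total_cost": 0.001},
]

ranking = rank_by_cost(sample_runs)
for name, cost, count in ranking:
    print(f"{name}: ${cost:.3f} across {count} runs")
```

In this made-up sample, `support_agent` dominates cost while `router` dominates volume, which is the distinction between the high-cost and high-volume categories above.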

Step 3: Create Experiments with Real Production Data

Let's say LangSmith shows you're spending heavily on a customer support agent using Claude 3.5 Haiku. Import those traces to Narev and test alternatives:

Create an experiment comparing:

Variant A (Current)

claude-3-5-haiku-20241022
Your production model from LangSmith traces
Avg cost: $35.85/1M requests
Avg latency: 713.4ms
Quality: 60%

Variant B (Test)

gpt-4o-mini
Alternative to test
Projected cost: $18.36/1M requests (49% cheaper)
Latency and quality: to be measured

Narev will run both variants on your actual production prompts from LangSmith and measure:

  • Cost savings in dollars and percentage
  • Latency differences (time to first token, total time)
  • Quality metrics (accuracy, completeness, formatting)
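To make these three measurements concrete, here is a minimal sketch of how per-request results for one variant could be rolled up into the summary figures shown above. The record fields (`cost_usd`, `latency_ms`, `passed`) and the sample values are assumptions for illustration; how Narev actually scores quality is not specified here.

```python
from statistics import mean


def summarize(results):
    """Aggregate per-request results into cost, latency, and quality figures."""
    return {
        # Average per-request cost, scaled to dollars per 1M requests.
        "cost_per_1m_usd": mean(r["cost_usd"] for r in results) * 1_000_000,
        "avg_latency_ms": mean(r["latency_ms"] for r in results),
        # Share of responses that passed the quality check.
        "quality": sum(1 for r in results if r["passed"]) / len(results),
    }


# Hypothetical per-request results for a single variant.
variant_results = [
    {"cost_usd": 0.00004, "latency_ms": 700.0, "passed": True},
    {"cost_usd": 0.00003, "latency_ms": 720.0, "passed": False},
]

summary = summarize(variant_results)
print(summary)
```

Running the same aggregation over both variants on identical prompts is what makes the comparison apples to apples.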

Step 4: Analyze Results with Statistical Confidence

Narev provides clear, data-backed answers:

[Screenshot: Variant comparison showing cost, quality, and latency metrics]

Example results:

  • GPT-4o Mini costs 49% less ($18.36 vs $35.85 per 1M requests)
  • Quality improved by 33% in relative terms (80% vs 60%, a gain of 20 percentage points)
  • Latency improved by 13% (623.4ms vs 713.4ms)
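The percentages in these example results are relative changes against Variant A. A short sketch of the arithmetic, using only the figures quoted above (the function name `relative_change` is our own):

```python
def relative_change(old: float, new: float) -> float:
    """Percentage change from `old` to `new`; negative means a decrease."""
    return (new - old) / old * 100


cost_change = relative_change(35.85, 18.36)     # cost per 1M requests
quality_change = relative_change(60, 80)        # quality score in percent
latency_change = relative_change(713.4, 623.4)  # average latency in ms

print(f"cost: {cost_change:.1f}%")
print(f"quality: {quality_change:.1f}%")
print(f"latency: {latency_change:.1f}%")
```

Rounded to whole numbers, these come out to the 49% cost reduction, 33% quality improvement, and 13% latency improvement listed above.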

Projected savings: Based on your LangSmith volume data, switching to GPT-4o Mini reduces costs by nearly 50% while improving both quality and latency.
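Whether a quality gap such as 80% vs 60% is statistically meaningful depends on how many prompts were evaluated. The sketch below uses a standard two-proportion z-test to illustrate this; it is a generic technique chosen for illustration, not a description of Narev's internal statistics, and the sample sizes are hypothetical.

```python
from math import erf, sqrt


def two_proportion_test(pass_a: int, n_a: int, pass_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in pass rates.

    Assumes the pooled pass rate is strictly between 0 and 1.
    """
    p_a, p_b = pass_a / n_a, pass_b / n_b
    pooled = (pass_a + pass_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF via the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


# Same 60% vs 80% pass rates, at two hypothetical sample sizes.
z_small, p_small = two_proportion_test(12, 20, 16, 20)
z_large, p_large = two_proportion_test(120, 200, 160, 200)

print(f"20 prompts per variant:  p = {p_small:.3f}")
print(f"200 prompts per variant: p = {p_large:.6f}")
```

With 20 prompts per variant the difference is not significant at the 0.05 level, while with 200 prompts per variant it is. This is why importing a sufficiently large set of traces matters.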

Step 5: Deploy and Monitor

With validated results, confidently deploy your optimization:

# Before: Current model from LangSmith traces
from langchain_anthropic import ChatAnthropic
from langsmith import traceable
 
@traceable
def support_agent(message: str) -> str:
    llm = ChatAnthropic(model="claude-3-5-haiku-20241022")  # ← Old model
    response = llm.invoke(message)
    return response.content
 
# After: Switch to validated alternative
# (requires the langchain-openai package and an OPENAI_API_KEY in the environment)
from langchain_openai import ChatOpenAI
 
@traceable
def support_agent(message: str) -> str:
    llm = ChatOpenAI(model="gpt-4o-mini")  # ← Tested winner
    response = llm.invoke(message)
    return response.content

Monitor the impact in LangSmith:

  • Cost reduction appears immediately in your LangSmith dashboards
  • Track quality through user feedback and error rates
  • Compare before/after metrics to validate experiment predictions
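One simple way to compare production metrics against the experiment's predictions is a tolerance check. This is a minimal sketch: the 10% tolerance, the metric names, and the "observed" values are all hypothetical, and the predicted values are the Variant B figures from the example above.

```python
def within_tolerance(predicted: float, observed: float, tolerance: float = 0.10) -> bool:
    """True if `observed` is within `tolerance` (as a fraction) of `predicted`."""
    return abs(observed - predicted) <= tolerance * abs(predicted)


predicted = {"cost_per_1m_usd": 18.36, "avg_latency_ms": 623.4}
observed = {"cost_per_1m_usd": 19.10, "avg_latency_ms": 705.0}  # hypothetical readings

report = {
    metric: within_tolerance(predicted[metric], observed[metric])
    for metric in predicted
}
print(report)
```

In this made-up case, cost lands within 10% of the prediction but latency does not, which would be a cue to investigate before declaring the rollout a success.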

Step 6: Continuous Optimization

Use this workflow continuously:

  1. Weekly: Review LangSmith for new optimization opportunities
  2. Test: Import the highest-cost traces into Narev
  3. Validate: Run experiments on new models or prompt variations
  4. Deploy: Roll out proven optimizations
  5. Repeat: As new models launch or usage patterns change

Why Import from LangSmith?

✅ Test with Real Data

Your LangSmith traces represent actual production usage. Testing on real prompts ensures results translate to production.

✅ Realistic Volume Projections

LangSmith shows request volume. Narev multiplies per-request savings by actual volume for accurate ROI estimates.
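The projection itself is simple multiplication. Here is a sketch using the per-1M-request costs from the example experiment; the monthly volume of 5 million requests is a hypothetical figure, and `monthly_savings` is our own helper name.

```python
def monthly_savings(cost_a_per_1m: float, cost_b_per_1m: float, monthly_requests: int) -> float:
    """Projected monthly savings in USD from switching variant A to variant B."""
    savings_per_request = (cost_a_per_1m - cost_b_per_1m) / 1_000_000
    return savings_per_request * monthly_requests


# $35.85 vs $18.36 per 1M requests, at a hypothetical 5M requests per month.
savings = monthly_savings(35.85, 18.36, 5_000_000)
print(f"Projected savings: ${savings:.2f}/month")
```

Because the estimate scales linearly with volume, the same per-request saving matters far more on high-volume runs.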

✅ Representative Edge Cases

Production traces include the weird prompts, long conversations, and edge cases synthetic tests miss.

✅ Zero Setup Time

If you're already using LangSmith, your test data is ready. No need to create synthetic datasets.

The LangSmith → Narev → Production Loop

Without Narev: Risky Guesswork

  1. LangSmith shows high GPT-4 costs
  2. "Maybe a cheaper model would work?"
  3. Deploy to production and hope
  4. Wait weeks for statistically significant data
  5. Quality issues surface → rollback
  6. Lost time + user complaints 💸

With Narev: Data-Driven Confidence

  1. LangSmith shows high GPT-4 costs
  2. Import traces to Narev
  3. Test alternatives on actual production prompts
  4. Get results in 10 minutes with confidence
  5. Deploy winner ✅
  6. Verify savings in LangSmith 💰

Common LangSmith + Narev Use Cases

🎯 Model Migration

LangSmith shows you're using expensive models. Narev tests which chains can safely switch to a cheaper model such as GPT-4o Mini, lowering costs without sacrificing quality.

⚡ Latency Optimization

LangSmith identifies slow chains. Narev tests faster models while ensuring quality doesn't drop.

💰 Cost Attribution

LangSmith breaks down costs by chain or agent. Narev optimizes each independently based on its specific traces.

🔧 Chain Optimization

LangSmith shows expensive multi-step chains. Narev A/B tests different models for each step on real data.
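Once each step has experiment results, choosing a model per step can be as simple as "cheapest variant that clears the quality bar". The sketch below assumes that selection rule, a 0.75 quality threshold, and entirely hypothetical step names, model names, and figures.

```python
def pick_models(step_results: dict, min_quality: float = 0.75) -> dict:
    """For each step, pick the cheapest model meeting `min_quality`.

    Falls back to the highest-quality model if none meets the bar.
    """
    choices = {}
    for step, variants in step_results.items():
        acceptable = [v for v in variants if v["quality"] >= min_quality]
        if acceptable:
            best = min(acceptable, key=lambda v: v["cost_per_1m_usd"])
        else:
            best = max(variants, key=lambda v: v["quality"])
        choices[step] = best["model"]
    return choices


# Hypothetical experiment results for a two-step chain.
step_results = {
    "classify_intent": [
        {"model": "model-large", "cost_per_1m_usd": 40.0, "quality": 0.92},
        {"model": "model-small", "cost_per_1m_usd": 5.0, "quality": 0.90},
    ],
    "draft_reply": [
        {"model": "model-large", "cost_per_1m_usd": 60.0, "quality": 0.85},
        {"model": "model-small", "cost_per_1m_usd": 8.0, "quality": 0.60},
    ],
}

choices = pick_models(step_results)
print(choices)
```

Here the simple classification step can move to the smaller model while the drafting step stays on the larger one, which is the kind of mixed result per-step testing is meant to surface.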

Getting Started

Step 1: Set Up LangSmith (if not already)

If you're not using LangSmith yet, sign up for free and add the LangSmith SDK to your application for observability.

Step 2: Sign Up for Narev

Sign up for Narev - no credit card required.

Step 3: Connect Your LangSmith Project

Import your traces using your LangSmith project name and API key. Imported traces are available immediately.

Step 4: Run Your First Experiment

Compare your current model from LangSmith against 2-3 cheaper alternatives. Get results in minutes.

Step 5: Deploy and Verify

Update your production code with the winning configuration. Watch savings appear in your LangSmith dashboard.

Start Optimizing Today

Stop wondering if you can reduce costs. Start testing systematically with your real production data.