Routing with the API
Use the Router API to dynamically route requests to different A/B tests.
The Router API provides intelligent request routing based on filters and rules you configure in the Narev UI. Instead of specifying models in your code, you define routing logic that determines which A/B test handles each request.
Quick Start
Replace your OpenAI base URL with your Narev router endpoint:
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_NAREV_API_KEY",
    base_url="https://narev.ai/api/router/{router_id}/v1"
)

# No model needed - the router decides
response = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

That's it! The router evaluates your configured filters and routes the request to the appropriate A/B test.
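Streaming works through the router as well - the stream parameter is passed through to the routed A/B test (see the parameter table further down). A minimal sketch reusing the client above:

stream = client.chat.completions.create(
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    stream=True
)
for chunk in stream:
    # Each chunk carries an incremental piece of the response
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)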
How Routing Works
When a request arrives at the Router API:
- Filter Evaluation - The router evaluates filters in priority order
- A/B Test Selection - The first matching filter determines which A/B test to use
- Production Variant - The selected A/B test's production variant handles the request
- Response - The response is returned with the model identifier used
graph LR
A[Request] --> B[Router]
B --> C{Filter 1}
C -->|Match| D[A/B Test A]
C -->|No Match| E{Filter 2}
E -->|Match| F[A/B Test B]
E -->|No Match| G{Fallback}
G --> H[Default A/B Test]

The router uses the production variant of the matched A/B test. Ensure all A/B tests referenced in your routing rules have a production variant configured.
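Conceptually, routing is a first-match-wins scan over an ordered list of filters. The sketch below is purely illustrative (the predicates and names are hypothetical, not Narev's implementation):

from typing import Callable

# Each filter pairs a predicate with the A/B test it routes to
Filter = tuple[Callable[[dict], bool], str]

def route(request: dict, filters: list, fallback: str) -> str:
    """Return the A/B test of the first filter whose predicate matches."""
    for predicate, ab_test in filters:
        if predicate(request):
            return ab_test
    return fallback  # no filter matched

# Hypothetical configuration mirroring the diagram above
filters = [
    (lambda r: "code" in r["message"].lower(), "A/B Test A"),
    (lambda r: r.get("metadata", {}).get("user_tier") == "premium", "A/B Test B"),
]
print(route({"message": "Write Python code"}, filters, "Default A/B Test"))
# -> "A/B Test A"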
Understanding Filters
Filters are rules that determine which A/B test handles a request. Each filter can evaluate:
1. Message Content
Match based on keywords, patterns, or semantic meaning in user messages:
Examples:
- "Contains 'code' or 'python'" → Code A/B Test
- "Contains 'story' or 'creative'" → Creative Writing A/B Test
- "Question about geography" → Factual A/B Test
2. Message Length
Route based on prompt complexity:
Examples:
- "Less than 50 tokens" → Quick Response A/B Test (GPT-3.5)
- "More than 500 tokens" → Complex A/B Test (GPT-4)
3. Metadata Fields
Route based on custom metadata you include in requests:
Examples:
- user_tier == "premium" → Premium A/B Test
- complexity == "high" → Advanced Model A/B Test
- language == "spanish" → Spanish A/B Test
4. Combination Rules
Create complex logic combining multiple conditions:
Examples:
- "Message contains 'code' AND user_tier is 'premium'" → Premium Code A/B Test
- "Message length > 200 OR complexity is 'high'" → GPT-4 A/B Test
Configuring Routing Rules
Creating a Router
- Go to Routers in the Narev dashboard
- Click New Router
- Give it a descriptive name
- Add filters and assign A/B tests
Filter Priority
Filters are evaluated in order from top to bottom:
Priority 1: Premium users → Premium A/B Test
Priority 2: Code-related → Code A/B Test
Priority 3: Long prompts → GPT-4 A/B Test
Fallback: Default → GPT-3.5 A/B Test
Order filters from most specific to most general. Put narrow filters (like premium users) before broad filters (like content type).
Setting a Fallback
Always configure a fallback A/B test to handle requests that don't match any filter:
# Without fallback: Request fails if no filter matches
# With fallback: Request always succeeds, routed to the default A/B test

Best Practice: Use a reliable, cost-effective model (like GPT-3.5 Turbo) as your fallback.
Request Parameters Ignored
Unlike the Applications API, the Router API ignores certain parameters because they're determined by the routed A/B test's configuration:
| Parameter | Ignored? | Reason |
|---|---|---|
| model | ✅ Yes | Determined by the matched A/B test's production variant |
| temperature | ✅ Yes | Determined by the matched A/B test's production variant |
| top_p | ✅ Yes | Determined by the matched A/B test's production variant |
| max_tokens | ✅ Yes | Determined by the matched A/B test's production variant |
| messages | ❌ No | Used for routing decisions and passed to the A/B test |
| metadata | ❌ No | Used for routing decisions and tracking |
| stream | ❌ No | Honored by the router and the routed A/B test |
# These parameters are IGNORED by the router
response = client.chat.completions.create(
    model="openai:gpt-4",  # ❌ Ignored - A/B test config determines model
    temperature=0.9,       # ❌ Ignored - A/B test config determines temperature
    messages=[{"role": "user", "content": "Hello"}],  # ✅ Used for routing
    extra_body={
        "metadata": {  # ✅ Used for routing decisions
            "user_tier": "premium"
        }
    }
)

Metadata-Based Routing
Include metadata in your requests to enable sophisticated routing logic:
User Segmentation
response = client.chat.completions.create(
    messages=[{"role": "user", "content": query}],
    extra_body={
        "metadata": {
            "user_tier": "premium",
            "user_id": "user_123"
        }
    }
)

Router Configuration:
- Filter: user_tier == "premium" → Premium A/B Test (GPT-4)
- Fallback → Standard A/B Test (GPT-3.5)
Task Complexity
response = client.chat.completions.create(
    messages=[{"role": "user", "content": query}],
    extra_body={
        "metadata": {
            "complexity": "high",
            "estimated_tokens": 1500
        }
    }
)

Router Configuration:
- Filter: complexity == "high" → Complex Tasks A/B Test (GPT-4 Turbo)
- Filter: estimated_tokens > 1000 → Long Context A/B Test (Claude)
- Fallback → Standard A/B Test
Domain-Specific Routing
response = client.chat.completions.create(
    messages=[{"role": "user", "content": query}],
    extra_body={
        "metadata": {
            "domain": "code",
            "language": "python",
            "task_type": "debugging"
        }
    }
)

Router Configuration:
- Filter: domain == "code" → Code A/B Test (GPT-4)
- Filter: domain == "creative" → Creative A/B Test (Claude)
- Filter: domain == "data" → Data Science A/B Test (GPT-4)
- Fallback → General A/B Test
Content-Based Routing
The router can analyze message content to determine routing:
Keyword Matching
# "Write Python code to sort a list"
# Router detects "code" keyword → Routes to Code A/B TestSemantic Understanding
# "I need help debugging my application"
# Router understands this is code-related → Routes to Code A/B TestQuestion Type
# "What is 2+2?"
# Router identifies simple factual question → Routes to Quick Answers A/B TestContent-based routing is configured in the Narev UI using natural language descriptions. The router uses AI to evaluate whether messages match your criteria.
Use Case Examples
1. Cost Optimization
Route simple queries to cheaper models, complex queries to premium models:
Router Configuration:
Filter 1: Message length < 100 tokens → GPT-3.5 Turbo A/B Test
Filter 2: Message length >= 100 tokens → GPT-4 A/B Test
Code (same for all requests):
response = client.chat.completions.create(
    messages=[{"role": "user", "content": user_query}]
)

Cost Savings: roughly 70% reduction is possible when most traffic consists of simple queries served by GPT-3.5.
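Where does a figure like that come from? As a rough illustration, assume GPT-3.5 Turbo costs about a tenth of GPT-4 per token and 80% of traffic is simple: the blended relative cost is 0.8 × 0.1 + 0.2 × 1.0 = 0.28, or roughly 72% savings versus sending everything to GPT-4. Actual savings depend on your traffic mix and current model pricing.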
2. Domain-Specific Routing
Route to specialized A/B tests based on task type:
Router Configuration:
Filter 1: Contains "code" or "programming" → Code Assistant (GPT-4)
Filter 2: Contains "creative" or "story" → Creative Writer (Claude)
Filter 3: Contains "analyze" or "data" → Data Analyst (GPT-4 Turbo)
Fallback: General Assistant (GPT-3.5 Turbo)
Code:
# All handled by the same endpoint
response = client.chat.completions.create(
    messages=[{"role": "user", "content": user_query}]
)

3. Gradual Model Rollout
Test new models with a subset of users:
Router Configuration:
Filter 1: user_id % 10 < 2 → New Model A/B Test (Claude 3.5)
Filter 2: Always → Current Model A/B Test (GPT-4)
Code:
response = client.chat.completions.create(
    messages=[{"role": "user", "content": query}],
    extra_body={
        "metadata": {
            "user_id": int(user_id)
        }
    }
)

Result: 20% of users get the new model; 80% stay on the current model.
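The example above assumes user IDs are numeric. If yours are strings, derive a stable bucket before sending it, for example by hashing (a sketch; the metadata key you send must match whatever your filter expects):

import hashlib

def stable_bucket(user_id: str, buckets: int = 10) -> int:
    """Map a string user ID to a deterministic bucket in [0, buckets)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets

# stable_bucket("user_123") is identical on every run and every machine,
# unlike Python's built-in hash(), which is salted per process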
4. Tiered Service Levels
Provide different service quality based on user tier:
Router Configuration:
Filter 1: user_tier == "enterprise" → Premium A/B Test (GPT-4 Turbo, fast servers)
Filter 2: user_tier == "pro" → Professional A/B Test (GPT-4)
Filter 3: user_tier == "free" → Free A/B Test (GPT-3.5, rate limited)
Code:
response = client.chat.completions.create(
    messages=[{"role": "user", "content": query}],
    extra_body={
        "metadata": {
            "user_tier": user.subscription_tier
        }
    }
)

5. Geographic Routing
Route to region-specific A/B tests for latency optimization:
Router Configuration:
Filter 1: region == "us-east" → US East A/B Test (OpenAI US)
Filter 2: region == "eu-west" → EU West A/B Test (OpenAI EU)
Filter 3: region == "asia" → Asia A/B Test (Azure OpenAI Asia)
Code:
response = client.chat.completions.create(
    messages=[{"role": "user", "content": query}],
    extra_body={
        "metadata": {
            "region": detect_user_region()
        }
    }
)

6. Fallback Chains for Reliability
Create automatic failover between providers:
Router Configuration:
Filter 1: Try primary OpenAI A/B Test
- On failure, automatic retry triggers fallback filter
Filter 2 (Fallback): Anthropic A/B Test
- On failure, triggers second fallback
Filter 3 (Final Fallback): OpenRouter A/B Test
This requires configuring retry logic in your router settings.
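Independent of router-side retries, you can add a thin client-side safety net. A minimal sketch with generic exception handling (it is not specific to Narev's error codes):

import time
from openai import OpenAI, APIError

client = OpenAI(
    api_key="YOUR_NAREV_API_KEY",
    base_url="https://narev.ai/api/router/{router_id}/v1"
)

def create_with_retries(messages: list, attempts: int = 3):
    """Retry transient failures with simple exponential backoff."""
    for attempt in range(attempts):
        try:
            # No model argument, per the router convention above
            return client.chat.completions.create(messages=messages)
        except APIError:
            if attempt == attempts - 1:
                raise  # retries exhausted - surface the error
            time.sleep(2 ** attempt)  # wait 1s, then 2s, between attempts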
Testing Your Routing Logic
Test Suite Approach
Create a test suite to verify routing behavior:
test_cases = [
    {
        "query": "Write Python code to sort a list",
        "metadata": {"user_tier": "free"},
        "expected_app": "Code A/B Test"
    },
    {
        "query": "Tell me a creative story",
        "metadata": {"user_tier": "free"},
        "expected_app": "Creative A/B Test"
    },
    {
        "query": "What is 2+2?",
        "metadata": {"user_tier": "free"},
        "expected_app": "Quick Answers A/B Test"
    },
    {
        "query": "Complex data analysis task",
        "metadata": {"user_tier": "premium"},
        "expected_app": "Premium A/B Test"
    }
]

for test in test_cases:
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": test["query"]}],
        extra_body={"metadata": test["metadata"]}
    )
    # The model field shows which A/B test's production variant was used
    print(f"Query: {test['query']}")
    print(f"Routed to: {response.model}")
    print(f"Expected: {test['expected_app']}")
    print()

Routing Analytics
View routing performance in the Narev dashboard:
- Distribution Charts - Which A/B tests receive what percentage of traffic
- Filter Match Rates - How often each filter matches
- Response Metrics - Latency, cost, and quality by routing rule
- Fallback Usage - How often the fallback is triggered
If your fallback A/B test receives more than 30% of traffic, your filters may be too narrow. Consider adding broader matching rules.
Best Practices
1. Always Configure a Fallback
# ❌ Bad: No fallback
Filter 1: Contains "code" → Code A/B Test
# If no filter matches → Request fails
# ✅ Good: Fallback configured
Filter 1: Contains "code" → Code A/B Test
Fallback: General A/B Test
# All requests succeed

2. Order Filters by Specificity
# ✅ Good Order (specific to general)
Filter 1: user_tier == "enterprise" AND domain == "code" → Premium Code A/B Test
Filter 2: user_tier == "enterprise" → Premium A/B Test
Filter 3: domain == "code" → Code A/B Test
Filter 4: Fallback → General A/B Test
# ❌ Bad Order (general first blocks specific)
Filter 1: domain == "code" → Code A/B Test
Filter 2: user_tier == "enterprise" AND domain == "code" → Premium Code A/B Test
# Filter 2 never matches because Filter 1 catches all code requests first

3. Provide Routing Context in Metadata
# ✅ Good - Rich metadata helps routing
response = client.chat.completions.create(
    messages=[{"role": "user", "content": query}],
    extra_body={
        "metadata": {
            "user_tier": "premium",
            "task_type": "analysis",
            "expected_length": "long",
            "domain": "finance"
        }
    }
)

# ❌ Less effective - Missing context
response = client.chat.completions.create(
    messages=[{"role": "user", "content": query}]
)

4. Use Consistent Metadata Keys
Define a metadata schema and stick to it:
# ✅ Good - Consistent naming
METADATA_SCHEMA = {
    "user_tier": ["free", "pro", "enterprise"],
    "task_type": ["question", "generation", "analysis"],
    "complexity": ["low", "medium", "high"],
    "domain": ["code", "creative", "data", "general"]
}

# Use it consistently
metadata = {
    "user_tier": "pro",
    "task_type": "generation",
    "complexity": "medium",
    "domain": "creative"
}

5. Test Routing Before Production
Deploy routers in stages:
- Development Router - Test with synthetic data
- Staging Router - Test with replica production traffic
- Canary Router - Route 5% of production traffic
- Production Router - Full production rollout
6. Monitor Fallback Usage
High fallback usage indicates:
- Filters too narrow - Add broader matching rules
- Unexpected request patterns - Add filters for new patterns
- Metadata missing - Ensure clients send expected metadata
# Check if the fallback is overused
fallback_percentage = (fallback_requests / total_requests) * 100
if fallback_percentage > 30:
    print("⚠️ Fallback overused - review filter configuration")

7. Handle System Messages Appropriately
Important: The router filters out system messages before routing evaluation. Only user and assistant messages are considered for content-based routing.
If you need system prompts:
- Configure them in the A/B test's production variant
- Don't rely on system messages in requests for routing decisions
# ❌ Bad - System message won't affect routing
response = client.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are a code assistant"},  # Not used for routing
        {"role": "user", "content": "Help me debug"}
    ]
)

# ✅ Good - Use metadata for routing, configure the system prompt in the A/B test
response = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "Help me debug"}
    ],
    extra_body={
        "metadata": {
            "domain": "code"  # This affects routing
        }
    }
)
# Router routes to Code A/B Test, which has the system prompt configured

Debugging Routing Issues
Check Which A/B Test Was Used
The response's model field shows which A/B test handled the request:
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Test query"}]
)
print(f"Model used: {response.model}")
# Output: "openai:gpt-4" (indicates the GPT-4 A/B test was used)

Review Routing History
In the Narev dashboard:
- Go to Routers → Select your router
- Click Analytics tab
- View request history with routing decisions
Each request shows:
- Which filter matched (or if fallback was used)
- Which A/B test handled it
- Response time and cost
- Any errors
Common Issues
Issue: Request routed to fallback unexpectedly
Possible causes:
- Metadata keys don't match filter configuration
- Message content doesn't match filter patterns
- Filter conditions are too strict
Solution:
# Add debug metadata to track routing
response = client.chat.completions.create(
    messages=[{"role": "user", "content": query}],
    extra_body={
        "metadata": {
            "debug_routing": True,
            "expected_filter": "code_filter",
            # ... other metadata
        }
    }
)

Issue: Wrong A/B test handling requests
Possible causes:
- Filter priority order is incorrect
- Multiple filters matching (first match wins)
- Metadata values don't match filter expectations
Solution: Review filter order and test with specific examples.
Issue: No production variant error
Error message:
{
    "error": {
        "message": "A/B test 'Code Helper' does not have a production variant configured",
        "code": "no_production_variant"
    }
}

Solution: Configure a production variant for the A/B test in the Narev dashboard.
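You can also catch this error in code instead of letting it bubble up. A sketch assuming the router surfaces it as a JSON error body through the OpenAI SDK's APIStatusError (the exact payload shape may vary):

import openai

try:
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": query}]
    )
except openai.APIStatusError as e:
    body = e.body if isinstance(e.body, dict) else {}
    error = body.get("error", body)  # tolerate either payload shape
    if isinstance(error, dict) and error.get("code") == "no_production_variant":
        print(f"Routing misconfiguration: {error.get('message')}")  # fix in the dashboard
    else:
        raise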
Advanced Routing Patterns
Time-Based Routing
Route based on time of day to optimize for peak hours:
Router Configuration:
Filter 1: hour >= 9 AND hour <= 17 → Fast Response A/B Test (more instances)
Filter 2: hour < 9 OR hour > 17 → Standard A/B Test (fewer instances)
Code:
from datetime import datetime

response = client.chat.completions.create(
    messages=[{"role": "user", "content": query}],
    extra_body={
        "metadata": {
            "hour": datetime.now().hour
        }
    }
)

Load-Based Routing
Route to different A/B tests based on current load:
Code:
import random

# Distribute load across multiple A/B tests
response = client.chat.completions.create(
    messages=[{"role": "user", "content": query}],
    extra_body={
        "metadata": {
            "load_bucket": random.randint(0, 9)  # 0-9
        }
    }
)

Router Configuration:
Filter 1: load_bucket < 3 → A/B Test Pool A (30%)
Filter 2: load_bucket < 7 → A/B Test Pool B (40%)
Filter 3: Fallback → A/B Test Pool C (30%)
Multi-Step Routing
For complex workflows, chain multiple router calls:
# Step 1: Route to a classifier (a client pointed at a classifier router)
classifier_response = classifier_client.chat.completions.create(
    messages=[{"role": "user", "content": f"Classify this query: {query}"}]
)
classification = classifier_response.choices[0].message.content

# Step 2: Route to a specialized handler based on the classification
response = main_client.chat.completions.create(
    messages=[{"role": "user", "content": query}],
    extra_body={
        "metadata": {
            "classification": classification
        }
    }
)

Migration from Applications API
If you're currently using the Applications API and want to migrate to the Router API:
Before (Applications API)
# Code determines which A/B test to use
if task_type == "code":
    base_url = "https://narev.ai/api/applications/app_code_123/v1"
elif task_type == "creative":
    base_url = "https://narev.ai/api/applications/app_creative_456/v1"
else:
    base_url = "https://narev.ai/api/applications/app_general_789/v1"

client = OpenAI(api_key=api_key, base_url=base_url)
response = client.chat.completions.create(
    model="openai:gpt-4",
    messages=[{"role": "user", "content": query}]
)

After (Router API)
# Single endpoint - the router handles the logic
client = OpenAI(
    api_key=api_key,
    base_url="https://narev.ai/api/router/router_123/v1"
)
response = client.chat.completions.create(
    messages=[{"role": "user", "content": query}],
    extra_body={
        "metadata": {
            "task_type": task_type
        }
    }
)

Benefits:
- Simpler code - no conditional logic
- Centralized routing configuration
- Easy to update routing rules without code changes
- Better analytics and monitoring
Next Steps
- See the Router API Reference for complete endpoint documentation
- Compare with Applications API for explicit model selection
- Configure your first router in the Narev dashboard
- View routing analytics to optimize your filter configuration