Quickstart - Routing
Dynamically sending prompts to cheaper models can significantly reduce cost
Routers vs Gateways
The words router and gateway are often used interchangeably. In the context of cost optimization, it's helpful to understand the difference between the two.
Router
What it does - chooses which model to use for each request
What it does not do - does not provide a unified interface across providers
Examples: Narev, NotDiamond, Martian
Gateway
What it does - provides one interface to many models
What it does not do - does not choose which model to use for your request
Examples: OpenRouter, LiteLLM, Vertex AI, AWS Bedrock
How routing lowers cost
Model sizes refer to the number of parameters in the model. For example, 70B in the name Llama 3.3 70B Instruct refers to 70 billion parameters that this model contains.
The higher the number of parameters, the more expensive the inference is and the higher the cost per token.
Not all tasks are equally difficult. Simple tasks and can be handled by a smaller, cheaper models. Complex ones by a bigger, mode expensive models.
-
Without a router, you choose one model for your application.
-
With router, you can choose multiple models (cheap and expensive) and select the right model for each request.
Research papers like RouteLLM: Learning to Route LLMs with Preference Data (2024) demonstrates how this approach can reduce costs by over 2x without compromising quality.
Quick start: build a router with Narev
Building a router that send requests across all available free models is a low-friction way to get to know the routing platform. This router will use all the free models in the specified order. If one of the models returns an error, the router will use the next model.
Select router type
Select the sequential fallback to create router that uses the models in the order you defined.

Select the free model template
Go to router Settings and select the Free model template

Reorder and delete fallbacks
The router will execute through models from top to bottom. If a model is not available

Use the router in your application
You need router URL and the API key to now use the fallbacks?

Ready to take routers further?
Filter routers allow you to build a router based on predefined rules to optimize for cost.
Still have questions? Ask on Discord
On This Page
