Quickstart - Routing

Dynamically sending prompts to cheaper models can significantly reduce cost

Routers vs Gateways

The words router and gateway are often used interchangeably. In the context of cost optimization, it's helpful to understand the difference between the two.

Router

What it does - chooses which model to use for each request

What it does not do - does not provide a unified interface across providers

Examples: Narev, NotDiamond, Martian

Gateway

What it does - provides one interface to many models

What it does not do - does not choose which model to use for your request

Examples: OpenRouter, LiteLLM, Vertex AI, AWS Bedrock

How routing lowers cost

Model sizes refer to the number of parameters in the model. For example, 70B in the name Llama 3.3 70B Instruct refers to 70 billion parameters that this model contains.

The higher the number of parameters, the more expensive the inference is and the higher the cost per token.

Not all tasks are equally difficult. Simple tasks and can be handled by a smaller, cheaper models. Complex ones by a bigger, mode expensive models.

  • Without a router, you choose one model for your application.

  • With router, you can choose multiple models (cheap and expensive) and select the right model for each request.

Research papers like RouteLLM: Learning to Route LLMs with Preference Data (2024) demonstrates how this approach can reduce costs by over 2x without compromising quality.

Quick start: build a router with Narev

Building a router that send requests across all available free models is a low-friction way to get to know the routing platform. This router will use all the free models in the specified order. If one of the models returns an error, the router will use the next model.

Create a new router

Head to https://www.narev.ai/router and click on Create.

Narev platform with router view and Create button magnified

Select router type

Select the sequential fallback to create router that uses the models in the order you defined.

Narev platform dialog with sequential fallback highlighted

Select the free model template

Go to router Settings and select the Free model template

Narev router setup page with Free Model template annotated

Reorder and delete fallbacks

The router will execute through models from top to bottom. If a model is not available

Narev router fallbacks annotated highlighting reordering of variants

Use the router in your application

You need router URL and the API key to now use the fallbacks?

Narev router overview annotated highlighting the router URL and the API key management

Ready to take routers further?

Filter routers allow you to build a router based on predefined rules to optimize for cost.