Quickstart - Benchmarking

Narev is a platform designed to help you lower your Gen AI costs through benchmarking, routing, observability, and more.

Narev provides the tools, workflows, and infrastructure you need to lower your Generative AI spend, including real-time collaboration on the optimization effort between technical and non-technical members of the team.

Narev helps lower the cost through three levers:

  • Switching models: finding cheaper models with the same performance
  • Routing traffic: dynamically choosing the right model for each request
  • Lowering token usage: reducing the volume of tokens used per request
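The routing lever, for example, amounts to sending each request to the cheapest model that still clears a quality bar. A minimal sketch of the idea, using hypothetical model names, prices, and quality scores (this is not Narev's actual routing logic):

```python
# Hypothetical model catalog: names, prices, and quality scores are illustrative.
MODELS = [
    {"name": "large-model", "price_per_1k_tokens": 0.030, "quality": 0.95},
    {"name": "mid-model",   "price_per_1k_tokens": 0.010, "quality": 0.90},
    {"name": "small-model", "price_per_1k_tokens": 0.002, "quality": 0.80},
]

def route(min_quality: float) -> str:
    """Pick the cheapest model whose quality meets the bar for this request."""
    eligible = [m for m in MODELS if m["quality"] >= min_quality]
    cheapest = min(eligible, key=lambda m: m["price_per_1k_tokens"])
    return cheapest["name"]
```

Requests with a low quality bar fall through to the cheapest model, while demanding requests still reach the strongest one.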

Customizing your journey

Narev is provider-agnostic and supports many integrations. Our ecosystem is modular, helping you at every step of the optimization journey.

Public benchmark hub

I want to quickly see which models are best for my use case


Benchmark models

I want to benchmark models on my production data


Bespoke Router

I want to configure a router that will route traffic to predefined models


Open Source Cost Observability

I want to create a cost observability dashboard for my team


Quick start: add a model variant to the Hub

Interacting with the public Hub is a frictionless way to get to know the benchmarking platform.

About the Hub

Any benchmark run on Narev can be published to the Benchmark Hub. Any visitor can expand a public benchmark by adding a variant: a model together with a system prompt and parameters.

The platform calculates the scores in the background and places the variant on the chart showing:

  • Best in Class models that give the best score for a given price point
  • Overpriced models that underperform for a given price point

View of the benchmark chart showing the best models on the pareto curve in green
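The green "best in class" set on the chart is the Pareto frontier over price and score: variants that no other variant beats on both axes. A minimal sketch of how such a frontier can be computed, using hypothetical variant data:

```python
def pareto_frontier(variants):
    """Return names of variants no other variant dominates
    (lower-or-equal price AND higher-or-equal score, strictly better on one)."""
    frontier = []
    for v in variants:
        dominated = any(
            o["price"] <= v["price"]
            and o["score"] >= v["score"]
            and (o["price"] < v["price"] or o["score"] > v["score"])
            for o in variants
        )
        if not dominated:
            frontier.append(v["name"])
    return frontier

# Hypothetical benchmark results: price per run in USD, score in [0, 1].
variants = [
    {"name": "cheap-ok",      "price": 0.001, "score": 0.70},
    {"name": "mid-strong",    "price": 0.010, "score": 0.92},
    {"name": "pricey-weaker", "price": 0.030, "score": 0.85},  # overpriced
]
```

Here `pricey-weaker` is dominated by `mid-strong` (cheaper and higher-scoring), so it lands off the frontier: the "overpriced" region of the chart.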

Add variant to the Hub

We've set up a small benchmark with 5 prompts from the popular HellaSwag benchmark. Let's walk through adding a variant to it.

Head to the benchmark page and open the variant sheet

Head to the HellaSwag Tutorial, our predefined benchmark.

Click on the Add to benchmark button.

View of the Narev Benchmark Hub with Add to Benchmark button highlighted

Select the model and define the parameters

The sheet lets you select from hundreds of models and define the model variant.

  1. Select a model. Use the search and filters to find it.
  2. Give your variant a descriptive name.
  3. (optional) Define the system prompt.
  4. (optional) Define the model's parameters.
  5. (optional) Preview how your variant performs on the prompts defined in the benchmark.
  6. (optional) Chat with the variant you've defined.

View of the variant selection sheet with annotated places in the UI
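Conceptually, the variant you define in the sheet boils down to a handful of fields. A sketch with illustrative values (these field names and the model identifier are assumptions, not Narev's actual schema):

```python
# Hypothetical representation of a variant as defined in the sheet.
variant = {
    "model": "openai/gpt-4o-mini",          # chosen via search and filters
    "name": "gpt-4o-mini, terse answers",   # descriptive variant name
    "system_prompt": "Answer with the letter of the best completion only.",
    "parameters": {"temperature": 0.0, "max_tokens": 16},
}
```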

Submit the variant to the benchmark

Click on the Add to benchmark button.

In the background, the Narev platform will run all the prompts in the benchmark against your variant and give it a score. The calculated score takes into account the price and correctness of the response.
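The source does not publish the exact scoring formula, but as a rough illustration, a score that rewards correctness and discounts cost could look like the following (the function and its weighting are entirely hypothetical):

```python
def score(correct: int, total: int, cost_usd: float, alpha: float = 0.5) -> float:
    """Illustrative only: accuracy discounted by benchmark cost.
    Not Narev's actual scoring formula."""
    accuracy = correct / total
    return accuracy / (1.0 + alpha * cost_usd)
```

Under a formula like this, two variants with equal accuracy are separated by price: the cheaper one scores higher.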

Your model will appear on the optimization chart and the leaderboard.

View of the leaderboard for the models on the public Benchmark Hub


Take it further

The natural next step is to get familiar with routers.