# Narev Docs

## Docs

- [AI adoption rate for large firms continues to trend down](https://narev.ai/docs/blog/ai-adoption-rate.md): U.S. Census Bureau data shows AI adoption among large firms has continued to decline after peaking in July 2025, while adoption among the smallest firms keeps growing.
- [GPT-3.5 MMLU Score: why it beats GPT-5 at 3% of the Cost](https://narev.ai/docs/blog/gpt35-beats-gpt5.md): Despite its low MMLU score, GPT-3.5 outperformed top 2025 models like GPT-5 and Claude Opus on a real task, costing USD 823 vs USD 30,390 per 1M requests.
- [Blog - Index](https://narev.ai/docs/blog/index.md): Read the latest Narev research, product updates, and analysis on LLM cost optimization, AI FinOps, and the future of AI unit economics.
- [GPT-4o vs Claude Opus - MMLU Scores vs. Actual API Costs](https://narev.ai/docs/blog/mmlu-doesnt-matter.md): MMLU scores for GPT-4o vary by 13 points, while top models differ by 1%. The real difference? Massive cost disparities for the same performance.
- [Why Narev AI Exists - Two futures for AI economics](https://narev.ai/docs/blog/why-we-launch.md): There's a world where AI gets infinitely cheaper, and another where users need to choose the best tool for every job. Narev is building for the latter.
- [DeepSeek Usage-Based Billing with Vercel AI SDK](https://narev.ai/docs/guides/deepseek-usage-based-billing.md): Track DeepSeek model usage and calculate costs in a Vercel AI SDK app with the @ai-billing/deepseek middleware to power usage-based billing flows.
- [Calculate DeepSeek V4 model pricing with the Narev API](https://narev.ai/docs/guides/deepseek-v4-pricing.md): Step-by-step guide to looking up DeepSeek V4 token prices and calculating per-request costs through the Narev Cloud pricing API in Python and TypeScript.
- [FinOps for AI Framework](https://narev.ai/docs/guides/finops-for-ai/index.md): A practical three-step FinOps framework to measure, track, and optimize LLM spending in production without compromising quality or user experience.
- [Step 1: know your objective and who's in charge](https://narev.ai/docs/guides/finops-for-ai/step-1.md): Step 1 of Narev's FinOps for AI framework: define success metrics, align stakeholders, and pick the decision-maker before optimizing LLM spend.
- [Step 2: know what you're spending on](https://narev.ai/docs/guides/finops-for-ai/step-2.md): Step 2 of Narev's FinOps for AI framework: track LLM token costs at the source and break aggregate spend down by app, feature, customer, and team.
- [Step 3: optimize LLM spend with benchmarks and phased roll-out](https://narev.ai/docs/guides/finops-for-ai/step-3.md): Step 3 of Narev's FinOps for AI framework: systematically benchmark variants, measure quality, and roll out the winners to cut LLM cost without losing quality.
- [How to choose an LLM model for your product](https://narev.ai/docs/guides/how-to-choose-llm-model.md): A practical guide to picking the best LLM for your product when you don't have labelled data, comparing models on cost, quality, and task fit.
- [How to choose an LLM (with labelled data)](https://narev.ai/docs/guides/how-to-choose-llm-model-with-labels.md): Practical guide to picking the best LLM when you have labelled data, using benchmarks and accuracy metrics to compare candidate models objectively.
- [Guides - Index](https://narev.ai/docs/guides/index.md): Step-by-step guides for cutting LLM costs, choosing models, benchmarking quality, and rolling out usage-based billing with the Narev platform.
- [Reduce LLM spend by switching models](https://narev.ai/docs/guides/reduce-cost-by-model-switch.md): Case study: how switching from GPT-4 to gpt-oss-20b cut LLM inference costs by 99% while keeping 100% accuracy on a real product workload.
- [Reduce LLM spend by prompt engineering](https://narev.ai/docs/guides/reduce-cost-by-prompt-engineering.md): Case study: how rewriting a verbose prompt into a shorter, simpler version cut LLM costs by 24% while preserving accuracy and improving consistency.
- [Why benchmark?](https://narev.ai/docs/guides/why-benchmark.md): Academic LLM benchmarks like MMLU don't predict real-world quality. Learn why benchmarking on your own product data is what actually drives model choice.
- [Introduction](https://narev.ai/docs/index.md): Narev is an AI FinOps platform that helps teams monetize their AI applications in minutes, not days. Measure, optimize, and bill for LLM usage.
- [Authentication](https://narev.ai/docs/platform/api-reference/authentication.md): Reference for the Narev Cloud REST APIs.
- [Overview](https://narev.ai/docs/platform/api-reference/overview.md): Reference for the Narev Cloud REST APIs.
- [Calculate cost for trace](https://narev.ai/docs/platform/api-reference/v1/calculate.md): Given a model ID, provider, and token usage, returns an itemized cost breakdown in USD.
- [Find cheapest provider](https://narev.ai/docs/platform/api-reference/v1/find/cheapest.md): Returns pricing for a model across all providers, sorted ascending by prompt price.
- [List pricing for provider](https://narev.ai/docs/platform/api-reference/v1/price/for-provider.md): Returns pricing for all models of the given provider.
- [Search pricing by model](https://narev.ai/docs/platform/api-reference/v1/price/search.md): Search pricing across all providers by model ID.
- [List models](https://narev.ai/docs/platform/api-reference/v1/reference/models.md): Returns all available models with their provider.
- [Get provider details](https://narev.ai/docs/platform/api-reference/v1/reference/provider-details.md): Returns full details for a single provider.
- [List providers](https://narev.ai/docs/platform/api-reference/v1/reference/providers.md): Returns all supported providers with their display name.
- [Analyze benchmark results](https://narev.ai/docs/platform/benchmark/analyzing-results.md): Read the Narev results dashboard to compare variant cost, quality scores, latency, and token usage across your benchmark runs and pick a winner.
- [Create a benchmark](https://narev.ai/docs/platform/benchmark/create.md): Walk through the four steps to create a benchmark in Narev Cloud, attach a data source, add variants, and start comparing LLM performance.
- [Clawhub](https://narev.ai/docs/platform/benchmark/data-source/clawhub.md): Connect Clawhub to Narev to pull production prompts and responses into your benchmarks so you can evaluate LLM variants on real user traffic.
- [File Upload](https://narev.ai/docs/platform/benchmark/data-source/file-upload.md): Bulk-load prompts and expected outputs into a Narev benchmark from CSV, JSON, or JSONL files to evaluate LLM variants against historical datasets.
- [Live Test](https://narev.ai/docs/platform/benchmark/data-source/live-test.md): Point your app at the Narev gateway endpoint to live-test multiple LLM models simultaneously and capture every request into your benchmark.
- [Manual Entry](https://narev.ai/docs/platform/benchmark/data-source/manual-entry.md): Use the Narev Cloud UI to add prompts and expected outputs by hand for fast, small-scale benchmarks when you do not have a dataset ready to upload.
- [Tracing Platform](https://narev.ai/docs/platform/benchmark/data-source/tracing-platform.md): Sync prompts from Langfuse, LangSmith, Helicone, and other tracing tools into Narev benchmarks without changing your app code or endpoint.
- [Import Helicone traces into a Narev benchmark](https://narev.ai/docs/platform/benchmark/integration/helicone.md): Pull production LLM traces from Helicone into a Narev benchmark dataset so you can evaluate new models against real user prompts and conversations.
- [Benchmark Helicone Gateway configurations with Narev](https://narev.ai/docs/platform/benchmark/integration/helicone-gateway.md): Route production traffic through the Helicone Gateway and Narev to A/B test new model configurations against real user requests before deploying.
- [Narev benchmark integrations: gateways and tracing tools](https://narev.ai/docs/platform/benchmark/integration/index.md): See how Narev complements LLM gateways like OpenRouter, Portkey, and LiteLLM, and tracing tools like Langfuse, LangSmith, Helicone, and W&B Weave.
- [Import Langfuse traces into a Narev benchmark](https://narev.ai/docs/platform/benchmark/integration/langfuse.md): Pull production LLM traces from Langfuse into a Narev benchmark dataset so you can evaluate new models against your real user prompts and conversations.
- [Import LangSmith traces into a Narev benchmark](https://narev.ai/docs/platform/benchmark/integration/langsmith.md): Pull production LangChain traces from LangSmith into a Narev benchmark so you can evaluate new LLM variants against your real production prompts.
- [Benchmark LiteLLM proxy configurations with Narev](https://narev.ai/docs/platform/benchmark/integration/litellm.md): Connect LiteLLM to Narev to A/B test proxy routing, fallbacks, and provider configurations against your real production traffic before deploying.
- [Benchmark OpenAI models in Narev with production data](https://narev.ai/docs/platform/benchmark/integration/openai.md): Connect OpenAI to Narev and benchmark GPT-4, GPT-4o, and other OpenAI models against your live production prompts before shipping changes.
- [Benchmark OpenRouter models in Narev with production data](https://narev.ai/docs/platform/benchmark/integration/openrouter.md): Connect OpenRouter to Narev and benchmark hundreds of LLMs available through OpenRouter against your live production prompts before deployment.
- [Benchmark Portkey gateway configuration with Narev](https://narev.ai/docs/platform/benchmark/integration/portkey.md): Connect Portkey to Narev to A/B test gateway routing rules, fallbacks, and model configurations against your real production traffic.
- [Weights & Biases Weave](https://narev.ai/docs/platform/benchmark/integration/wandb.md): Pull production LLM traces from Weights & Biases Weave into a Narev benchmark so you can evaluate new model variants against real production prompts.
- [Benchmark LLMs programmatically with API](https://narev.ai/docs/platform/benchmark/using-api.md): Use the Narev Cloud Applications API to run benchmark A/B tests programmatically, send requests through OpenAI-compatible endpoints, and pull results.
- [Add variant to a benchmark](https://narev.ai/docs/platform/benchmark/variant/adding-variants.md): Attach model variants to a Narev benchmark from the dashboard, choose the production variant, and configure traffic splits for an A/B test.
- [Create variant](https://narev.ai/docs/platform/benchmark/variant/creating-variants.md): Define a model variant in Narev with model choice, system prompt, temperature, and other inference parameters to compare in a benchmark A/B test.
- [Polar](https://narev.ai/docs/platform/billing/integrations/billing-platforms/polar.md): Send Narev Cloud LLM usage events to Polar and sync customers so you can bill end users for AI consumption with subscriptions and metered pricing.
- [Next.js](https://narev.ai/docs/platform/billing/integrations/frameworks/nextjs.md): Add Narev Cloud usage metering and AI billing to a Next.js app with a billing-platform-agnostic pattern that works with Stripe, Polar, and OpenMeter.
- [Integrations overview](https://narev.ai/docs/platform/billing/overview.md): Connect Narev Cloud to your billing stack with framework integrations like Next.js and billing platforms like Stripe, Polar, OpenMeter, and Kong.
- [Benchmarks](https://narev.ai/docs/platform/concepts/benchmarks.md): Learn how Narev benchmarks group your prompts into evaluation datasets so you can compare LLM variants on cost, quality, and latency side by side.
- [AI pricing models](https://narev.ai/docs/platform/concepts/pricing-models.md): Compare seat, subscription, credit, and usage-based pricing models for AI products and learn how to map inference costs to customer charges with Narev.
- [Quality evaluations](https://narev.ai/docs/platform/concepts/quality-evaluations.md): Configure automatic and human-in-the-loop quality evaluations in Narev to score LLM responses, detect regressions, and pick the best variant.
- [Usage-based billing](https://narev.ai/docs/platform/concepts/usage-based-billing.md): Learn how usage-based billing meters token and compute consumption per customer so AI products keep healthy margins as inference costs scale linearly.
- [Variants](https://narev.ai/docs/platform/concepts/variants.md): A Narev variant captures a model choice, system prompt, and inference parameters so you can A/B test LLM configurations against your own benchmarks.
- [Benchmarking quickstart](https://narev.ai/docs/platform/quickstart/benchmark.md): A hands-on quickstart that walks you through creating a benchmark, adding a model variant, running a test, and comparing LLM quality and cost in Narev.
- [Quickstart - Humans](https://narev.ai/docs/quickstart.md): Set up Narev Cloud, install the Narev SDK, meter Vercel AI SDK calls, tag customer usage, and send billable AI events to Polar.
- [Quickstart - Agents](https://narev.ai/docs/quickstart-agents.md): Connect Cursor, Claude, VS Code, or any MCP-compatible agent to the Narev Docs MCP server and let agents search current AI billing docs.
- [Open Source at Narev](https://narev.ai/docs/sdk/index.md): Discover Narev's open-source initiatives for AI billing and infrastructure cost mapping.

## OpenAPI Specs

- [openapi](https://narev.ai/docs/platform/api-reference/openapi.json)