

Endpoint

POST /api/applications/{application_id}/v1/chat/completions

Authentication

Include your Narev API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY
You can generate API keys in the Narev Cloud dashboard under Settings → API Keys.

Setup

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://narev.ai/api/applications/{application_id}/v1"
)

Request parameters

Required

messages
array
required
Array of message objects, each with a role (system, user, or assistant) and content string.
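For instance, a short history with a system prompt looks like this (a plain illustration of the shape, not an API call):

```python
# A minimal messages array: an optional system message followed by
# user/assistant turns, each with a role and a content string.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]

# Every entry carries exactly these two keys.
assert all(set(m) == {"role", "content"} for m in messages)
```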

Optional

Parameter support varies by model. Check your model’s documentation to confirm which parameters it accepts.
model
string
Model identifier with gateway prefix (for example, openai:gpt-4). If omitted, Narev uses the A/B test’s production variant.
temperature
number
Sampling temperature between 0 and 2. Higher values produce more random output.
top_p
number
Nucleus sampling parameter between 0 and 1. Lower values make output more focused.
top_k
integer
Limits sampling to the K most likely next tokens.
max_tokens
integer
Maximum number of tokens to generate in the response.
frequency_penalty
number
Penalizes tokens based on their frequency in the text so far. Range: -2.0 to 2.0.
presence_penalty
number
Penalizes tokens that have already appeared in the text so far. Range: -2.0 to 2.0.
repetition_penalty
number
Penalizes repeated tokens. Typical range: 0.0 to 2.0.
min_p
number
Minimum probability threshold for token selection. Range: 0 to 1.
seed
integer
Random seed for deterministic generation.
logprobs
boolean
When true, returns log probabilities for each output token.
top_logprobs
integer
Number of top log probabilities to return. Range: 0 to 20. Requires logprobs: true.
response_format
object
Controls the format of the response. Pass {"type": "json_object"} to enable JSON mode.
stop
string | array
Up to four sequences at which the API stops generating further tokens.
stream
boolean
default: false
When true, Narev streams the response as server-sent events (SSE).
metadata
object
Custom metadata for tracking and automatic quality evaluation.
Field             Type     Description
expected_output   string   Expected response text for automatic quality scoring

Model identifiers

Models use a {gateway}:{model_name} format:
Gateway      Example
openai       openai:gpt-4
anthropic    anthropic:claude-3-opus-20240229
openrouter   openrouter:meta-llama/llama-3.1-70b-instruct
vertex       vertex:gemini-pro
bedrock      bedrock:amazon.titan-text-express-v1
portkey      portkey:gpt-4
helicone     helicone:gpt-4
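Because identifiers always follow the {gateway}:{model_name} pattern, the gateway prefix can be split off with an ordinary string partition (a plain-Python sketch, not part of the Narev SDK):

```python
def parse_model_id(model_id: str) -> tuple[str, str]:
    """Split a Narev model identifier into (gateway, model_name)."""
    gateway, sep, model_name = model_id.partition(":")
    if not sep:
        raise ValueError(f"missing gateway prefix in {model_id!r}")
    return gateway, model_name

# Only the first colon matters; model names may contain slashes.
print(parse_model_id("openrouter:meta-llama/llama-3.1-70b-instruct"))
```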

Request examples

Basic request

response = client.chat.completions.create(
    model="openai:gpt-4",
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

With system prompt

response = client.chat.completions.create(
    model="openai:gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful geography expert."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

With generation parameters

response = client.chat.completions.create(
    model="openai:gpt-4",
    messages=[
        {"role": "user", "content": "Write a creative story."}
    ],
    temperature=0.9,
    max_tokens=500,
    top_p=0.95
)

With JSON response format

response = client.chat.completions.create(
    model="openai:gpt-4",
    messages=[
        {"role": "user", "content": "Return user data as JSON."}
    ],
    response_format={"type": "json_object"}
)
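JSON mode constrains the model to emit valid JSON, but the content still arrives as a string and must be parsed client-side. A sketch (the content value below is illustrative, standing in for response.choices[0].message.content):

```python
import json

# Illustrative content string as it might arrive in JSON mode.
content = '{"name": "Ada Lovelace", "email": "ada@example.com"}'

user_data = json.loads(content)
print(user_data["name"])
```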

Streaming

stream = client.chat.completions.create(
    model="openai:gpt-4",
    messages=[
        {"role": "user", "content": "Tell me a story."}
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

With quality evaluation

response = client.chat.completions.create(
    model="openai:gpt-4",
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ],
    extra_body={
        "metadata": {
            "expected_output": "Paris is the capital of France."
        }
    }
)

Response format

Non-streaming

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "openai:gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Paris is the capital of France."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 7,
    "total_tokens": 20
  }
}
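With a raw payload like the one above, the fields you typically need can be pulled out directly (a plain-Python sketch over the sample response; the SDK exposes the same fields as attributes):

```python
import json

# The sample non-streaming payload from above.
payload = json.loads("""
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "openai:gpt-4",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "Paris is the capital of France."},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 13, "completion_tokens": 7, "total_tokens": 20}
}
""")

answer = payload["choices"][0]["message"]["content"]
finish = payload["choices"][0]["finish_reason"]
total = payload["usage"]["total_tokens"]
print(answer, finish, total)
```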

Streaming

Narev sends each token as a server-sent event (SSE) with a data: prefix:
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"openai:gpt-4","choices":[{"index":0,"delta":{"content":"Paris"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"openai:gpt-4","choices":[{"index":0,"delta":{"content":" is"},"finish_reason":null}]}

data: [DONE]
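If you consume the stream without the SDK, each SSE line can be handled in a small loop: strip the data: prefix, stop at the [DONE] sentinel, and accumulate the delta content. A minimal sketch over the example events above:

```python
import json

# Raw SSE lines mirroring the example events above.
raw_lines = [
    'data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"openai:gpt-4","choices":[{"index":0,"delta":{"content":"Paris"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"openai:gpt-4","choices":[{"index":0,"delta":{"content":" is"},"finish_reason":null}]}',
    "data: [DONE]",
]

text = ""
for line in raw_lines:
    if not line.startswith("data: "):
        continue  # skip blank lines and keep-alives
    data = line[len("data: "):]
    if data == "[DONE]":
        break  # end-of-stream sentinel, not JSON
    chunk = json.loads(data)
    delta = chunk["choices"][0]["delta"]
    text += delta.get("content") or ""

print(text)  # "Paris is"
```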

Error responses

All errors return a JSON object with an error field:
{
  "error": {
    "message": "Error description",
    "code": "error_code"
  }
}
Status  Code                   Description
400     bad_request            Invalid request format or parameters
400     model_required         Model is required when no production variant is set
401     invalid_api_key        Invalid or missing API key
402     insufficient_credits   Insufficient credits to complete the request
404     application_not_found  A/B test ID not found
500     internal_error         Internal server error
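Since every error body shares this one shape, a single helper can surface both the code and the message. A local sketch over an illustrative 401 body (the exception class is hypothetical, not part of the Narev SDK):

```python
import json


class NarevAPIError(Exception):
    """Illustrative wrapper for a Narev error payload."""

    def __init__(self, status: int, code: str, message: str):
        super().__init__(f"{status} {code}: {message}")
        self.status, self.code, self.message = status, code, message


def raise_for_error(status: int, body: str) -> None:
    """Raise NarevAPIError for any non-2xx response with an error payload."""
    if 200 <= status < 300:
        return
    err = json.loads(body).get("error", {})
    raise NarevAPIError(status, err.get("code", "unknown"), err.get("message", ""))


# Illustrative body matching the 401 row above.
body = '{"error": {"message": "Invalid or missing API key", "code": "invalid_api_key"}}'
try:
    raise_for_error(401, body)
except NarevAPIError as e:
    print(e.code)  # invalid_api_key
```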