Applications API Reference
OpenAI-compatible endpoint for A/B tests.
The Applications API provides an OpenAI-compatible endpoint for sending chat completions to your A/B tests.
New to the Applications API? Check out the A/B Testing with the API guide to learn about automatic variant creation, gateway prefixes, and quality evaluation.
Endpoint
```
POST /api/applications/{application_id}/v1/chat/completions
```
Authentication
Include your Narev API key in the Authorization header:
```
Authorization: Bearer YOUR_API_KEY
```
Setup Example
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://narev.ai/api/applications/{application_id}/v1"
)
```
Request Parameters
Required Parameters
| Parameter | Type | Description |
|---|---|---|
| messages | array | Array of message objects with role and content |
Optional Parameters
Parameter support varies by model. Check your model's documentation to see which parameters are supported.
| Parameter | Type | Description |
|---|---|---|
| model | string | Model identifier with gateway prefix (e.g., openai:gpt-4). If omitted, uses the A/B test's production variant |
| temperature | number | Sampling temperature between 0 and 2. Higher values make output more random |
| top_p | number | Nucleus sampling parameter between 0 and 1. Lower values make output more focused |
| top_k | integer | Limits sampling to the K most likely next tokens |
| max_tokens | integer | Maximum number of tokens in the response |
| frequency_penalty | number | Penalizes tokens based on their frequency in the text so far (-2.0 to 2.0) |
| presence_penalty | number | Penalizes tokens that have appeared in the text so far (-2.0 to 2.0) |
| repetition_penalty | number | Penalizes repeated tokens (typically 0.0 to 2.0) |
| min_p | number | Minimum probability threshold for token selection (0-1) |
| top_a | number | Alternative sampling method (0-1) |
| seed | integer | Random seed for deterministic generation |
| logprobs | boolean | Return log probabilities of tokens |
| top_logprobs | integer | Number of top log probabilities to return (0-20) |
| response_format | object | Format of the response (e.g., {"type": "json_object"}) |
| stop | string or array | Up to 4 sequences where the API will stop generating |
| stream | boolean | Whether to stream the response. Default: false |
| metadata | object | Custom metadata for tracking and analysis |
Metadata Fields
| Field | Type | Description |
|---|---|---|
| expected_output | string | Expected response for automatic quality evaluation |
Response Format
Non-Streaming Response
```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "openai:gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Paris is the capital of France."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 7,
    "total_tokens": 20
  }
}
```
Streaming Response
Server-sent events (SSE) format with data: prefix:
```
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"openai:gpt-4","choices":[{"index":0,"delta":{"content":"Paris"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"openai:gpt-4","choices":[{"index":0,"delta":{"content":" is"},"finish_reason":null}]}

data: [DONE]
```
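If you consume the stream without the OpenAI SDK, each SSE line must be parsed by hand: strip the data: prefix, stop on the [DONE] sentinel, and decode the JSON chunk. A minimal sketch of that parsing, applied to sample chunks in the format shown above (the `parse_sse_line` helper is illustrative, not part of the API):

```python
import json

def parse_sse_line(line):
    """Parse one SSE line; return the chunk dict, "[DONE]", or None."""
    if not line.startswith("data: "):
        return None  # blank separators or comments
    data = line[len("data: "):]
    if data == "[DONE]":
        return "[DONE]"  # end-of-stream sentinel
    return json.loads(data)

# Sample lines matching the stream format shown above.
lines = [
    'data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"openai:gpt-4","choices":[{"index":0,"delta":{"content":"Paris"},"finish_reason":null}]}',
    'data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"openai:gpt-4","choices":[{"index":0,"delta":{"content":" is"},"finish_reason":null}]}',
    "data: [DONE]",
]

text = ""
for line in lines:
    parsed = parse_sse_line(line)
    if parsed is None or parsed == "[DONE]":
        continue
    text += parsed["choices"][0]["delta"].get("content", "")

print(text)  # → Paris is
```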
Model Identifiers
Models use the gateway prefix format:
```
{gateway}:{model_name}
```
Examples:
- openai:gpt-4
- anthropic:claude-3-opus-20240229
- openrouter:meta-llama/llama-3.1-70b-instruct
- vertex:gemini-pro
Available Gateways: openai, anthropic, openrouter, vertex, bedrock, portkey, helicone.
See the A/B Testing with the API guide for detailed information on gateway prefixes and how they affect variant creation.
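As noted in the parameters table, omitting model routes the request to the A/B test's production variant (and returns a 400 model_required error if none is set). A sketch of building such a request body by hand, since the OpenAI SDK requires a model argument (the `build_payload` helper and the URL are illustrative, not part of the API):

```python
def build_payload(messages, model=None):
    """Build a chat-completions payload. Omitting "model" routes the
    request to the A/B test's production variant; the API returns a
    400 model_required error if no production variant is set."""
    payload = {"messages": messages}
    if model is not None:
        payload["model"] = model
    return payload

payload = build_payload(
    [{"role": "user", "content": "What is the capital of France?"}]
)

# Send with any HTTP client; placeholders below are illustrative:
# requests.post(
#     "https://narev.ai/api/applications/{application_id}/v1/chat/completions",
#     headers={"Authorization": "Bearer YOUR_API_KEY"},
#     json=payload,
# )
```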
Request Examples
Basic Request
```python
response = client.chat.completions.create(
    model="openai:gpt-4",
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
```
With System Prompt
```python
response = client.chat.completions.create(
    model="openai:gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful geography expert."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
```
With Parameters
```python
response = client.chat.completions.create(
    model="openai:gpt-4",
    messages=[
        {"role": "user", "content": "Write a creative story."}
    ],
    temperature=0.9,
    max_tokens=500,
    top_p=0.95
)
```
With Advanced Parameters
```python
response = client.chat.completions.create(
    model="openai:gpt-4",
    messages=[
        {"role": "user", "content": "Generate a product description."}
    ],
    temperature=0.7,
    max_tokens=200,
    frequency_penalty=0.5,
    presence_penalty=0.3,
    stop=["\n\n", "END"],
    seed=42  # For reproducible results
)
```
With JSON Response Format
```python
response = client.chat.completions.create(
    model="openai:gpt-4",
    messages=[
        {"role": "user", "content": "Return user data in JSON format."}
    ],
    response_format={"type": "json_object"}
)
```
Streaming
```python
stream = client.chat.completions.create(
    model="openai:gpt-4",
    messages=[
        {"role": "user", "content": "Tell me a story."}
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
With Quality Evaluation
```python
response = client.chat.completions.create(
    model="openai:gpt-4",
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ],
    extra_body={
        "metadata": {
            "expected_output": "Paris is the capital of France."
        }
    }
)
```
Custom Metrics API
Submit custom quality metric values for responses generated through the Applications API.
Endpoint
```
POST /api/applications/{application_id}/quality
```
Authentication
Use the same API key authentication as the chat completions endpoint:
```
Authorization: Bearer YOUR_API_KEY
```
Request Body
```json
{
  "response_id": "gen-...",
  "metric_name": "custom_eslint_errors",
  "value": 4
}
```
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| response_id | string | Yes | The ID of the response to score (from generation logs) |
| metric_name | string | Yes | Your custom metric name (must start with custom_) |
| value | number | Yes | Numeric score (higher is typically better) |
Example Request
```python
import requests

response = requests.post(
    f"https://narev.ai/api/applications/{application_id}/quality",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    },
    json={
        "response_id": "gen-abc123",
        "metric_name": "custom_eslint_errors",
        "value": 4
    }
)
```

Custom metrics must be created in the A/B test configuration before you can submit values. See Configuring Metrics for details on creating custom metrics.
Error Responses
All errors return a JSON object with an error field:
```json
{
  "error": {
    "message": "Error description",
    "code": "error_code"
  }
}
```
HTTP Status Codes
| Status | Code | Description |
|---|---|---|
| 400 | bad_request | Invalid request format or parameters |
| 400 | model_required | Model is required when no production variant is set |
| 401 | invalid_api_key | Invalid or missing API key |
| 402 | insufficient_credits | Insufficient credits to complete request |
| 404 | application_not_found | A/B test ID not found |
| 500 | internal_error | Internal server error |
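A client can combine the error schema and the status table above to decide how to react, e.g. treating only 500 internal_error as retryable. A minimal sketch (the `extract_error` helper and its retry policy are illustrative choices, not part of the API):

```python
import json

def extract_error(status, body):
    """Map a non-2xx response (per the error schema above) to a summary.
    `body` is the raw JSON text of the response; only 500 internal_error
    is treated as transient/retryable here."""
    err = json.loads(body).get("error", {})
    return {
        "status": status,
        "code": err.get("code", "unknown"),
        "message": err.get("message", ""),
        "retryable": status == 500,
    }

# Example using the 402 insufficient_credits error from the table above.
info = extract_error(
    402,
    '{"error": {"message": "Insufficient credits to complete request", '
    '"code": "insufficient_credits"}}',
)
print(info["code"])       # → insufficient_credits
print(info["retryable"])  # → False
```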
Additional Resources
- A/B Testing with the API - Learn about variants, gateway prefixes, and quality evaluation
- Router API - Alternative API for simpler routing without A/B testing