Running Tests
Execute A/B tests and generate experiments to compare variant performance.
Once you've configured your A/B test with data sources, variants, and metrics, you're ready to run experiments and analyze results.
The Testing Flow
Running an A/B test follows a three-step workflow through the interface tabs:
1. Setup Tab
Configure all the necessary components (a configuration sketch follows this list):
- Add your data source (prompts to test)
- Select or create at least 2 variants to compare
- Enable quality metrics for evaluation
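To make the moving parts concrete, here is a minimal sketch of the same configuration expressed as plain Python data. The class names, fields, and model identifiers are illustrative placeholders rather than the platform's API; the point is simply that a runnable test needs a prompt set, at least two variants, and at least one enabled metric.

```python
# A minimal sketch of a test configuration, using plain Python data holders.
# Class names, fields, and model identifiers are illustrative, not the platform's API.
from dataclasses import dataclass, field

@dataclass
class Variant:
    name: str
    model: str
    system_prompt: str
    temperature: float = 0.7
    max_tokens: int = 512

@dataclass
class ABTestConfig:
    prompts: list[str]                                  # data source: prompts to test
    variants: list[Variant]                             # at least two variants to compare
    metrics: list[str] = field(default_factory=list)    # enabled quality metrics

config = ABTestConfig(
    prompts=["Summarize this support ticket...", "Draft a polite follow-up email..."],
    variants=[
        Variant("baseline", model="model-a", system_prompt="You are a helpful assistant."),
        Variant("candidate", model="model-b", system_prompt="You are a concise assistant.",
                temperature=0.3),
    ],
    metrics=["relevance", "conciseness"],
)
assert len(config.variants) >= 2, "An A/B test needs at least two variants"
```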
2. Run Your Test
Once everything is configured in Setup, click the Run button to start an experiment:
- The button appears in the top-right corner of the A/B test interface
- Click Run to generate a new experiment
- The platform will:
  - Send each prompt to all selected variants in parallel
  - Collect responses from each variant
  - Automatically evaluate outputs using your configured metrics
  - Calculate cost and latency for each variant
Each run creates a new experiment—a snapshot of results for all variants at that moment. This lets you track performance changes over time as you adjust configurations.
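The sketch below approximates what a single run does with the configuration above: fan each prompt out to every variant in parallel, time and collect the responses, score them with the enabled metrics, and bundle everything into one experiment record. `call_model` and `evaluate` are stand-in stubs for the platform's model calls and metric scoring, which this sketch does not reproduce.

```python
# A rough sketch of what one Run does with the config above. call_model and
# evaluate are placeholder stubs standing in for the platform's model calls
# and metric scoring.
import time
from concurrent.futures import ThreadPoolExecutor

def call_model(variant, prompt):
    # Placeholder: a real run sends the prompt to the variant's model and
    # returns the response text plus token usage.
    return {"text": f"[{variant.name}] response to: {prompt}", "tokens": 42}

def evaluate(metric, prompt, response_text):
    # Placeholder: each output is scored against one enabled metric.
    return 1.0

def run_one(variant, prompt, metrics):
    start = time.perf_counter()
    response = call_model(variant, prompt)
    return {
        "variant": variant.name,
        "prompt": prompt,
        "response": response["text"],
        "latency_s": time.perf_counter() - start,           # latency per call
        "cost_tokens": response["tokens"],                   # simple proxy for cost
        "scores": {m: evaluate(m, prompt, response["text"]) for m in metrics},
    }

def run_experiment(config):
    # Fan out: every prompt goes to every selected variant, in parallel.
    jobs = [(variant, prompt) for prompt in config.prompts for variant in config.variants]
    with ThreadPoolExecutor() as pool:
        rows = list(pool.map(lambda job: run_one(*job, config.metrics), jobs))
    # One experiment = a snapshot of every row produced by this run.
    return {"created_at": time.time(), "rows": rows}

experiment = run_experiment(config)
```

Treating each prompt-variant pair as an independent job is what makes the parallel fan-out, and the per-variant latency and cost accounting, straightforward.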
3. Results Tab
After running, you'll be automatically redirected to the Results tab where you can:
- View aggregated metrics across all variants
- Compare cost, latency, and quality side-by-side
- Drill down into individual prompts and responses
- Identify which variant performs best for your use case (sketched below)
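Conceptually, the comparison the Results tab presents looks like the aggregation below: group the experiment's rows by variant, average quality scores and latency, sum cost, and surface the strongest variant. Ranking on a single primary metric is an assumption made for this sketch; in the interface you compare all metrics side by side.

```python
# An illustrative aggregation: group rows by variant, average score and latency,
# sum token cost, and pick a winner on one primary metric. The single-metric
# ranking rule is an assumption for this sketch.
from collections import defaultdict
from statistics import mean

def summarize(experiment, primary_metric):
    by_variant = defaultdict(list)
    for row in experiment["rows"]:
        by_variant[row["variant"]].append(row)

    summary = {
        name: {
            "avg_score": mean(r["scores"][primary_metric] for r in rows),
            "avg_latency_s": mean(r["latency_s"] for r in rows),
            "total_cost_tokens": sum(r["cost_tokens"] for r in rows),
        }
        for name, rows in by_variant.items()
    }
    best = max(summary, key=lambda name: summary[name]["avg_score"])
    return summary, best

summary, best_variant = summarize(experiment, primary_metric="relevance")
print(best_variant, summary[best_variant])
```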
When to Re-run Tests
Important: After making changes to your A/B test configuration, you must run the test again to see their impact. Previous results remain unchanged until you generate a new experiment.
You need to click Run again whenever you:
Change Variants
- Add or remove variants from your test
- Modify variant settings (model, system prompt, parameters)
- Adjust temperature, max tokens, or other model parameters
The most common scenario is tweaking variant configurations to improve performance. Each change requires a new run to generate fresh results.
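Continuing the earlier sketches (same `config`, `run_experiment`, and `summarize`), the re-run rule can be expressed in a few lines: previous experiments stay untouched in the history, and a tweaked variant only shows up in the next experiment you generate.

```python
# Re-running after a change, reusing run_experiment and summarize from the
# sketches above. Earlier experiments are kept as-is; only a new run reflects
# the new settings.
history = [experiment]                          # snapshot from the first run

config.variants[1].temperature = 0.1            # tweak a variant setting
history.append(run_experiment(config))          # Run again to see the impact

for i, exp in enumerate(history):
    _, best = summarize(exp, primary_metric="relevance")
    print(f"experiment {i}: best variant = {best}")
```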