Overview
A test suite consists of:

- Target agent — the agent being tested
- Tester — either the built-in system tester (simpler, requires no setup) or a custom agent you’ve created that role-plays as the caller
- Test cases — individual scenarios, each with a script and scoring rubric
- Test runs — executions of the full suite, producing pass/fail results for each case
Creating a Test Suite
From the Dashboard
- Navigate to Test Suites in the sidebar
- Click New Test Suite
- Enter a Name
- Select the Target Agent to test
- Choose a tester mode:
- Use system tester — simpler, no setup required. RevRing’s built-in tester follows each test script automatically.
- Use custom agent as tester — select one of your own agents to play the caller role. Useful when you need the tester to have specific voice, language, or behavioral settings.
- Set Max Concurrency to control how many test calls run in parallel (min 2, max 100, default 2). The minimum is 2 because each test requires two concurrent calls — one outbound from the tester and one inbound to the target.
- Click Create
Default Variables can be added after creation by editing the test suite. Open the suite and click Edit to set variables that apply to all test calls.
Via API
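The same fields can be set programmatically. The sketch below is illustrative only: the base URL, endpoint path, and payload key names (targetAgentId, testerMode, defaultVariables) are assumptions rather than confirmed RevRing API details, so check the API reference for the exact request shape.

```ts
// Hypothetical sketch: the base URL, endpoint path, and payload keys are assumptions,
// not confirmed RevRing API details. The field names mirror the dashboard options above.
const API_BASE = "https://api.revring.example"; // placeholder base URL

async function createTestSuite(apiKey: string) {
  const res = await fetch(`${API_BASE}/test-suites`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      name: "Refund Scenarios",
      targetAgentId: "agent_123",            // the agent being tested (placeholder ID)
      testerMode: "system",                  // or "custom" plus a tester agent ID
      maxConcurrency: 2,                     // min 2, max 100, default 2
      defaultVariables: { locale: "en-US" }, // optional; applies to all test calls
    }),
  });
  return res.json();
}
```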
Writing Test Cases
Each test case has three key fields:

| Field | Description |
|---|---|
| Name | A descriptive name for the test (e.g., “Handles refund request correctly”) |
| Script | Instructions for the tester agent — what to say, what scenario to simulate |
| Scoring Rubric | Criteria for evaluating the target agent’s performance — what constitutes a pass or fail |
Script
The script tells the tester agent how to behave during the call. Write it as natural language instructions.
Scoring Rubric

The scoring rubric defines what a successful interaction looks like. After the call, the transcript is evaluated against these criteria.
Creating Test Cases

Dashboard: Open your test suite, go to the Configure Tests tab, click Add Test, and fill in the name, script, and scoring rubric.

API: The attemptsPerRun field (1–10, default 1) controls how many times each test case is executed per run. Use multiple attempts to test for consistency.
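As a sketch of the equivalent API call, the example below creates a test case with the three documented fields plus attemptsPerRun. The endpoint path is an assumption, and the script and rubric strings are purely illustrative.

```ts
// Hypothetical sketch: the endpoint path is an assumption. The name, script,
// scoringRubric, and attemptsPerRun fields come from the documentation above;
// the script and rubric text is just an example.
async function addTestCase(apiKey: string, suiteId: string) {
  const res = await fetch(`https://api.revring.example/test-suites/${suiteId}/tests`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      name: "Handles refund request correctly",
      script:
        "You are a customer calling about a refund. 1) Say you received a damaged item. " +
        "2) If asked for details, say it arrived yesterday. 3) Ask how long the refund will take.",
      scoringRubric:
        "PASS if the agent apologizes, offers a refund or replacement, and gives a timeframe. " +
        "FAIL if the agent refuses the refund or ends the call without resolving the request.",
      attemptsPerRun: 1, // 1–10, default 1; raise it to check for consistency
    }),
  });
  return res.json();
}
```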
AI-Generated Test Cases
You can also generate test cases automatically using AI: provide a prompt describing the scenarios you want to test, and RevRing generates test cases based on your target agent’s configuration.

Dashboard: Click Generate Tests in your test suite and enter a prompt describing the scenarios you want. You can refine the results with follow-up messages — the generator keeps the conversation context so you can ask for adjustments or additional scenarios. Each generated test can be individually added to your suite or dismissed. Review each one before adding to make sure the script and rubric match your expectations.

API: Generated tests are returned with name, script, and scoringRubric fields that you can review, edit, and save.
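A rough sketch of the generation flow, assuming a generation endpoint and a tests array in the response (both are assumptions, not confirmed API details):

```ts
// Hypothetical sketch: the generation endpoint and response wrapper are assumptions.
// Generated tests carry name, script, and scoringRubric fields, as noted above.
async function generateTests(apiKey: string, suiteId: string, prompt: string) {
  const res = await fetch(
    `https://api.revring.example/test-suites/${suiteId}/generate-tests`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ prompt }),
    },
  );
  const { tests } = await res.json(); // e.g. [{ name, script, scoringRubric }, ...]
  return tests; // review and edit each generated test before saving it to the suite
}
```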
Running Tests
Starting a Test Run
Dashboard: Go to the Runs tab and click Run Tests. Enter a run name (or accept the auto-generated timestamp name) and click Start. You can monitor progress in real time.

API: Test runs can also be started programmatically.
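A minimal sketch of starting a run programmatically, assuming an endpoint path of the form shown below (an assumption, not a confirmed detail):

```ts
// Hypothetical sketch: the endpoint path is an assumption. The run name is optional,
// matching the dashboard behavior of auto-generating a timestamp name.
async function startTestRun(apiKey: string, suiteId: string, name?: string) {
  const res = await fetch(`https://api.revring.example/test-suites/${suiteId}/runs`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(name ? { name } : {}),
  });
  return res.json(); // the new run starts out in the "queued" status
}
```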
Test Run Lifecycle

| Status | Description |
|---|---|
| queued | Run is queued and waiting to start |
| running | Test calls are being placed and scored |
| completed | All test cases have been evaluated |
| failed | The run encountered an error |
| cancelled | The run was manually cancelled |
Monitoring Progress
Each test run tracks summary statistics (see the polling sketch after this list):

- Total Tests — total number of test attempts
- Passed — number of attempts that passed
- Failed — number of attempts that failed
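If you drive runs from the API, a simple polling loop can watch the status and summary counters. The endpoint path and response shape below are assumptions; the status values and counters mirror the lifecycle table and the list above.

```ts
// Hypothetical polling sketch: the endpoint path and response shape are assumptions.
// The status values and summary counters match the documented lifecycle and stats.
interface TestRunSummary {
  status: "queued" | "running" | "completed" | "failed" | "cancelled";
  totalTests: number; // total number of test attempts
  passed: number;     // attempts that passed
  failed: number;     // attempts that failed
}

async function waitForRun(apiKey: string, runId: string): Promise<TestRunSummary> {
  while (true) {
    const res = await fetch(`https://api.revring.example/test-runs/${runId}`, {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    const run = (await res.json()) as TestRunSummary;
    if (run.status !== "queued" && run.status !== "running") return run;
    await new Promise((resolve) => setTimeout(resolve, 5000)); // poll every 5 seconds
  }
}
```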
Test Attempt Results
Each test attempt produces:

| Field | Description |
|---|---|
| result | pass, fail, or error |
| reasoning | AI-generated explanation of why the test passed or failed |
| transcript | The full conversation transcript |
| callId | Link to the outbound call record |
| inboundCallId | Link to the inbound call record (on the target agent) |
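For reference, here is a TypeScript shape that mirrors the fields above. It is illustrative only; the actual payload may include additional properties.

```ts
// Sketch of a test attempt result, mirroring the field table above.
interface TestAttemptResult {
  result: "pass" | "fail" | "error"; // outcome of this attempt
  reasoning: string;                 // AI-generated explanation of the outcome
  transcript: string;                // full conversation transcript
  callId: string;                    // outbound call record (from the tester)
  inboundCallId: string;             // inbound call record (on the target agent)
}
```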
Exporting Results
After a run completes, you can export the results as JSON. Filter by pass, fail, or error before exporting to narrow down the data. Results can be copied to clipboard or downloaded as a file.

Cancelling a Run
Dashboard: Click Stop on a running test run.

API: Only queued or running test runs can be cancelled.
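A sketch of the cancel call, assuming a cancel endpoint of the form shown below (an assumption, not a confirmed detail):

```ts
// Hypothetical sketch: the cancel endpoint is an assumption. Only runs that are
// still queued or running can be cancelled.
async function cancelTestRun(apiKey: string, runId: string) {
  const res = await fetch(`https://api.revring.example/test-runs/${runId}/cancel`, {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  return res.json(); // the run moves to the "cancelled" status
}
```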
Test Variables
Variables can be set at multiple levels. More specific values override broader ones (see the sketch after this list):

- Default variables on the test suite — apply to all test calls
- Per-test-case variables — override suite defaults for a specific test
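The precedence rule amounts to a simple object merge. The sketch below is illustrative only, assuming variables are plain JSON objects; the variable names are made up.

```ts
// Illustrative only: per-test-case values override suite defaults for that test.
const suiteDefaults = { customerName: "Alex", locale: "en-US" };
const testCaseVariables = { customerName: "Jordan" }; // overrides the suite default

const effectiveVariables = { ...suiteDefaults, ...testCaseVariables };
// => { customerName: "Jordan", locale: "en-US" }
```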
Best Practices
Writing Effective Scripts
- Be specific about what the tester should say and do
- Include numbered steps for multi-turn conversations
- Specify how the tester should respond to common agent behaviors
- Keep scripts focused on one scenario per test case
Writing Effective Rubrics
- Use clear PASS/FAIL criteria
- Be specific about what constitutes success vs. failure
- Include both positive (must happen) and negative (must not happen) criteria
- Account for acceptable variations in agent responses
Test Organization
- Group related tests in the same suite (e.g., “Refund Scenarios”, “Appointment Booking”)
- Use descriptive names so test results are easy to understand at a glance
- Start with key happy-path scenarios, then add edge cases
- Run tests after every agent prompt change to catch regressions
Concurrency
The maxConcurrency setting controls how many test calls run simultaneously (minimum 2, maximum 100). Higher concurrency completes runs faster but counts against your organization’s call concurrency limit. Start with a low value (2–5) and increase as needed.
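If you manage suites via the API, the concurrency setting can be adjusted the same way. The update endpoint and PATCH semantics below are assumptions, not confirmed API details.

```ts
// Hypothetical sketch: the update endpoint and PATCH semantics are assumptions.
// maxConcurrency must stay within the documented bounds (min 2, max 100).
async function setMaxConcurrency(apiKey: string, suiteId: string, maxConcurrency: number) {
  const res = await fetch(`https://api.revring.example/test-suites/${suiteId}`, {
    method: "PATCH",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ maxConcurrency }),
  });
  return res.json();
}
```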
Troubleshooting
Test attempt shows 'error' result

An error result means the test call itself failed (e.g., connection issue, agent not reachable). This is different from a fail, which means the call completed but didn’t meet the rubric criteria. Check the call logs for the specific error message.
Tests are running slowly
Increase maxConcurrency on the test suite to run more calls in parallel. Note that concurrent test calls count against your organization’s overall concurrency limit.
Scoring seems incorrect
Review the scoring rubric for ambiguous criteria. The AI evaluator takes the rubric literally — vague rubrics produce inconsistent results. Use multiple attempts per run (attemptsPerRun) to identify inconsistency.
Tester agent not following the script
If using a custom tester agent, ensure its prompt instructs it to follow the test script closely. The script is injected into the tester agent’s context, but a conflicting system prompt may override it. If you don’t need special tester behavior, consider switching to the system tester instead.
'No available from number' error
This means the tester and target agent are sharing the same SIP trunk phone numbers. Each test call requires two participants on separate lines — the tester calls the target. Add another phone number to your SIP trunk so the tester can call from a different number than the target receives on.