Working with Multiple Suites
As your AI agent grows in capability, a single YAML file of scenarios becomes difficult to maintain. AgenticAssure supports splitting your tests across multiple files and directories, with built-in directory scanning that makes it straightforward to organize large test suites.
Why Split Tests Across Files
- Maintainability: A 500-line YAML file is hard to navigate. Smaller, focused files are easier to read, review, and update.
- Ownership: Different team members can own different suite files without merge conflicts.
- Selective execution: You can run a single suite file without loading unrelated scenarios.
- Logical grouping: Separate files for separate concerns — orders, billing, onboarding, safety.
- Suite-level configuration: Each file can define its own timeout, retry, and scorer defaults.
Directory Structure Recommendations
By Feature Area
The most common pattern: one file per feature or domain area of your agent.
scenarios/
  orders.yaml
  returns.yaml
  billing.yaml
  onboarding.yaml
  faq.yaml
  safety.yaml
By Test Type
Organize by the kind of testing each file performs.
scenarios/
  happy-path.yaml
  edge-cases.yaml
  error-handling.yaml
  safety-guardrails.yaml
  performance.yaml
By Agent (Multi-Agent Systems)
If you have multiple agents, give each its own directory.
scenarios/
  support-agent/
    core.yaml
    edge-cases.yaml
    safety.yaml
  billing-agent/
    invoices.yaml
    payments.yaml
  search-agent/
    queries.yaml
    filters.yaml
By Priority
Separate critical tests from extended coverage.
scenarios/
  critical/
    smoke-tests.yaml
    core-workflows.yaml
  extended/
    edge-cases.yaml
    regression.yaml
    performance.yaml
Hybrid Approach
Combine multiple organizational dimensions.
scenarios/
  support/
    orders/
      happy-path.yaml
      edge-cases.yaml
    returns/
      happy-path.yaml
      edge-cases.yaml
    safety.yaml
  billing/
    invoices.yaml
    payments.yaml
How Directory Scanning Works
When you point AgenticAssure at a directory, the load_scenarios_from_dir() function recursively finds all files with .yml or .yaml extensions, loads each one as a separate suite, and returns the full list.
# Scan the entire scenarios/ directory and all subdirectories
agenticassure run scenarios/
The scanning process:
- Walks the directory tree recursively.
- Finds all files matching **/*.yml and **/*.yaml.
- Loads each file as an independent suite using load_scenarios().
- Returns the suites sorted by file path.
If any file fails to parse or validate, the entire load operation stops with an error pointing to the problematic file.
What Gets Loaded
- scenarios/orders.yaml: loaded
- scenarios/returns.yml: loaded
- scenarios/support/safety.yaml: loaded (recursive)
- scenarios/support/edge-cases/timeout.yml: loaded (deeply nested)
- scenarios/README.md: ignored (not YAML)
- scenarios/config.json: ignored (not YAML)
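The discovery step can be approximated with the standard library alone. The sketch below is not the implementation of load_scenarios_from_dir(); find_suite_files is a hypothetical helper that only reproduces the file selection and ordering described above (recursive walk, .yml/.yaml only, sorted by path) and does not parse or validate anything.

```python
from pathlib import Path

# Extensions treated as suite files, per the scanning rules above.
YAML_SUFFIXES = {".yml", ".yaml"}


def find_suite_files(root):
    """Return every .yml/.yaml file under root, sorted by path.

    Hypothetical helper for illustration only; the real loader also
    parses and validates each file.
    """
    root = Path(root)
    return sorted(
        path
        for path in root.rglob("*")  # recursive walk of the directory tree
        if path.is_file() and path.suffix in YAML_SUFFIXES
    )


if __name__ == "__main__":
    for path in find_suite_files("scenarios"):
        print(path)
```

Running it against the layout listed above would print the four YAML files and skip README.md and config.json.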
Suite Names from Files
If a YAML file includes a suite.name field, that name is used. If the suite block is omitted, the suite name defaults to the filename (without extension).
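The fallback rule can be sketched as a small function over an already-parsed YAML document. resolve_suite_name is a hypothetical name, not part of AgenticAssure. The text does not say what happens when a suite block exists but has no name; the sketch assumes the filename is used in that case as well.

```python
from pathlib import Path


def resolve_suite_name(file_path, document):
    """Pick the suite name for a parsed YAML document (a dict).

    Hypothetical helper: prefers suite.name, otherwise falls back to the
    filename without its extension. Treating a suite block that lacks a
    name like a missing suite block is an assumption.
    """
    suite_block = document.get("suite") or {}
    declared = suite_block.get("name")
    if declared:
        return declared
    return Path(file_path).stem


# A file with a suite block keeps its declared name.
print(resolve_suite_name("scenarios/orders.yaml", {"suite": {"name": "order-tests"}}))
# A file without a suite block is named after the file.
print(resolve_suite_name("scenarios/returns.yaml", {"scenarios": []}))
```

The two calls correspond to the two example files that follow.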
# File: scenarios/orders.yaml
# Suite name will be "order-tests" (from the suite block)
suite:
  name: order-tests
scenarios:
  - name: lookup_order
    input: "Where is my order?"

# File: scenarios/returns.yaml
# Suite name will be "returns" (from the filename)
scenarios:
  - name: request_return
    input: "I want to return this item"
Running a Specific Suite with --suite
When you have multiple suite files loaded, you can run only one by name using the --suite / -s flag.
# Load all files from scenarios/ but only run the suite named "order-tests"
agenticassure run scenarios/ --suite order-tests --adapter myproject.agent.MyAgent
You can also point directly at a single file instead of a directory:
# Load and run only this specific file
agenticassure run scenarios/orders.yaml --adapter myproject.agent.MyAgent
The difference:
- --suite loads all files in the directory, then filters by suite name. This is useful when your config file or adapter setup depends on the full directory context.
- Pointing at a single file loads only that file. This is faster and simpler when you want to test one file in isolation.
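To make the "load everything, then filter" behavior concrete, here is a minimal sketch. The Suite dataclass and select_suite function are hypothetical stand-ins, not AgenticAssure types, and the text does not describe what the CLI does when no suite matches, so the sketch simply returns an empty list in that case.

```python
from dataclasses import dataclass, field


@dataclass
class Suite:
    """Hypothetical stand-in for a loaded suite."""
    name: str
    source: str  # path of the file the suite came from
    scenarios: list = field(default_factory=list)


def select_suite(suites, wanted):
    """Keep only suites whose name equals `wanted`.

    Every suite in `suites` has already been loaded (and so already
    validated) by the time this filter runs.
    """
    return [suite for suite in suites if suite.name == wanted]


loaded = [
    Suite("order-tests", "scenarios/orders.yaml"),
    Suite("returns", "scenarios/returns.yaml"),
]
for suite in select_suite(loaded, "order-tests"):
    print(suite.name, suite.source)
```

Note that the filter matches the suite name (order-tests), not the filename (orders.yaml). Because filtering happens after loading, a parse error in an unrelated file would still stop the run, which does not happen when you point at a single file.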
Suite-Level Configuration
Each suite file can define its own configuration via the suite.config block. These settings override the runner defaults for scenarios in that suite.
suite:
  name: slow-integration-tests
  description: Tests that call external APIs and may take a while
  config:
    default_timeout: 120
    retries: 2
    default_scorers:
      - passfail
    fail_fast: false
scenarios:
  - name: external_api_call
    input: "Fetch the latest report"
    expected_tools:
      - fetch_report
Configuration Fields
| Field | Type | Default | Description |
|---|---|---|---|
| default_timeout | float | 30.0 | Timeout in seconds applied to scenarios that do not specify their own timeout_seconds. |
| retries | int | 0 | Number of retry attempts for failed scenarios. The scenario runs up to retries + 1 times total. |
| default_scorers | list[str] | ["passfail"] | Scorers applied to scenarios that do not specify their own scorers list. |
| fail_fast | bool | false | If true, stop executing scenarios in this suite after the first failure. |
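The table can be read as a lookup order: a scenario's own value if it has one, then the suite's config, then the default. The sketch below illustrates that reading with plain dictionaries. effective_settings and RUNNER_DEFAULTS are hypothetical names, and the defaults are copied from the table rather than from the library's source.

```python
# Defaults as listed in the table above (copied by hand, not imported).
RUNNER_DEFAULTS = {
    "default_timeout": 30.0,
    "retries": 0,
    "default_scorers": ["passfail"],
    "fail_fast": False,
}


def effective_settings(scenario, suite_config=None):
    """Resolve the settings one scenario would run with.

    Hypothetical helper: suite config overrides the defaults, and a
    scenario's own timeout_seconds / scorers override the suite config.
    """
    config = {**RUNNER_DEFAULTS, **(suite_config or {})}
    return {
        "timeout_seconds": scenario.get("timeout_seconds", config["default_timeout"]),
        "scorers": scenario.get("scorers", config["default_scorers"]),
        "max_attempts": config["retries"] + 1,  # first run plus retries
        "fail_fast": config["fail_fast"],
    }


suite_config = {"default_timeout": 120, "retries": 2}
print(effective_settings({"name": "external_api_call"}, suite_config))
print(effective_settings({"name": "fast_test", "timeout_seconds": 10}, suite_config))
```

With retries set to 2, max_attempts comes out as 3, matching the "retries + 1 times total" wording in the table.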
Precedence
Scenario-level settings override suite-level settings:
suite:
  name: mixed-timeouts
  config:
    default_timeout: 60  # Suite default
scenarios:
  - name: fast_test
    input: "Quick question"
    timeout_seconds: 10  # Overrides to 10 seconds
  - name: normal_test
    input: "Regular question"
    # Uses suite default of 60 seconds
Organizing by Feature, Severity, or Agent Type
By Feature
Create one suite file per feature area. This keeps scenarios closely related and makes it easy to run tests for a specific feature during development.
# scenarios/orders.yaml
suite:
  name: orders
  description: Order management scenarios
scenarios:
  - name: create_order
    input: "Place an order for Widget A"
    expected_tools: [create_order]
    tags: [orders, create]
  - name: cancel_order
    input: "Cancel order #12345"
    expected_tools: [cancel_order]
    tags: [orders, delete]
  - name: order_history
    input: "Show my recent orders"
    expected_tools: [list_orders]
    tags: [orders, read]

# Run only order tests during development
agenticassure run scenarios/orders.yaml --adapter myproject.agent.MyAgent
By Severity
Separate critical tests from nice-to-have coverage. Run critical tests on every PR; run the full suite nightly.
# scenarios/critical/smoke-tests.yaml
suite:
  name: smoke-tests
  description: Must-pass scenarios that gate every deployment
  config:
    retries: 1
    fail_fast: true
scenarios:
  - name: agent_responds
    input: "Hello"
    scorers: [passfail]
    tags: [critical, smoke]
  - name: core_tool_works
    input: "Look up order #TEST-001"
    expected_tools: [lookup_order]
    tags: [critical, smoke]

# CI: run only critical smoke tests
agenticassure run scenarios/critical/ --adapter myproject.agent.MyAgent
# Nightly: run everything
agenticassure run scenarios/ --adapter myproject.agent.MyAgent
By Agent Type
In multi-agent systems, keep each agent’s tests isolated.
# Run tests for just the support agent
agenticassure run scenarios/support-agent/ --adapter myproject.support.SupportAgent
# Run tests for just the billing agent
agenticassure run scenarios/billing-agent/ --adapter myproject.billing.BillingAgent
This also makes it clear which adapter corresponds to which test directory, reducing confusion when different agents have different capabilities and tool sets.
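If you script these runs, one option is to keep the directory-to-adapter pairing in a single mapping so the two cannot drift apart. The sketch below only builds and prints the command lines shown above; AGENT_SUITES and build_commands are hypothetical names, and nothing here calls AgenticAssure itself.

```python
import shlex

# Test directory -> adapter import path, taken from the commands above.
AGENT_SUITES = {
    "scenarios/support-agent/": "myproject.support.SupportAgent",
    "scenarios/billing-agent/": "myproject.billing.BillingAgent",
}


def build_commands(agent_suites):
    """Return one `agenticassure run` argument list per agent directory."""
    return [
        ["agenticassure", "run", directory, "--adapter", adapter]
        for directory, adapter in agent_suites.items()
    ]


for command in build_commands(AGENT_SUITES):
    # Printed for inspection; each list could instead be passed to
    # subprocess.run(command, check=True) in a CI script.
    print(shlex.join(command))
```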
Tips for Working with Multiple Suites
- Keep suite files focused: Each file should cover one cohesive area. If a file grows beyond 20-30 scenarios, consider splitting it further.
- Use consistent naming: Follow a predictable pattern for file names and suite names so team members can find tests quickly.
- Leverage tags across suites: Even with file-based organization, tags add a cross-cutting dimension. Tag scenarios with critical across all suite files, then run --tag critical to get a cross-cutting smoke test.
- Validate before running: Use agenticassure validate scenarios/ to catch syntax errors across all files before spending time and API credits on execution.
- Document your structure: If your test directory is complex, add a brief comment at the top of each suite file explaining what it covers.
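The cross-cutting effect of tags can be illustrated with plain data. In the sketch below, select_by_tag is a hypothetical helper and the suite contents are illustrative (the critical tag on create_order is added for the example); it shows the idea behind --tag critical, not how the CLI implements it.

```python
def select_by_tag(suites, tag):
    """Return (suite name, scenario name) pairs for scenarios carrying `tag`.

    `suites` maps a suite name to a list of scenario dicts. Scenarios
    without a tags list are treated as untagged.
    """
    return [
        (suite_name, scenario["name"])
        for suite_name, scenarios in suites.items()
        for scenario in scenarios
        if tag in scenario.get("tags", [])
    ]


# Illustrative data only; tags here are chosen for the example.
suites = {
    "orders": [
        {"name": "create_order", "tags": ["orders", "create", "critical"]},
        {"name": "order_history", "tags": ["orders", "read"]},
    ],
    "smoke-tests": [
        {"name": "agent_responds", "tags": ["critical", "smoke"]},
    ],
}

for suite_name, scenario_name in select_by_tag(suites, "critical"):
    print(f"{suite_name}: {scenario_name}")
```

The selection pulls scenarios from both suites regardless of which file they live in, which is what makes tags complementary to directory-based organization.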