agenticassure run
The run command is the primary entry point for executing test scenarios against your AI agent. It loads YAML scenario files, resolves an adapter, runs each scenario through the adapter, scores the results, and outputs a report.
agenticassure run [OPTIONS] [PATH]
Arguments
| Argument | Default | Description |
|---|---|---|
| PATH | . (current directory) | A file or directory containing YAML scenario files. Must exist. |
Options
| Option | Short | Type | Default | Description |
|---|---|---|---|---|
| --adapter | -a | String | None | Python dotted path to an AgentAdapter class |
| --suite | -s | String | None | Filter to a specific suite by name |
| --tag | -t | String (repeatable) | None | Filter scenarios by tag. Can be specified multiple times. |
| --output | -o | Choice: cli, json, html | cli | Output/report format |
| --timeout | | Float | 30.0 | Default timeout in seconds per scenario |
| --retry | | Integer | 0 | Number of retries per scenario on failure |
| --dry-run | | Flag | false | Validate and list scenarios without executing them |
| --help | | Flag | | Show help and exit |
Path Resolution
The PATH argument determines how scenarios are loaded:
- Single file: If PATH points to a .yml or .yaml file, that single file is loaded as one suite.
- Directory: If PATH points to a directory, AgenticAssure recursively scans for all .yml and .yaml files and loads each as a suite.
- Default: If no path is given, the current working directory (.) is scanned.
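The loading rules above can be approximated in a few lines of Python. This is a minimal sketch for illustration, not AgenticAssure's own loader: the function name `collect_scenario_files` is hypothetical, and returning an empty list for a single file with another extension is an assumption, since that case is not described here.

```python
from pathlib import Path

YAML_SUFFIXES = {".yml", ".yaml"}


def collect_scenario_files(path: str = ".") -> list[Path]:
    """Return the YAML files that would be considered for the given PATH."""
    target = Path(path)
    if not target.exists():
        # The PATH argument must exist.
        raise FileNotFoundError(f"Path does not exist: {target}")
    if target.is_file():
        # A single .yml/.yaml file is loaded as one suite.
        # Assumption: any other file type yields nothing.
        return [target] if target.suffix in YAML_SUFFIXES else []
    # A directory is scanned recursively for both extensions.
    return sorted(
        p for p in target.rglob("*")
        if p.is_file() and p.suffix in YAML_SUFFIXES
    )
```

Calling `collect_scenario_files()` with no argument scans the current working directory, mirroring the default PATH.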
# Load a single file
agenticassure run scenarios/search_tests.yaml --adapter my_agent.MyAgent
# Load all YAML files in a directory (recursive)
agenticassure run scenarios/ --adapter my_agent.MyAgent
# Use current directory
agenticassure run --adapter my_agent.MyAgent
Adapter Resolution
AgenticAssure needs an adapter to execute scenarios. The adapter is a Python class that implements the AgentAdapter protocol and acts as the bridge between AgenticAssure and your agent. The adapter is resolved in the following order:
- --adapter flag: If provided on the command line, this takes priority. The value is a Python dotted path like mymodule.MyAgent.
- Config file: If no flag is provided, AgenticAssure looks for a config file in the current working directory:
  - agenticassure.yaml: checked first, looks for an adapter key.
  - agenticassure.toml: checked second, looks for an adapter key.
- No adapter found: If neither source provides an adapter, AgenticAssure displays the loaded scenarios in a dry-run style table and prints instructions on how to provide an adapter. It does not exit with an error in this case.
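The precedence described above could be expressed roughly as follows. This is a sketch under stated assumptions, not the tool's actual code: `pick_adapter_path` is a hypothetical helper, the two config arguments are dictionaries that have already been parsed from agenticassure.yaml and agenticassure.toml, and falling through to the TOML file when the YAML file has no adapter key is an assumption.

```python
from typing import Optional


def pick_adapter_path(
    flag_value: Optional[str],
    yaml_config: Optional[dict],
    toml_config: Optional[dict],
) -> Optional[str]:
    """Return the adapter dotted path using flag > YAML > TOML precedence."""
    if flag_value:
        # The --adapter flag always wins.
        return flag_value
    # Config files are consulted in order: YAML first, then TOML.
    for config in (yaml_config, toml_config):
        if config and config.get("adapter"):
            return config["adapter"]
    # No adapter found: the caller falls back to listing scenarios.
    return None
```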
Config file example (agenticassure.yaml):
adapter: my_agent.MyAgent
Config file example (agenticassure.toml):
adapter = "my_agent.MyAgent"
The adapter class is dynamically imported and instantiated. It must:
- Be importable from the current Python environment (installed or on PYTHONPATH).
- Have a no-argument constructor.
- Implement the AgentAdapter protocol (i.e., have a run(input, context=None) -> AgentResult method).
If any of these conditions are not met, AgenticAssure raises a descriptive error.
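To make the three requirements concrete, here is one way a dotted path such as my_agent.MyAgent could be imported and checked using only the standard library. It is a sketch of the general technique, not AgenticAssure's implementation: `load_adapter` is a hypothetical name, and the check only confirms that a callable run attribute exists rather than validating the full protocol.

```python
import importlib


def load_adapter(dotted_path: str):
    """Import 'module.ClassName' and return an instance built with no arguments."""
    module_name, _, class_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError(f"Expected 'module.ClassName', got {dotted_path!r}")
    # Raises ImportError if the module is not installed or on PYTHONPATH.
    module = importlib.import_module(module_name)
    # Raises AttributeError if the class is missing from the module.
    adapter_cls = getattr(module, class_name)
    # Raises TypeError if the constructor requires arguments.
    adapter = adapter_cls()
    if not callable(getattr(adapter, "run", None)):
        raise TypeError(f"{dotted_path} does not define a callable run() method")
    return adapter
```

Each failure mode surfaces as a distinct exception, which is one way a tool can turn an unmet condition into a descriptive error message.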
Output Formats
The --output flag controls how results are presented. Each format is described in detail in the Reports section.
CLI (default)
agenticassure run scenarios/ --adapter my_agent.MyAgent
# or explicitly:
agenticassure run scenarios/ --adapter my_agent.MyAgent --output cli
Results are printed as a Rich-formatted table directly in the terminal with color-coded pass/fail status, scores, durations, and details.
JSON
agenticassure run scenarios/ --adapter my_agent.MyAgent --output json
Writes a structured JSON file named results_{run_id}.json to the current directory. The run ID is a UUID generated for each run.
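Because the file name contains a fresh UUID on every run, a script that consumes the JSON output has to discover the file rather than hard-code its name. The sketch below picks the most recently modified results_*.json file in a directory. The helper name `load_latest_results` is hypothetical, and the code deliberately makes no assumption about the keys inside the file; see the Reports section for the actual structure.

```python
import json
from pathlib import Path


def load_latest_results(directory: str = "."):
    """Load the most recently modified results_*.json file, or return None."""
    candidates = sorted(
        Path(directory).glob("results_*.json"),
        key=lambda p: p.stat().st_mtime,  # oldest first, newest last
    )
    if not candidates:
        return None
    with candidates[-1].open(encoding="utf-8") as handle:
        return json.load(handle)
```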
HTML
agenticassure run scenarios/ --adapter my_agent.MyAgent --output html
Writes a standalone HTML file named report_{run_id}.html to the current directory. The file includes embedded CSS and requires no external dependencies to open.
Tag Filtering
Use --tag to run only scenarios that have a matching tag. Tags are defined per-scenario in the YAML file. The flag can be specified multiple times, and a scenario is included if it has any of the specified tags (OR logic).
# Run only scenarios tagged "smoke"
agenticassure run scenarios/ --adapter my_agent.MyAgent --tag smoke
# Run scenarios tagged "tools" or "regression"
agenticassure run scenarios/ --adapter my_agent.MyAgent --tag tools --tag regression
Scenarios without any matching tags are skipped. If no --tag flags are provided, all scenarios are executed.
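The OR semantics can be illustrated with a small set intersection. This is only a sketch of the matching rule as described above; `matches_tags` is a hypothetical helper, not a function exposed by AgenticAssure.

```python
def matches_tags(scenario_tags, requested_tags) -> bool:
    """Return True if the scenario should run under the given --tag values."""
    if not requested_tags:
        # No --tag flags: every scenario is executed.
        return True
    # OR logic: a single shared tag is enough to include the scenario.
    return bool(set(scenario_tags) & set(requested_tags))
```

For example, a scenario tagged tools and basic is included by --tag tools --tag regression, while a scenario with no tags is skipped whenever at least one --tag is given.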
Suite Filtering
Use --suite to run only scenarios from a specific named suite. This is useful when a directory contains multiple YAML files (each defining a suite) and you want to target just one.
agenticassure run scenarios/ --adapter my_agent.MyAgent --suite search-agent-tests
If the named suite is not found among the loaded files, AgenticAssure prints an error and exits with code 1.
Timeout and Retry
Timeout
The --timeout flag sets the default timeout in seconds for each scenario. If a scenario’s adapter call exceeds this duration, it is marked as failed.
agenticassure run scenarios/ --adapter my_agent.MyAgent --timeout 60
Retry
The --retry flag specifies how many times to retry a failed scenario before marking it as failed. This is useful for handling non-deterministic LLM responses.
# Retry each failed scenario up to 2 times
agenticassure run scenarios/ --adapter my_agent.MyAgent --retry 2
With --retry 2, a scenario is attempted up to 3 times total (1 initial run + 2 retries).
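The attempt arithmetic is easy to get wrong by one, so a short sketch may help. It assumes a hypothetical `attempt` callable that returns True on a pass, and it stops at the first passing attempt; this models the counting described above rather than reproducing AgenticAssure's runner.

```python
from typing import Callable


def run_with_retries(attempt: Callable[[], bool], retries: int = 0) -> tuple[bool, int]:
    """Run `attempt` up to 1 + retries times; return (passed, attempts_used)."""
    attempts_used = 0
    for _ in range(1 + retries):  # 1 initial run + N retries
        attempts_used += 1
        if attempt():
            # Stop as soon as one attempt passes.
            return True, attempts_used
    # Every attempt failed: the scenario is marked as failed.
    return False, attempts_used
```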
Dry-Run Mode
The --dry-run flag validates and loads all scenario files, then displays them in a summary table without executing anything. No adapter is required.
agenticassure run scenarios/ --dry-run
Output:
Loaded 5 scenario(s) from 2 suite(s)
Scenarios (dry run)
┌──────────────────┬───────────────┬─────────────────┬──────────┬───────┐
│ Suite │ Scenario │ Input │ Scorers │ Tags │
├──────────────────┼───────────────┼─────────────────┼──────────┼───────┤
│ search-tests │ weather_query │ What is the... │ passfail │ tools │
│ search-tests │ greeting │ Hello, how... │ passfail │ basic │
└──────────────────┴───────────────┴─────────────────┴──────────┴───────┘
5 scenario(s) found
Dry-run mode is useful for:
- Verifying that scenario files parse correctly before running.
- Confirming which scenarios match a given --tag or --suite filter.
- Checking scenario coverage without incurring LLM API costs.
Tag filtering works in dry-run mode:
agenticassure run scenarios/ --dry-run --tag smoke
Exit Codes
| Exit Code | Meaning |
|---|---|
| 0 | All executed scenarios passed |
| 1 | At least one scenario failed, or an error occurred (invalid path, suite not found, adapter import failed) |
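In a CI job written in Python, these exit codes can be consumed through the standard subprocess module. The wrapper below is a hedged sketch: `run_and_report` is a hypothetical helper, and it relies only on the two exit codes listed in the table. Because exit code 1 covers both scenario failures and errors, the wrapper cannot tell them apart.

```python
import subprocess
import sys


def run_and_report(command: list[str]) -> int:
    """Run a command, print a one-line verdict, and return its exit code."""
    completed = subprocess.run(command)
    if completed.returncode == 0:
        print("All executed scenarios passed.")
    else:
        # Non-zero means a scenario failed or an error occurred.
        print(f"Run failed or errored (exit code {completed.returncode}).", file=sys.stderr)
    return completed.returncode


# Example invocation in a CI script (requires agenticassure to be installed):
# sys.exit(run_and_report(
#     ["agenticassure", "run", "scenarios/", "--adapter", "my_agent.MyAgent"]
# ))
```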
When more than one suite is loaded, an overall summary is printed:
Overall: 8/10 scenarios passed across 3 suite(s)
Examples
Basic run with adapter flag:
agenticassure run scenarios/ --adapter my_agent.MyAgent
Run a single file with HTML output:
agenticassure run scenarios/search_tests.yaml --adapter my_agent.MyAgent --output html
Run with retries, longer timeout, and tag filter:
agenticassure run scenarios/ \
--adapter my_agent.MyAgent \
--timeout 60 \
--retry 2 \
--tag regression
Run with adapter from config file (no --adapter flag needed):
# Assuming agenticassure.yaml exists with adapter: my_agent.MyAgent
agenticassure run scenarios/
Dry-run to preview what will execute:
agenticassure run scenarios/ --dry-run --suite search-agent-tests --tag tools
JSON output for CI pipelines:
agenticassure run scenarios/ --adapter my_agent.MyAgent --output json
What’s Next
- CLI Report — Understanding the terminal output.
- HTML Report — Generating and sharing HTML reports.
- JSON Report — Structured output for programmatic consumption.
- Adapters — How to write an adapter for your agent.