Troubleshooting
This guide covers common errors, debugging strategies, YAML pitfalls, and frequently asked questions when working with AgenticAssure.
Common Errors and Solutions
“Unknown scorer ‘X’. Available: […]”
Full error:
```
KeyError: "Unknown scorer 'similarity'. Available: ['passfail', 'exact', 'regex']"
```

Cause: The scorer name referenced in your YAML or code is not registered. This happens when:
- The scorer name is misspelled in your YAML file.
- The scorer requires an optional dependency that is not installed.
- A custom scorer was not registered before running tests.
Solutions:
- Misspelling: Check the scorer name. Built-in scorer names are `passfail`, `exact`, `regex`, `similarity`.
- Missing dependency (similarity scorer): The `similarity` scorer requires `sentence-transformers`. Install it: `pip install agenticassure[similarity]`
- Custom scorer not registered: Make sure your custom scorer module is imported before the runner executes. Register it with `register_scorer()`:

  ```python
  from agenticassure.scorers.base import register_scorer

  register_scorer(MyCustomScorer())
  ```

- Verify available scorers:

  ```python
  from agenticassure.scorers.base import list_scorers

  print(list_scorers())
  ```
“Additional properties are not allowed (‘X’ was unexpected)”
Full error:

```
ValueError: Schema validation failed for scenarios/test.yaml:
  Schema: scenarios.0: Additional properties are not allowed ('scorer' was unexpected)
```

Cause: The YAML file contains a property name that is not in the JSON Schema. The schema uses `additionalProperties: false` at every level, so only known fields are accepted.
Common variations:
| Mistake | Correction |
|---|---|
| `scorer: passfail` | `scorers: [passfail]` (must be a list, plural) |
| `timeout: 30` | `timeout_seconds: 30` |
| `expected: "hello"` | `expected_output: "hello"` |
| `tool_args: {...}` | `expected_tool_args: {...}` |
| `tools: [...]` | `expected_tools: [...]` |
| Any custom field at root | Move it inside `metadata: {...}` |
Solution: Check the exact field names against the schema reference. If you need custom data, put it inside the `metadata` field, which accepts arbitrary key-value pairs.
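If you want to catch unknown keys before running validation, a small pre-check can help. The helper below is hypothetical (it is not part of AgenticAssure), and the `KNOWN_FIELDS` set is reconstructed from the field names mentioned in this guide rather than the authoritative schema:

```python
# Hypothetical pre-check: flag scenario keys that additionalProperties: false
# would reject. KNOWN_FIELDS is an assumption based on the fields shown in
# this guide, not the authoritative JSON Schema.
KNOWN_FIELDS = {
    "name", "input", "expected_output", "expected_tools",
    "expected_tool_args", "timeout_seconds", "scorers", "tags", "metadata",
}

def find_unknown_fields(scenario: dict) -> list[str]:
    """Return top-level keys the schema would likely reject."""
    return sorted(k for k in scenario if k not in KNOWN_FIELDS)

scenario = {"name": "t", "input": "Hello", "scorer": "passfail", "owner": "qa"}
print(find_unknown_fields(scenario))  # ['owner', 'scorer']
```

Anything this flags should either be renamed to a known field or moved under `metadata`.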
Schema Validation Errors
Full error:
```
ValueError: Schema validation failed for test.yaml:
  Schema: scenarios.0.timeout_seconds: 0 is not valid under any of the given schemas
  Schema: scenarios: [] should be non-empty
```

Common causes:
| Error message | Cause | Fix |
|---|---|---|
| `'name' is a required property` | Scenario missing `name` field | Add a `name` to every scenario |
| `'input' is a required property` | Scenario missing `input` field | Add an `input` to every scenario |
| `'scenarios' is a required property` | Missing top-level `scenarios` key | Add `scenarios:` at the root level |
| `[] should be non-empty` | Empty `scenarios` list | Add at least one scenario |
| `X is not of type 'string'` | Wrong type for a field | Check field types (e.g., `input` must be a string, not a number) |
| `X is not of type 'array'` | A list field has the wrong type | Fields like `scorers`, `tags`, `expected_tools` must be YAML lists |
| `0 is not valid under...` | `timeout_seconds` is zero or negative | Use a positive value |
“Could not import module ‘X’”
Full error:
```
Error: Could not import module 'mymodule': No module named 'mymodule'
Make sure the module is installed or on your PYTHONPATH.
```

Cause: The adapter path provided via `--adapter` or config file points to a Python module that cannot be imported.
Solutions:
- Module not installed: Make sure the package containing your adapter is installed in the current Python environment:

  ```shell
  pip install -e .
  ```

- Wrong PYTHONPATH: If your adapter is in a local file, make sure the current working directory is on `PYTHONPATH`:

  ```shell
  # On Linux/macOS
  export PYTHONPATH="$PYTHONPATH:$(pwd)"

  # On Windows (PowerShell)
  $env:PYTHONPATH = "$env:PYTHONPATH;$(Get-Location)"
  ```

- Wrong dotted path: The adapter path must be in the format `module.ClassName`. For example, if your class `MyAgent` is in `my_project/agent.py`, the path is `my_project.agent.MyAgent`.
- Virtual environment not activated: Make sure you are running in the correct virtual environment where your dependencies are installed.
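You can reproduce the dotted-path lookup by hand to see exactly where it breaks. `resolve_adapter` below is a hypothetical sketch of that lookup using the standard library, not AgenticAssure's actual loader:

```python
import importlib

def resolve_adapter(dotted_path: str):
    """Split 'module.ClassName' into module + attribute and import it."""
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"Expected 'module.ClassName', got {dotted_path!r}")
    module = importlib.import_module(module_path)  # ImportError if not importable
    return getattr(module, class_name)             # AttributeError if class absent

# Works with any importable class, e.g. one from the standard library:
cls = resolve_adapter("collections.OrderedDict")
print(cls.__name__)  # OrderedDict
```

If the `import_module` step fails here, the problem is your environment or `PYTHONPATH`, not AgenticAssure.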
“does not implement the AgentAdapter protocol”
Full error:
```
Error: 'mymodule.MyAgent' does not implement the AgentAdapter protocol.
It must have a run(input, context=None) -> AgentResult method.
```

Cause: Your adapter class exists and can be imported, but it does not have the correct `run()` method signature.
Requirements for the AgentAdapter protocol:
```python
from typing import Any

from agenticassure.results import AgentResult

class MyAgent:
    def run(self, input: str, context: dict[str, Any] | None = None) -> AgentResult:
        ...
```

Common mistakes:
- Method named something other than `run` (e.g., `execute`, `invoke`).
- Missing the `context` parameter.
- Returning a plain string instead of an `AgentResult` object.
- Method is a `@staticmethod` or `@classmethod` instead of a regular method.
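A quick duck-type check can catch most of these mistakes before a full run. `looks_like_adapter` is a hypothetical sketch mirroring the error message above, not AgenticAssure's actual validation:

```python
import inspect

def looks_like_adapter(obj) -> bool:
    """Rough duck-type check: does obj have a callable run(input, context=...)?
    (Hypothetical helper, not part of AgenticAssure.)"""
    run = getattr(obj, "run", None)
    if not callable(run):
        return False
    params = inspect.signature(run).parameters
    return "input" in params and "context" in params

class Good:
    def run(self, input, context=None):
        return input

class Bad:
    def execute(self, input):  # wrong method name
        return input

print(looks_like_adapter(Good()))  # True
print(looks_like_adapter(Bad()))   # False
```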
ImportError for sentence-transformers
Full error:
```
ImportError: sentence-transformers is required for SimilarityScorer.
Install it with: pip install agenticassure[similarity]
```

Cause: The similarity scorer was referenced but the `sentence-transformers` package is not installed. It is an optional dependency to keep the base package lightweight.
Solution:
```shell
pip install agenticassure[similarity]
```

This installs `sentence-transformers` and its dependencies (including PyTorch). Note that this is a large dependency tree.

If you do not want to install `sentence-transformers`, remove `similarity` from your scenario scorers and use other scorers instead (e.g., `exact`, `regex`, or `passfail`).
“No ‘regex_pattern’ found in scenario metadata”
Full error (in ScoreResult explanation):
```
No 'regex_pattern' found in scenario metadata
```

Cause: A scenario uses the `regex` scorer but does not have a `regex_pattern` key in its `metadata`.
Solution: Add the pattern to your scenario’s metadata:
```yaml
scenarios:
  - name: pattern_test
    input: "Generate a code"
    metadata:
      regex_pattern: "[A-Z]{3}-\\d{4}"
    scorers:
      - regex
```

Note the double backslash in YAML for regex escapes. See YAML Gotchas below.
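You can confirm what pattern actually reaches the scorer by compiling it in plain Python. The double-escaped YAML string above corresponds to the raw pattern below:

```python
import re

# "[A-Z]{3}-\\d{4}" in double-quoted YAML arrives as the raw pattern below.
pattern = re.compile(r"[A-Z]{3}-\d{4}")

print(bool(pattern.search("Your code is ABC-1234.")))  # True
print(bool(pattern.search("Your code is abc-1234.")))  # False (lowercase letters)
```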
Empty YAML File Errors
Full error:
```
ValueError: Empty YAML file: scenarios/empty.yaml
```

Cause: The YAML file is empty, contains only whitespace, or contains only comments.
Solution: Add at least a scenarios list with one scenario:
```yaml
scenarios:
  - name: example
    input: "Hello"
```

YAML Parse Errors
Full error:
```
YAML parse error: while parsing a block mapping
  in "test.yaml", line 3, column 3
expected <block end>, but found '<scalar>'
  in "test.yaml", line 4, column 5
```

Cause: The YAML syntax is malformed. Common causes include:
- Incorrect indentation (YAML uses spaces, not tabs).
- Missing colons after keys.
- Unquoted strings that contain special characters (`:`, `#`, `{`, `}`, `[`, `]`).
- Mixing indentation levels within the same block.
Solution: Validate your YAML with a linter or the built-in validate command:
```shell
agenticassure validate scenarios/test.yaml
```

See YAML Gotchas for common YAML pitfalls.
Timeout Errors
Symptoms: Scenarios take a long time and eventually fail, or the process hangs.
Note: The current version of AgenticAssure does not enforce timeouts at the runner level (the `timeout_seconds` field is available for future use and for adapters to read). Long-running scenarios will block until the adapter’s `run()` method returns or the underlying HTTP client times out.
Solutions:
- Set timeouts in your adapter or LLM client:

  ```python
  import openai

  client = openai.OpenAI(timeout=30.0)
  ```

- Use shorter timeout values in your LLM provider configuration.
- Add retry logic via the `retries` setting to recover from transient timeouts:

  ```yaml
  suite:
    name: tests
    config:
      retries: 2
  ```
Connection Errors to LLM APIs
Symptoms:
```
ConnectionError: Error communicating with the OpenAI API
```

or

```
openai.APIConnectionError: Connection error.
```

Solutions:
- Check your API key is set correctly:

  ```shell
  echo $OPENAI_API_KEY    # Linux/macOS
  echo %OPENAI_API_KEY%   # Windows CMD
  ```

- Check network connectivity — ensure you can reach the API endpoint.
- Check rate limits — if you are running many scenarios, you may hit rate limits. Add retries:

  ```shell
  agenticassure run scenarios/ --adapter mymodule.MyAgent --retry 2
  ```

- Use a proxy if you are behind a corporate firewall. Configure it via environment variables:

  ```shell
  export HTTPS_PROXY=http://proxy.example.com:8080
  ```
HF Hub Rate Limit Warnings
Symptoms:
```
huggingface_hub.utils._errors.HfHubHTTPError: 429 Client Error: Too Many Requests
```

or frequent warnings about rate limiting when using the similarity scorer.

Solution: Set the `HF_TOKEN` environment variable with a Hugging Face access token to get higher rate limits:

```shell
export HF_TOKEN=hf_your_token_here
```

You can create a token at https://huggingface.co/settings/tokens.
Debugging Tips
Use --dry-run to Validate Without Running
The `--dry-run` flag loads and validates your scenario files, then displays a summary table without executing any scenarios. This is useful for catching YAML errors and verifying tag filters.

```shell
agenticassure run scenarios/ --dry-run
agenticassure run scenarios/ --dry-run --tag smoke
```

If no adapter is configured, the run command automatically falls back to dry-run behavior.
Use the validate Command
The validate command checks YAML files for structural and semantic issues:
```shell
# Validate a single file
agenticassure validate scenarios/test.yaml

# Validate all YAML files in a directory
agenticassure validate scenarios/
```

Output shows OK for valid files and FAIL with specific issues for invalid ones.
Check Scorer Registration with list_scorers()
If you are unsure which scorers are available in your environment, check at runtime:
```python
from agenticassure.scorers.base import list_scorers

print(list_scorers())
```

Expected output with all dependencies installed:

```
['passfail', 'exact', 'regex', 'similarity']
```

If `similarity` is missing, install `sentence-transformers`:

```shell
pip install agenticassure[similarity]
```

Test Your Adapter Independently
Before using your adapter with AgenticAssure, test it in isolation:
```python
from agenticassure import AgentResult
from my_module import MyAgent

agent = MyAgent()
result = agent.run("Hello, how are you?")

# Verify it returns an AgentResult
assert isinstance(result, AgentResult), f"Expected AgentResult, got {type(result)}"
print(f"Output: {result.output}")
print(f"Tool calls: {result.tool_calls}")
```

Inspect Results Programmatically
For deeper debugging, use the Python API instead of the CLI:
```python
from agenticassure.loader import load_scenarios
from agenticassure.runner import Runner

suite = load_scenarios("scenarios/test.yaml")
runner = Runner(adapter=my_adapter)
result = runner.run_suite(suite)

for sr in result.scenario_results:
    print(f"\n--- {sr.scenario.name} ---")
    print(f"Passed: {sr.passed}")
    print(f"Duration: {sr.duration_ms:.0f}ms")
    print(f"Agent output: {sr.agent_result.output[:200]}")
    if sr.error:
        print(f"Error: {sr.error}")
    for score in sr.scores:
        print(f"  Scorer '{score.scorer_name}': score={score.score}, passed={score.passed}")
        print(f"  Explanation: {score.explanation}")
        if score.details:
            print(f"  Details: {score.details}")
```

Use JSON Output for CI Integration
The JSON output format provides machine-readable results for CI pipelines:
```shell
agenticassure run scenarios/ --adapter mymodule.MyAgent --output json
```

This writes a `results_<run_id>.json` file that can be parsed by downstream tools.
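A CI step can then parse that file and fail the build on any failed scenario. The key names used below (`scenario_results`, `scenario`/`name`, `passed`) are assumptions about the JSON layout; inspect a real results file for the actual field names before relying on them:

```python
import json  # in CI: results = json.load(open("results_<run_id>.json"))

def failed_scenarios(results: dict) -> list[str]:
    """Names of failed scenarios. Key names are assumed, not confirmed."""
    return [r["scenario"]["name"]
            for r in results.get("scenario_results", [])
            if not r.get("passed")]

# Illustrative data standing in for a loaded results file:
results = {
    "scenario_results": [
        {"scenario": {"name": "greeting"}, "passed": True},
        {"scenario": {"name": "math"}, "passed": False},
    ]
}
print(failed_scenarios(results))  # ['math']
```

Exiting nonzero when this list is non-empty is enough to fail most CI pipelines.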
YAML Gotchas
Backslash Escaping in Regex Patterns
YAML interprets backslashes in double-quoted strings. When writing regex patterns, you need to double-escape:
```yaml
# WRONG -- YAML interprets \d as an escape sequence
metadata:
  regex_pattern: "\d{3}-\d{4}"

# CORRECT -- double backslash in double-quoted strings
metadata:
  regex_pattern: "\\d{3}-\\d{4}"

# ALSO CORRECT -- single-quoted strings do not process escapes
metadata:
  regex_pattern: '\d{3}-\d{4}'

# ALSO CORRECT -- unquoted (works for simple patterns)
metadata:
  regex_pattern: \d{3}-\d{4}
```

For complex regex patterns, single-quoted strings are recommended because they preserve backslashes literally.
String Quoting
YAML has nuanced rules about when strings need quoting:
```yaml
# These are fine unquoted
input: Hello world
input: What is the weather?

# These NEED quoting (special characters)
input: "What is the status of order #123?"  # '#' starts a comment
input: "key: value"                         # ':' followed by space is a mapping
input: "Use [brackets] carefully"           # '[' starts a flow sequence
input: "{braces} too"                       # '{' starts a flow mapping
input: "yes"                                # Without quotes, YAML reads this as boolean true
input: "3.14"                               # Without quotes, YAML reads this as a float
input: "null"                               # Without quotes, YAML reads this as null/None
```

When in doubt, use double quotes around your strings.
Indentation
YAML uses spaces for indentation (tabs are not allowed). Inconsistent indentation causes parse errors.
```yaml
# CORRECT -- consistent 2-space indentation
scenarios:
  - name: test
    input: "Hello"
    scorers:
      - passfail

# WRONG -- tab indentation (invisible but breaks YAML)
scenarios:
	- name: test
	  input: "Hello"

# WRONG -- inconsistent indentation
scenarios:
  - name: test
   input: "Hello"  # 3 spaces instead of 4
    scorers:
      - passfail
```

scorers (List) vs scorer (Invalid)
A common mistake is using the singular form:
```yaml
# WRONG -- "scorer" is not a recognized field
scenarios:
  - name: test
    input: "Hello"
    scorer: passfail

# CORRECT -- must be "scorers" (plural) with list syntax
scenarios:
  - name: test
    input: "Hello"
    scorers:
      - passfail

# ALSO CORRECT -- inline list syntax
scenarios:
  - name: test
    input: "Hello"
    scorers: [passfail, exact]
```

Using `scorer` will produce the error: `Additional properties are not allowed ('scorer' was unexpected)`.
Multiline Strings
YAML supports multiline strings with | (literal block) and > (folded block):
```yaml
scenarios:
  - name: long_prompt
    input: |
      You are a helpful assistant.
      The user wants to know about photosynthesis.
      Please explain it in simple terms.
    expected_output: "photosynthesis"
```

The `|` preserves newlines. The `>` folds newlines into spaces (useful for long paragraphs).
FAQ
Can I use multiple scorers on a single scenario?
Yes. List all desired scorer names in the scorers field. A scenario passes only if all scorers pass.
```yaml
scenarios:
  - name: multi_scored
    input: "What is the capital of France?"
    expected_output: "Paris"
    metadata:
      regex_pattern: "Paris"
    scorers:
      - passfail
      - exact
      - regex
```

How do I test without an LLM? (Mock adapter)
Create a simple adapter that returns canned responses:
```python
from agenticassure import AgentResult

class MockAgent:
    """Returns predefined responses for testing."""

    def __init__(self, responses: dict[str, str] | None = None):
        self.responses = responses or {}
        self.default_response = "This is a mock response."

    def run(self, input: str, context=None) -> AgentResult:
        output = self.responses.get(input, self.default_response)
        return AgentResult(output=output)

# Usage
from agenticassure.loader import load_scenarios
from agenticassure.runner import Runner

agent = MockAgent(responses={
    "Hello": "Hi there! How can I help?",
    "What is 2+2?": "4",
})
runner = Runner(adapter=agent)
suite = load_scenarios("scenarios/test.yaml")
result = runner.run_suite(suite)
```

This is useful for testing your scenario definitions and scorer configurations without incurring LLM API costs.
How do I skip slow tests?
Use tags to categorize scenarios and filter them at runtime:
```yaml
scenarios:
  - name: fast_test
    input: "Quick check"
    tags: [smoke]
    scorers: [passfail]

  - name: slow_integration
    input: "Complex multi-step task"
    tags: [integration, slow]
    scorers: [passfail, similarity]
```

Then run only the fast tests:

```shell
agenticassure run scenarios/ --adapter mymodule.MyAgent --tag smoke
```

Or run specific tag combinations programmatically:

```python
result = runner.run_suite(suite, tags=["smoke"])
```

Can I run scenarios in parallel?
Not currently. The Runner executes scenarios sequentially. Parallel execution may be added in a future release.
If you need parallel execution now, you can split your suites into separate files and run them in parallel processes:
```shell
# Run multiple suites in parallel (bash)
agenticassure run scenarios/suite1.yaml --adapter mymodule.MyAgent &
agenticassure run scenarios/suite2.yaml --adapter mymodule.MyAgent &
wait
```

Or use Python’s `concurrent.futures` with the programmatic API:

```python
from concurrent.futures import ThreadPoolExecutor

from agenticassure.loader import load_scenarios_from_dir
from agenticassure.runner import Runner

suites = load_scenarios_from_dir("scenarios/")

def run_suite(suite):
    runner = Runner(adapter=MyAgent())
    return runner.run_suite(suite)

with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(run_suite, suites))
```

Note: Make sure your adapter is thread-safe if using this approach.
What Python versions are supported?
AgenticAssure requires Python 3.10 or later. It uses features introduced in Python 3.10 such as the `X | Y` union type syntax.
How do I pass extra context to my adapter?
Use the context parameter on run_suite or run_scenario:
```python
result = runner.run_suite(suite, context={
    "user_id": "test-user",
    "session_id": "abc123",
    "temperature": 0.0,
})
```

Your adapter receives this context in its `run()` method:
```python
class MyAgent:
    def run(self, input: str, context=None) -> AgentResult:
        user_id = context.get("user_id") if context else None
        # Use context in your agent logic
        ...
```

How do I use the similarity scorer with a different model?
Programmatically, create a custom SimilarityScorer instance:
```python
from agenticassure.scorers.base import register_scorer
from agenticassure.scorers.similarity import SimilarityScorer

# Override with a different model
custom_scorer = SimilarityScorer(
    model_name="all-mpnet-base-v2",
    threshold=0.8,
)
register_scorer(custom_scorer)  # Replaces the default "similarity" scorer
```

Per-scenario, you can override the threshold via metadata:
```yaml
scenarios:
  - name: strict_similarity
    input: "Explain gravity"
    expected_output: "Gravity is a fundamental force..."
    metadata:
      similarity_threshold: 0.9
    scorers:
      - similarity
```

Why does my scenario fail even though the output looks correct?
Check which scorers are configured and what they are checking:
- passfail with `expected_output` does a case-insensitive substring match. The expected text must appear somewhere in the output.
- exact compares the entire output (normalized by default). Extra text causes a mismatch.
- regex requires a pattern in `metadata`. Without it, the scorer always fails.
- similarity computes semantic similarity. Low scores may indicate the model’s embedding does not consider the texts similar.
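The difference between the first two checks can be illustrated in plain Python. This mirrors the behavior described above; it is not AgenticAssure's actual scorer code, and the trim-and-lowercase normalization in `exact_like` is a stand-in assumption:

```python
def passfail_like(output: str, expected: str) -> bool:
    """Case-insensitive substring match, as described for passfail."""
    return expected.lower() in output.lower()

def exact_like(output: str, expected: str) -> bool:
    """Whole-output comparison (trim + lowercase here is an assumed
    stand-in for whatever normalization the exact scorer applies)."""
    return output.strip().lower() == expected.strip().lower()

output = "The capital of France is Paris."
print(passfail_like(output, "paris"))  # True  (substring found)
print(exact_like(output, "Paris"))     # False (extra text around "Paris")
```

So an output that merely contains the expected text passes passfail but fails exact.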
Use the programmatic API to inspect individual scorer results:
```python
for score in scenario_result.scores:
    print(f"{score.scorer_name}: passed={score.passed}, explanation={score.explanation}")
```

Can I generate reports in multiple formats at once?
The CLI supports one output format per run (--output cli|json|html). To generate multiple formats, run the command multiple times or use the Python API:
```python
from agenticassure.reports import CLIReporter, HTMLReporter, JSONReporter

# Generate all three reports from the same result
CLIReporter().report(result)
HTMLReporter().report(result, output_path="report.html")
JSONReporter().report(result, output_path="results.json")
```