Changelog

All notable changes to AgenticAssure are documented in this file.

This project uses Semantic Versioning. During the 0.x series, the API is under active development and may change between minor versions.

v0.3.0

Release date: 2025

Fixed similarity scorer not registering. The similarity module was not imported in scorers/__init__.py, so the SimilarityScorer was never registered even when sentence-transformers was installed. The scorer now registers correctly when the optional dependency is available.

Added the statement "from agenticassure.scorers.similarity import SimilarityScorer" to scorers/__init__.py. The import is guarded by contextlib.suppress(ImportError), so it does not fail when sentence-transformers is not installed.
The similarity scorer now appears in list_scorers() output when sentence-transformers is installed.
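
The guarded-import pattern behind this fix can be illustrated with a minimal, standalone sketch. It does not reproduce the actual contents of scorers/__init__.py: the module names and the "registered" list below are stand-ins chosen for illustration only.

```python
import contextlib

# Stand-in for a registry of available optional features (hypothetical).
registered = []

# Case 1: the dependency is installed, so the import succeeds and the
# rest of the block runs. `json` stands in for an installed optional package.
with contextlib.suppress(ImportError):
    import json  # noqa: F401
    registered.append("json")

# Case 2: the dependency is missing. The import raises ImportError
# (ModuleNotFoundError is a subclass), suppress() swallows it, and the
# remaining lines of the block are skipped.
with contextlib.suppress(ImportError):
    import a_module_that_does_not_exist_xyz  # hypothetical, assumed absent
    registered.append("a_module_that_does_not_exist_xyz")

print(registered)  # ['json']
```

The trade-off of this pattern is that a genuinely broken import inside the optional module is also silenced, which is presumably acceptable for an optional dependency.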

Release date: 2025

Fixed truncation in CLI reports. The details column in Rich table output was being truncated with "…" for long text. Details are now displayed in full.
Fixed truncation in HTML reports. CSS text-overflow: ellipsis was clipping long content in HTML report cells. Removed the truncation so full text is visible.
Fixed truncation in CLI list and dry-run output. The input column in the list command and --dry-run output was being truncated. Inputs are now shown in full.
Fixed truncation in JSON list output. JSON output from the list command was truncating string values. Full values are now preserved.

Release date: 2025

The first public release of AgenticAssure, providing a complete framework for testing and benchmarking LLM-powered AI agents.

Scenario, Suite, SuiteConfig — Test scenario and suite definitions using Pydantic v2.
ToolCall, TokenUsage, AgentResult — Agent execution result models.
ScoreResult, ScenarioRunResult, RunResult — Scoring and aggregation result models.

load_scenarios() — Load a test suite from a YAML file.
load_scenarios_from_dir() — Recursively load all YAML files from a directory.
validate_scenario_file() — Validate YAML files and return a list of issues.
validate_with_schema() — Validate parsed data against the built-in JSON Schema.
SUITE_SCHEMA — JSON Schema (Draft 2020-12) for scenario file validation.

AgentAdapter protocol — Runtime-checkable protocol for agent adapters.
OpenAIAdapter — Built-in adapter for OpenAI chat completions with function calling support.
LangChainAdapter — Built-in adapter for LangChain AgentExecutor.

Scorer protocol — Runtime-checkable protocol for custom scorers.
PassFailScorer (passfail) — Checks non-empty output, expected tools, expected tool arguments, and expected output substring.
ExactMatchScorer (exact) — Exact string comparison with optional normalization.
RegexScorer (regex) — Regex pattern matching from scenario metadata.
SimilarityScorer (similarity) — Semantic similarity using sentence-transformers (optional dependency).
Scorer registry with register_scorer(), get_scorer(), and list_scorers().
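
As a rough illustration of how a runtime-checkable protocol and a name-based registry fit together, here is a self-contained sketch. Every name in it (ToyScorer, toy_register, toy_get, toy_list, and the score() signature) is hypothetical; this changelog does not list the members of the real Scorer protocol or the signatures of register_scorer(), get_scorer(), and list_scorers(), so those may differ.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ToyScorer(Protocol):
    """Hypothetical protocol; the real Scorer protocol's members may differ."""

    def score(self, output: str, expected: str) -> bool: ...


class ToyExactScorer:
    """Satisfies ToyScorer structurally, without inheriting from it."""

    def score(self, output: str, expected: str) -> bool:
        return output.strip() == expected.strip()


# Hypothetical registry mapping short names to scorer instances.
_toy_registry: dict[str, ToyScorer] = {}


def toy_register(name: str, scorer: ToyScorer) -> None:
    # isinstance() works here only because the protocol is runtime-checkable.
    # It checks that a `score` attribute exists, not that its signature matches.
    if not isinstance(scorer, ToyScorer):
        raise TypeError(f"{scorer!r} does not implement score()")
    _toy_registry[name] = scorer


def toy_get(name: str) -> ToyScorer:
    return _toy_registry[name]


def toy_list() -> list[str]:
    return sorted(_toy_registry)


toy_register("exact", ToyExactScorer())
print(toy_list())
print(toy_get("exact").score(" hello ", "hello"))
```

Because the check is structural, a custom scorer in this sketch needs no base class; any object with a score() method is accepted.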

Runner class — Sequential test runner with retry logic and fail-fast support.
run_suite() — Execute all scenarios in a suite with optional tag filtering and context passing.
run_scenario() — Execute a single scenario.
Suite-level config override support.
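
The combination of sequential execution, retries, and fail-fast can be sketched as below. This is not the Runner implementation: the function name run_all, its parameters, and the representation of a scenario as a (name, callable) pair are all assumptions made for illustration.

```python
from typing import Callable


def run_all(
    scenarios: list[tuple[str, Callable[[], bool]]],
    retries: int = 1,
    fail_fast: bool = False,
) -> dict[str, bool]:
    """Run checks in order; retry failures; optionally stop at the first failure."""
    results: dict[str, bool] = {}
    for name, check in scenarios:
        passed = False
        # One initial attempt plus up to `retries` further attempts.
        for _attempt in range(retries + 1):
            if check():
                passed = True
                break
        results[name] = passed
        # Fail-fast: skip all remaining scenarios after a final failure.
        if fail_fast and not passed:
            break
    return results


# A check that fails on its first call and passes on its second.
attempts = {"count": 0}


def flaky() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 2


results = run_all(
    [("flaky", flaky), ("always_fails", lambda: False), ("never_reached", lambda: True)],
    retries=1,
    fail_fast=True,
)
print(results)  # {'flaky': True, 'always_fails': False}
```

In this sketch the flaky check passes on its retry, while the second scenario exhausts its attempts and triggers fail-fast, so the third scenario is never run.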

agenticassure run — Run test scenarios with output format selection (cli, json, html).
agenticassure init — Scaffold a new project with example scenarios.
agenticassure validate — Validate YAML scenario files without running them.
agenticassure list — List all scenarios with metadata (table or JSON output).
--dry-run mode for validation without execution.
--tag filtering for selective test execution.
Config file support (agenticassure.yaml / agenticassure.toml) for adapter specification.