Changelog

All notable changes to AgenticAssure are documented in this file.

This project uses Semantic Versioning. During the 0.x series, the API is under active development and may change between minor versions.


v0.3.0

Release date: 2025

Bug Fixes

  • Fixed similarity scorer not registering. The similarity module was not imported in scorers/__init__.py, so the SimilarityScorer was never registered even when sentence-transformers was installed. The scorer now registers correctly when the optional dependency is available.

Details

  • Added the import statement from agenticassure.scorers.similarity import SimilarityScorer to scorers/__init__.py, guarded by contextlib.suppress(ImportError) so that the package still imports cleanly when sentence-transformers is not installed.
  • The similarity scorer now appears in list_scorers() output when sentence-transformers is installed.

v0.2.0

Release date: 2025

Bug Fixes

  • Fixed truncation in CLI reports. The details column in Rich table output was being truncated with "…" for long text. Details are now displayed in full.
  • Fixed truncation in HTML reports. CSS text-overflow: ellipsis was clipping long content in HTML report cells. Removed the truncation so full text is visible.
  • Fixed truncation in CLI list and dry-run output. The input column in the list command and --dry-run output was being truncated. Inputs are now shown in full.
  • Fixed truncation in JSON list output. JSON output from the list command was truncating string values. Full values are now preserved.

v0.1.0

Release date: 2025

Initial Release

The first public release of AgenticAssure, providing a complete framework for testing and benchmarking LLM-powered AI agents.

Features

Data Models

  • Scenario, Suite, SuiteConfig — Test scenario and suite definitions using Pydantic v2.
  • ToolCall, TokenUsage, AgentResult — Agent execution result models.
  • ScoreResult, ScenarioRunResult, RunResult — Scoring and aggregation result models.

YAML Loader

  • load_scenarios() — Load a test suite from a YAML file.
  • load_scenarios_from_dir() — Recursively load all YAML files from a directory.
  • validate_scenario_file() — Validate YAML files and return a list of issues.
  • validate_with_schema() — Validate parsed data against the built-in JSON Schema.
  • SUITE_SCHEMA — JSON Schema (Draft 2020-12) for scenario file validation.
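
As a rough illustration of validation that returns a list of issues rather than raising on the first problem, here is a standard-library sketch. The function collect_issues and the field names name, scenarios, and input are assumptions made for this example; they are not taken from SUITE_SCHEMA or from the real validate_scenario_file().

```python
from typing import Any


def collect_issues(suite: Any) -> list[str]:
    """Return human-readable problems found in already-parsed suite data.

    Hypothetical rules: a suite is a mapping with a string 'name' and a
    non-empty 'scenarios' list whose items each carry a string 'input'.
    """
    if not isinstance(suite, dict):
        return ["top level must be a mapping"]

    issues: list[str] = []
    if not isinstance(suite.get("name"), str):
        issues.append("suite: 'name' must be a string")

    scenarios = suite.get("scenarios")
    if not isinstance(scenarios, list) or not scenarios:
        issues.append("suite: 'scenarios' must be a non-empty list")
        return issues

    # Keep going after the first bad scenario so every problem is reported.
    for index, scenario in enumerate(scenarios):
        if not isinstance(scenario, dict) or not isinstance(scenario.get("input"), str):
            issues.append(f"scenarios[{index}]: 'input' must be a string")
    return issues


ok = collect_issues({"name": "smoke", "scenarios": [{"input": "hello"}]})
bad = collect_issues({"scenarios": [{}]})
```

Returning a list lets a caller such as a CLI validate command print all problems in one pass.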

Adapters

  • AgentAdapter protocol — Runtime-checkable protocol for agent adapters.
  • OpenAIAdapter — Built-in adapter for OpenAI chat completions with function calling support.
  • LangChainAdapter — Built-in adapter for LangChain AgentExecutor.
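
For readers unfamiliar with runtime-checkable protocols, the sketch below shows the general Python mechanism. The protocol name AdapterLike and its single run method are hypothetical; the actual AgentAdapter protocol may define different members.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class AdapterLike(Protocol):
    """Hypothetical stand-in for an adapter protocol."""

    def run(self, prompt: str) -> str: ...


class EchoAdapter:
    # No inheritance needed: having a matching method is enough.
    def run(self, prompt: str) -> str:
        return prompt.upper()


class NotAnAdapter:
    pass


is_adapter = isinstance(EchoAdapter(), AdapterLike)
is_not_adapter = isinstance(NotAnAdapter(), AdapterLike)
```

Note that isinstance() against a runtime-checkable protocol only checks that the named members exist; it does not verify argument or return types.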

Scorers

  • Scorer protocol — Runtime-checkable protocol for custom scorers.
  • PassFailScorer (passfail) — Checks non-empty output, expected tools, expected tool arguments, and expected output substring.
  • ExactMatchScorer (exact) — Exact string comparison with optional normalization.
  • RegexScorer (regex) — Regex pattern matching from scenario metadata.
  • SimilarityScorer (similarity) — Semantic similarity using sentence-transformers (optional dependency).
  • Scorer registry with register_scorer(), get_scorer(), and list_scorers().
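
The exact-match and regex ideas listed above can be illustrated with a small standalone sketch. Everything here is an assumption for illustration: the Score dataclass, the function names, and in particular the normalization rule (lowercasing and collapsing whitespace), which may differ from what ExactMatchScorer actually does.

```python
import re
from dataclasses import dataclass


@dataclass
class Score:
    """Hypothetical result type: pass/fail plus a short explanation."""

    passed: bool
    explanation: str


def exact_match(output: str, expected: str, normalize: bool = True) -> Score:
    """Compare strings exactly, optionally after a simple normalization."""
    if normalize:
        # Assumed normalization: case-fold and collapse runs of whitespace.
        output = " ".join(output.lower().split())
        expected = " ".join(expected.lower().split())
    ok = output == expected
    return Score(ok, "outputs match" if ok else "outputs differ")


def regex_match(output: str, pattern: str) -> Score:
    """Pass when the pattern is found anywhere in the output."""
    ok = re.search(pattern, output) is not None
    return Score(ok, f"pattern {pattern!r} {'found' if ok else 'not found'}")


# A name-to-callable mapping, loosely mirroring the idea of a scorer registry.
scorers = {"exact": exact_match, "regex": regex_match}

loose = scorers["exact"]("  Hello   World ", "hello world")
strict = exact_match("Hello", "hello", normalize=False)
found = scorers["regex"]("order #1234 shipped", r"#\d{4}")
```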

Runner

  • Runner class — Sequential test runner with retry logic and fail-fast support.
  • run_suite() — Execute all scenarios in a suite with optional tag filtering and context passing.
  • run_scenario() — Execute a single scenario.
  • Suite-level config override support.
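
Sequential execution with retries and fail-fast can be sketched as below. This is not the Runner implementation; the function run_all, its parameters, and the representation of a scenario as a (name, callable) pair are hypothetical simplifications.

```python
from typing import Callable, Iterable


def run_all(
    scenarios: Iterable[tuple[str, Callable[[], bool]]],
    retries: int = 1,
    fail_fast: bool = False,
) -> dict[str, bool]:
    """Run checks one at a time, retrying failures up to `retries` extra times."""
    results: dict[str, bool] = {}
    for name, check in scenarios:
        passed = False
        for _attempt in range(retries + 1):
            try:
                passed = check()
            except Exception:
                # Treat an exception as a failed attempt and retry.
                passed = False
            if passed:
                break
        results[name] = passed
        if fail_fast and not passed:
            # Stop at the first scenario that still fails after all retries.
            break
    return results


calls = {"count": 0}


def flaky() -> bool:
    """Fails on the first call, passes afterwards."""
    calls["count"] += 1
    return calls["count"] >= 2


results = run_all(
    [("flaky", flaky), ("broken", lambda: False), ("skipped", lambda: True)],
    retries=1,
    fail_fast=True,
)
```

In this sketch the flaky check passes on its retry, the broken check exhausts its attempts, and fail-fast prevents the third scenario from running at all.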

CLI

  • agenticassure run — Run test scenarios with output format selection (cli, json, html).
  • agenticassure init — Scaffold a new project with example scenarios.
  • agenticassure validate — Validate YAML scenario files without running them.
  • agenticassure list — List all scenarios with metadata (table or JSON output).
  • --dry-run mode for validation without execution.
  • --tag filtering for selective test execution.
  • Config file support (agenticassure.yaml / agenticassure.toml) for adapter specification.

Reports

  • CLIReporter — Rich terminal output with colored tables.
  • HTMLReporter — Self-contained HTML report files.
  • JSONReporter — Machine-readable JSON report files.

Testing

  • 48 tests passing.
  • Zero ruff violations.
  • Black-formatted codebase.