Writing Adapters
An adapter is the bridge between AgenticAssure and your AI agent. It takes a text input, runs your agent, and returns a structured AgentResult containing the output, tool calls, latency, and other metadata. Every agent tested with AgenticAssure needs an adapter.
The AgentAdapter Protocol
AgenticAssure uses Python’s structural subtyping (the Protocol pattern) for adapters. You do not need to inherit from a base class — you just need to implement the run method with the correct signature.
```python
from typing import Any, Protocol, runtime_checkable

from agenticassure.results import AgentResult


@runtime_checkable
class AgentAdapter(Protocol):
    """Protocol that all agent adapters must implement."""

    def run(self, input: str, context: dict[str, Any] | None = None) -> AgentResult:
        """Execute the agent with the given input and return structured results."""
        ...
```

Parameters

- `input` (`str`): The user message or prompt to send to the agent. This comes from the `input` field of the scenario.
- `context` (`dict[str, Any] | None`): Optional context dictionary that can carry additional information into the agent. Defaults to `None`.
Return Value
The method must return an AgentResult instance:
```python
from typing import Any

from agenticassure.results import AgentResult, ToolCall, TokenUsage


class AgentResult:
    output: str                        # Required. The agent's text response.
    tool_calls: list[ToolCall] = []    # Tool calls made during execution.
    reasoning_trace: list[str] | None  # Optional reasoning steps.
    latency_ms: float = 0.0            # Execution time in milliseconds.
    token_usage: TokenUsage | None     # Optional token usage statistics.
    raw_response: Any | None           # Optional raw response from the LLM.
```

At minimum, you must set the `output` field. Everything else is optional and depends on what information your agent makes available.
Minimal Adapter Example
The simplest possible adapter:
```python
from agenticassure.results import AgentResult


class MyAgent:
    def run(self, input, context=None):
        # Call your agent however you normally would
        response = my_agent_function(input)
        return AgentResult(output=response)
```

This is enough to run scenarios that use the `passfail` scorer with `expected_output`. It does not report tool calls, latency, or token usage.
Returning Tool Calls
If your agent uses tool calling (function calling), capture the tool invocations and return them as ToolCall objects. This enables AgenticAssure to verify expected_tools and expected_tool_args.
```python
from agenticassure.results import AgentResult, ToolCall


class MyToolAgent:
    def run(self, input, context=None):
        # Run your agent and collect tool call information
        result = self.agent.invoke(input)

        tool_calls = []
        for call in result.tool_invocations:
            tool_calls.append(
                ToolCall(
                    name=call.function_name,
                    arguments=call.args,
                    result=call.return_value,  # optional
                )
            )

        return AgentResult(
            output=result.final_answer,
            tool_calls=tool_calls,
        )
```

ToolCall Fields
| Field | Type | Required | Description |
|---|---|---|---|
| `name` | `str` | Yes | The name of the tool/function that was called. |
| `arguments` | `dict[str, Any]` | No | The arguments passed to the tool. Defaults to `{}`. |
| `result` | `Any` | No | The return value of the tool call. Defaults to `None`. |
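With these defaults, a `ToolCall` built from a name alone gets an empty `arguments` dict and a `None` result. A stand-in dataclass mirroring the table above (illustration only, not the library source):

```python
from dataclasses import dataclass, field
from typing import Any

# Stand-in mirroring the ToolCall fields in the table above (illustration only)
@dataclass
class ToolCall:
    name: str
    arguments: dict[str, Any] = field(default_factory=dict)
    result: Any = None

call = ToolCall(name="lookup_order")
print(call.arguments, call.result)  # {} None
```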
Returning Token Usage
Token usage tracking helps you monitor costs and identify expensive scenarios.
```python
from agenticassure.results import AgentResult, TokenUsage


class MyTrackedAgent:
    def run(self, input, context=None):
        response = self.client.complete(input)
        return AgentResult(
            output=response.text,
            token_usage=TokenUsage(
                prompt_tokens=response.usage.input_tokens,
                completion_tokens=response.usage.output_tokens,
            ),
        )
```

`TokenUsage` provides a computed `total_tokens` property:
```python
usage = TokenUsage(prompt_tokens=100, completion_tokens=50)
print(usage.total_tokens)  # 150
```

Using the Context Parameter
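If you are curious how a computed total like this can work, here is a sketch with a stand-in dataclass (not the library's actual implementation):

```python
from dataclasses import dataclass

# Stand-in sketch of a computed total_tokens property (not the library source)
@dataclass
class TokenUsage:
    prompt_tokens: int = 0
    completion_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

usage = TokenUsage(prompt_tokens=100, completion_tokens=50)
print(usage.total_tokens)  # 150
```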
The context parameter allows you to pass additional data into your agent from the test runner. This is useful for:
- Providing session state or conversation history
- Passing user identity or permissions
- Injecting configuration that varies between test runs
```python
class ContextAwareAgent:
    def run(self, input, context=None):
        user_id = context.get("user_id") if context else None
        session = context.get("session") if context else None

        response = self.agent.invoke(
            input,
            user_id=user_id,
            session=session,
        )
        return AgentResult(output=response.text)
```

Context is passed from the runner when calling `run_suite()` or `run_scenario()`:
```python
runner = Runner(adapter=my_adapter)
result = runner.run_suite(suite, context={"user_id": "test-user-001"})
```

Error Handling in Adapters
Your adapter’s run method should either return a valid AgentResult or raise an exception. Do not return None or invalid data.
Let Exceptions Propagate
The simplest approach is to let exceptions bubble up. The runner catches them and records the error in the scenario result:
```python
class SimpleAgent:
    def run(self, input, context=None):
        # If this raises, the runner catches it and marks the scenario as failed
        response = self.client.complete(input)
        return AgentResult(output=response.text)
```

Handle Expected Errors Gracefully
If your agent should handle certain errors gracefully (e.g., returning an error message to the user instead of crashing), handle those in the adapter:
```python
class RobustAgent:
    def run(self, input, context=None):
        try:
            response = self.client.complete(input)
            return AgentResult(output=response.text)
        except RateLimitError:
            # Return an error message as output -- this is how the agent
            # would behave in production
            return AgentResult(output="I'm currently experiencing high demand. Please try again.")
        except InvalidInputError as e:
            return AgentResult(output=f"I couldn't process that input: {e}")
        # Let unexpected errors propagate to the runner
```

Retry Behavior
The runner supports retries (configured per-suite or via the --retry CLI flag). When a scenario fails due to an exception, the runner retries the entire run() call. Your adapter does not need to implement its own retry logic unless you have specific retry requirements.
Testing Your Adapter Independently
Before using your adapter with AgenticAssure, verify it works correctly in isolation:
```python
from myproject.adapter import MyAgent
from agenticassure.adapters.base import AgentAdapter
from agenticassure.results import AgentResult


def test_adapter_protocol():
    agent = MyAgent()
    # Verify it satisfies the protocol
    assert isinstance(agent, AgentAdapter)


def test_adapter_returns_result():
    agent = MyAgent()
    result = agent.run("Hello, world!")
    assert isinstance(result, AgentResult)
    assert len(result.output) > 0


def test_adapter_tool_calls():
    agent = MyAgent()
    result = agent.run("Look up order #12345")
    assert len(result.tool_calls) > 0
    assert result.tool_calls[0].name == "lookup_order"


def test_adapter_with_context():
    agent = MyAgent()
    result = agent.run("What is my balance?", context={"user_id": "test-123"})
    assert isinstance(result, AgentResult)
```

Run these tests with pytest before integrating with AgenticAssure scenario files.
Full Example: Wrapping an OpenAI Assistants Agent
This example shows how to wrap an OpenAI Assistants API agent (not the simple chat completion API, but the Assistants API with threads and runs).
```python
import json
import time
from typing import Any

from openai import OpenAI

from agenticassure.results import AgentResult, TokenUsage, ToolCall


class OpenAIAssistantAdapter:
    """Adapter for an OpenAI Assistants API agent."""

    def __init__(self, assistant_id: str, api_key: str | None = None):
        self.client = OpenAI(api_key=api_key) if api_key else OpenAI()
        self.assistant_id = assistant_id

    def run(self, input: str, context: dict[str, Any] | None = None) -> AgentResult:
        start = time.perf_counter()

        # Create a new thread for each test run
        thread = self.client.beta.threads.create()

        # Add the user message
        self.client.beta.threads.messages.create(
            thread_id=thread.id,
            role="user",
            content=input,
        )

        # Run the assistant
        run = self.client.beta.threads.runs.create_and_poll(
            thread_id=thread.id,
            assistant_id=self.assistant_id,
        )

        # Collect tool calls if the run required action
        tool_calls = []
        if run.required_action and run.required_action.submit_tool_outputs:
            for tool_call in run.required_action.submit_tool_outputs.tool_calls:
                tool_calls.append(
                    ToolCall(
                        name=tool_call.function.name,
                        arguments=json.loads(tool_call.function.arguments),
                    )
                )

        # Get the assistant's response (messages are listed newest first)
        messages = self.client.beta.threads.messages.list(thread_id=thread.id)
        assistant_messages = [m for m in messages if m.role == "assistant"]
        output = ""
        if assistant_messages:
            last_message = assistant_messages[0]
            for block in last_message.content:
                if block.type == "text":
                    output += block.text.value

        latency_ms = (time.perf_counter() - start) * 1000

        # Get token usage from the run
        token_usage = None
        if run.usage:
            token_usage = TokenUsage(
                prompt_tokens=run.usage.prompt_tokens,
                completion_tokens=run.usage.completion_tokens,
            )

        return AgentResult(
            output=output,
            tool_calls=tool_calls,
            latency_ms=latency_ms,
            token_usage=token_usage,
            raw_response={"run_id": run.id, "thread_id": thread.id},
        )
```

Usage in `agenticassure.yaml`:

```yaml
adapter: myproject.adapters.OpenAIAssistantAdapter
```

Or from the CLI:

```
agenticassure run scenarios/ --adapter myproject.adapters.OpenAIAssistantAdapter
```

Note: The adapter class must be instantiable with no arguments when used from the CLI. If your adapter requires constructor arguments (like `assistant_id`), use a wrapper class or factory:

```python
class MyAssistantAdapter(OpenAIAssistantAdapter):
    def __init__(self):
        super().__init__(assistant_id="asst_abc123")
```

Full Example: Wrapping a Custom Agent
This example shows how to wrap an arbitrary custom agent that has its own interface.
```python
import time
from typing import Any

from agenticassure.results import AgentResult, ToolCall


class CustomAgentAdapter:
    """Adapter for a custom in-house agent."""

    def __init__(self):
        # Initialize your agent however needed
        from mycompany.agents import SupportBot

        self.bot = SupportBot(
            model="our-fine-tuned-model",
            tools=["search_kb", "create_ticket", "lookup_customer"],
        )

    def run(self, input: str, context: dict[str, Any] | None = None) -> AgentResult:
        start = time.perf_counter()

        # Call your agent's native interface
        bot_response = self.bot.process_message(
            message=input,
            user_context=context or {},
        )

        latency_ms = (time.perf_counter() - start) * 1000

        # Map your agent's tool call format to AgenticAssure's ToolCall model
        tool_calls = []
        for action in bot_response.actions_taken:
            tool_calls.append(
                ToolCall(
                    name=action["tool"],
                    arguments=action.get("params", {}),
                    result=action.get("output"),
                )
            )

        # Map reasoning steps if your agent exposes them
        reasoning = None
        if hasattr(bot_response, "thought_process"):
            reasoning = [str(step) for step in bot_response.thought_process]

        return AgentResult(
            output=bot_response.reply,
            tool_calls=tool_calls,
            reasoning_trace=reasoning,
            latency_ms=latency_ms,
        )
```

Adapter Checklist
Before deploying your adapter, verify the following:
- The class has a `run(self, input: str, context: dict[str, Any] | None = None) -> AgentResult` method.
- The method returns an `AgentResult` with at least the `output` field populated.
- If your agent uses tools, `tool_calls` is populated with `ToolCall` objects.
- If your agent uses tools, `ToolCall.name` matches the tool names you use in `expected_tools`.
- If used from the CLI, the class can be instantiated with no constructor arguments.
- `isinstance(your_adapter, AgentAdapter)` returns `True`.
- The adapter works in isolation (test it with a simple script before integrating).