Skip to main content

Testing

When to use this

Use the testing utilities to write fast, deterministic unit and integration tests for your graphs. TestAgent replaces Agent with a mock that returns pre-defined responses, so your tests never hit the LLM API.

Import paths

from agentflow.qa.testing import (
TestAgent,
QuickTest,
TestResult,
MockMCPClient,
MockToolRegistry,
InMemoryStore,
)

TestAgent

A drop-in replacement for Agent that returns predefined responses in order.

from agentflow.qa.testing import TestAgent

agent = TestAgent(
model="test-model",
responses=["Hello from the test agent!", "How can I help?"],
)

Constructor parameters

ParameterTypeDefaultDescription
modelstr"test-model"Model identifier string (unused — for API compatibility only).
responseslist[str | dict][]Predefined responses returned in sequence. Each invocation consumes the next item.
system_promptlist[dict] | NoneNoneAccepted for API compatibility. Not used.
**kwargsanyAll other Agent constructor kwargs are accepted and silently ignored.

Attributes

AttributeTypeDescription
call_countintNumber of times the agent was invoked.
call_historylist[dict]Details of each invocation (state, config).

Tool call simulation

Pass raw tool call dicts in the responses list to simulate the agent calling a tool:

agent = TestAgent(
responses=[
{
"content": "",
"tools_calls": [{
"id": "call_abc",
"function": {
"name": "get_weather",
"arguments": {"location": "London"},
},
}],
},
"The weather in London is sunny and 22°C.",
]
)

Using in a graph

from agentflow.core.graph import StateGraph, ToolNode
from agentflow.qa.testing import TestAgent
from agentflow.utils import START, END

test_agent = TestAgent(responses=["Hello!"])

graph = StateGraph()
graph.add_node("MAIN", test_agent)
graph.set_entry_point("MAIN")
graph.add_edge("MAIN", END)

app = graph.compile()
result = await app.ainvoke({"messages": [Message.text_message("Hi")]})

QuickTest

High-level helper that builds, compiles, and runs a graph in one call. Reduces test boilerplate from 20 lines to 3.

from agentflow.qa.testing import QuickTest

result = await QuickTest.single_turn(
agent_response="Hello! How can I help?",
user_message="Hi there",
)
result.assert_contains("Hello")

QuickTest.single_turn

result = await QuickTest.single_turn(
agent_response="The capital is Paris.",
user_message="What is the capital of France?",
model="test-model",
config={"thread_id": "test-1"},
)

Builds a one-node TestAgent → END graph and runs one turn. Returns TestResult.

QuickTest.multi_turn

result = await QuickTest.multi_turn(
conversation=[
("Hello", "Hi there!"),
("What can you do?", "I can answer questions."),
("Tell me a joke", "Why did the chicken cross the road?"),
],
)
result.assert_contains("chicken")

Builds a one-node graph and replays each turn in sequence, accumulating messages in state. Returns the TestResult from the final turn.

QuickTest.with_tools

result = await QuickTest.with_tools(
query="What's the weather in Paris?",
response="It's sunny and 22°C in Paris.",
tools=["get_weather"],
)
result.assert_tool_called("get_weather")

Builds a TestAgent → ToolNode → TestAgent → END pattern and verifies tools were invoked. Returns TestResult.


TestResult

Returned by QuickTest and wraps the raw graph output with assertion helpers.

result.assert_contains("Hello")
result.assert_not_contains("Error")
result.assert_equals("Exact expected response")
result.assert_tool_called("get_weather")
result.assert_tool_called("search", query="weather") # with arg check
result.assert_tool_not_called("send_email")
result.assert_message_count(3)
result.assert_no_errors()

Attributes

AttributeTypeDescription
final_responsestrThe last text response from the agent.
messageslist[Message]All messages from the run.
tool_callslist[dict]All tool calls made during the run.
statedictFull output dict from ainvoke().
passedboolTrue unless an assertion has failed.

Methods

MethodDescription
assert_contains(text)final_response must contain text.
assert_not_contains(text)final_response must NOT contain text.
assert_equals(expected)final_response must equal expected exactly.
assert_tool_called(name, **args)A tool with name must have been called. Pass **args to also check argument values.
assert_tool_not_called(name)A tool with name must NOT have been called.
assert_message_count(n)Total message count must equal n.
assert_no_errors()No ErrorBlock in any message.

All assertion methods return self for chaining.


MockMCPClient

Fake MCP client for testing nodes that use MCP tools.

from agentflow.qa.testing import MockMCPClient

mock_mcp = MockMCPClient()
mock_mcp.add_tool(
name="search_docs",
description="Search the documentation",
parameters={"query": {"type": "string", "description": "Search query"}},
handler=lambda query: f"Found results for: {query}",
)

tools = ToolNode(client=mock_mcp)
graph.add_node("TOOL", tools)

MockToolRegistry

Tracks which tools were called and with what arguments. Useful for asserting tool behaviour in integration tests.

from agentflow.qa.testing import MockToolRegistry

registry = MockToolRegistry()
registry.register("get_weather", lambda location: f"Sunny in {location}")
registry.register("search", lambda q: [{"title": "Result", "url": "http://..."}])

tools = ToolNode(list(registry.funcs.values()))

# ... run the graph ...

assert registry.was_called("get_weather")
assert registry.get_call_count("get_weather") == 1
args = registry.get_last_args("get_weather")
assert args["location"] == "London"

InMemoryStore (testing)

A lightweight in-memory BaseStore for tests that need memory functionality without a real vector database.

from agentflow.qa.testing import InMemoryStore

store = InMemoryStore()
app = graph.compile(store=store)

Stores memories in a Python dict. Similarity search returns all stored memories sorted by insertion order (no embeddings). For tests that only care about whether memories are stored and retrieved.


Full test example

import pytest
from agentflow.core.graph import StateGraph, ToolNode
from agentflow.core.state import AgentState, Message
from agentflow.qa.testing import TestAgent, TestResult, QuickTest
from agentflow.utils import END


@pytest.mark.asyncio
async def test_single_response():
result = await QuickTest.single_turn(
agent_response="The capital of France is Paris.",
user_message="What is the capital of France?",
)
result.assert_contains("Paris")


@pytest.mark.asyncio
async def test_tool_call_path():
"""Verify the agent calls the weather tool."""
result = await QuickTest.with_tools(
query="What's the weather in Tokyo?",
response="It's rainy and 18°C in Tokyo.",
tools=["get_weather"],
)
result.assert_tool_called("get_weather")
result.assert_contains("Tokyo")


@pytest.mark.asyncio
async def test_custom_state_preserved():
"""Verify custom state fields survive a round-trip."""
from agentflow.utils import ResponseGranularity
from pydantic import Field

class CustomerState(AgentState):
customer_id: str = ""

agent = TestAgent(responses=["Done processing order."])
graph = StateGraph(CustomerState(customer_id="CUST-001"))
graph.add_node("MAIN", agent)
graph.set_entry_point("MAIN")
graph.add_edge("MAIN", END)

app = graph.compile()
result = await app.ainvoke(
{"messages": [Message.text_message("Process my order")]},
response_granularity=ResponseGranularity.FULL,
)

state: CustomerState = result["state"]
assert state.customer_id == "CUST-001"