Testing Utilities
AgentFlow provides comprehensive testing utilities to make unit and integration testing of agents fast, predictable, and easy.
Why Special Testing Utilities?
Testing AI agents presents unique challenges:
- LLM API calls are slow and expensive - You don't want to call real LLMs in every test
- Responses are non-deterministic - The same input can produce different outputs
- Complex setup - Graphs, nodes, tools, and state management require boilerplate
- Tool integration testing - You need to mock MCP servers and external APIs
AgentFlow's testing module solves these problems with:
- TestAgent - Mock agent that returns predefined responses (no API calls)
- QuickTest - One-liner tests for common patterns
- TestContext - Isolated test environments with automatic cleanup
- Mock Tools - MockToolRegistry, MockMCPClient for tool testing
- TestResult - Chainable assertions for fluent test writing
Quick Start
Simple Unit Test
from agentflow.testing import TestAgent, QuickTest
# Test a single interaction
result = await QuickTest.single_turn(
    agent_response="Hello! How can I help you today?",
    user_message="Hi there",
)
# Chain assertions
result.assert_contains("help").assert_not_contains("error")
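The snippets on this page use `await` at the top level, which only works inside a coroutine. A minimal sketch of driving such a call from an ordinary script or test follows. `fake_single_turn` is a hypothetical stand-in for `QuickTest.single_turn` so the example runs without AgentFlow installed; it is an assumption for illustration, not the library's API.

```python
import asyncio


async def fake_single_turn(agent_response: str, user_message: str) -> str:
    """Hypothetical stand-in for QuickTest.single_turn; returns the canned reply."""
    await asyncio.sleep(0)  # yield to the event loop once, as a real awaitable might
    return agent_response


async def check_greeting() -> str:
    # `await` is legal here because this function is a coroutine.
    reply = await fake_single_turn(
        agent_response="Hello! How can I help you today?",
        user_message="Hi there",
    )
    assert "help" in reply
    assert "error" not in reply
    return reply


# Without an async test plugin, drive the coroutine manually.
reply = asyncio.run(check_greeting())
```

If the project uses pytest, an async plugin such as pytest-asyncio can replace the manual `asyncio.run` call; whether AgentFlow's own test suite does this is not stated here.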
Test Your Graph
from agentflow.testing import TestAgent, TestContext
from agentflow.utils.constants import END
# Use TestContext for isolated setup
with TestContext() as ctx:
    # Create graph
    graph = ctx.create_graph()
    # Add test agent (no real LLM calls!)
    test_agent = ctx.create_test_agent(
        responses=["Hi! I'm a weather assistant."]
    )
    graph.add_node("MAIN", test_agent)
    graph.set_entry_point("MAIN")
    graph.add_edge("MAIN", END)
    # Compile and test
    compiled = graph.compile()
    result = await compiled.ainvoke({"messages": [...]})
    # Assertions
    test_agent.assert_called()
    assert "weather assistant" in result["messages"][-1].text()
Core Components
TestAgent
Mock agent that returns predefined responses without calling LLMs:
from agentflow.testing import TestAgent
# Create test agent
test_agent = TestAgent(
    model="test-model",  # For compatibility
    responses=["Response 1", "Response 2", "Response 3"],  # Cycles through
)
# Use in graph (drop-in replacement for Agent)
graph.add_node("MAIN", test_agent)
# After running
test_agent.assert_called()
test_agent.assert_called_times(3)
assert "Response 1" in test_agent.get_last_messages()
Features:
- Returns predefined responses (cycles through list)
- Tracks call count and call history
- Built-in assertion helpers
- Compatible with Agent interface
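To make the cycling and call-tracking behavior concrete, here is a minimal standard-library sketch of the idea. `CyclingStub` is a hypothetical class written for this illustration and is not AgentFlow's `TestAgent`; in particular, wrapping around to the first response once the list is exhausted is an assumption about what "cycles through" means.

```python
from itertools import cycle
from typing import Iterator, List


class CyclingStub:
    """Hypothetical response-cycling mock; not AgentFlow's TestAgent."""

    def __init__(self, responses: List[str]) -> None:
        if not responses:
            raise ValueError("at least one response is required")
        # cycle() restarts from the first item once the list is exhausted.
        self._responses: Iterator[str] = cycle(responses)
        self.call_history: List[str] = []

    def __call__(self, user_message: str) -> str:
        # Record the input, then hand back the next canned response.
        self.call_history.append(user_message)
        return next(self._responses)

    @property
    def call_count(self) -> int:
        return len(self.call_history)

    def assert_called_times(self, expected: int) -> None:
        assert self.call_count == expected, (
            f"expected {expected} calls, got {self.call_count}"
        )


stub = CyclingStub(["Response 1", "Response 2"])
replies = [stub(f"message {i}") for i in range(3)]
stub.assert_called_times(3)
```

Because the responses are fixed up front, the same test produces the same output on every run, which is what makes assertions on exact text practical.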
QuickTest
One-liner tests for common patterns:
from agentflow.testing import QuickTest
# Single turn
result = await QuickTest.single_turn(
    agent_response="Hello!",
    user_message="Hi",
)
result.assert_contains("Hello!")
# Multi-turn conversation
result = await QuickTest.multi_turn(
    [
        ("Hello", "Hi there!"),
        ("How are you?", "Great!"),
    ]
)
# With tools
result = await QuickTest.with_tools(
    query="Weather in NYC?",
    response="It's sunny!",
    tools=["get_weather"],
)
result.assert_tool_called("get_weather")
TestContext
Isolated test environment with automatic cleanup:
from agentflow.testing import TestContext
with TestContext() as ctx:
    # Get isolated container and store
    graph = ctx.create_graph()
    agent = ctx.create_test_agent(responses=["Test response"])
    # Register mock tools
    ctx.register_mock_tool("get_weather", lambda city: f"Sunny in {city}")
    # ... run tests
# Automatic cleanup when exiting context
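The "automatic cleanup" guarantee comes from Python's context manager protocol. The sketch below shows the general mechanism with a hypothetical `IsolatedContext` class; it is not AgentFlow's `TestContext`, and what the real class resets on exit is not specified here.

```python
from typing import Any, Callable, Dict


class IsolatedContext:
    """Hypothetical isolated test environment; not AgentFlow's TestContext."""

    def __init__(self) -> None:
        self.mock_tools: Dict[str, Callable[..., Any]] = {}
        self.closed = False

    def __enter__(self) -> "IsolatedContext":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Cleanup runs whether the block finished normally or raised.
        self.mock_tools.clear()
        self.closed = True
        return False  # do not swallow exceptions from the test body

    def register_mock_tool(self, name: str, handler: Callable[..., Any]) -> None:
        self.mock_tools[name] = handler


with IsolatedContext() as ctx:
    ctx.register_mock_tool("get_weather", lambda city: f"Sunny in {city}")
    forecast = ctx.mock_tools["get_weather"]("Tokyo")

# After the block exits, the registry is empty again.
```

Returning `False` from `__exit__` matters in tests: a failing assertion inside the block still propagates to the test runner after cleanup has run.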
TestResult
Chainable assertion interface for fluent test writing:
result = await QuickTest.single_turn(
    agent_response="The weather in NYC is sunny, 72°F",
    user_message="Weather in NYC?",
)
# Chain assertions
(result
    .assert_contains("sunny")
    .assert_contains("NYC")
    .assert_not_contains("error")
    .assert_no_errors())
Available assertions:
- assert_contains(text) - Response contains text
- assert_not_contains(text) - Response doesn't contain text
- assert_equals(expected) - Exact match
- assert_tool_called(name, **args) - Tool was called
- assert_tool_not_called(name) - Tool was NOT called
- assert_message_count(n) - Number of messages
- assert_no_errors() - No error messages
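Chaining works when each assertion method returns the object it was called on. The following sketch shows that pattern with a hypothetical `ChainableResult` class covering three of the assertions listed above; it is an illustration under that assumption, not the implementation of AgentFlow's `TestResult`.

```python
class ChainableResult:
    """Hypothetical fluent assertion wrapper; not AgentFlow's TestResult."""

    def __init__(self, text: str) -> None:
        self.text = text

    def assert_contains(self, fragment: str) -> "ChainableResult":
        assert fragment in self.text, f"{fragment!r} not found in {self.text!r}"
        return self  # returning self is what makes chaining possible

    def assert_not_contains(self, fragment: str) -> "ChainableResult":
        assert fragment not in self.text, f"{fragment!r} unexpectedly found"
        return self

    def assert_equals(self, expected: str) -> "ChainableResult":
        assert self.text == expected, f"expected {expected!r}, got {self.text!r}"
        return self


result = ChainableResult("The weather in NYC is sunny, 72°F")
chained = (
    result
    .assert_contains("sunny")
    .assert_contains("NYC")
    .assert_not_contains("error")
)
```

In this sketch the first failing assertion raises `AssertionError` and the rest of the chain does not run, so the error message always points at the earliest failed check.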
Mock Tools
Test tool integrations without real APIs:
from agentflow.testing import MockToolRegistry, MockMCPClient
# Mock tool registry
tools = MockToolRegistry()
tools.register("get_weather", lambda city: f"Sunny in {city}")
# After test
assert tools.was_called("get_weather")
assert tools.call_count("get_weather") == 2
args = tools.last_call_args("get_weather")
# Mock MCP client
mock_mcp = MockMCPClient()
mock_mcp.add_tool(
    name="mcp_weather",
    description="Get weather",
    parameters={"city": {"type": "string"}},
    handler=lambda city: f"Weather in {city}: Sunny",
)
# Use in ToolNode
from agentflow.graph import ToolNode
tool_node = ToolNode([], client=mock_mcp)
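A call-recording registry is a small amount of bookkeeping around a dictionary of fake handlers. The sketch below illustrates the idea with a hypothetical `RecordingRegistry`; the method names mirror the ones shown above, but the `invoke` method and the `(args, kwargs)` return shape of `last_call_args` are assumptions, not AgentFlow's `MockToolRegistry` API.

```python
from typing import Any, Callable, Dict, List, Tuple

Call = Tuple[Tuple[Any, ...], Dict[str, Any]]


class RecordingRegistry:
    """Hypothetical call-recording registry; not AgentFlow's MockToolRegistry."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[..., Any]] = {}
        self._calls: Dict[str, List[Call]] = {}

    def register(self, name: str, handler: Callable[..., Any]) -> None:
        self._handlers[name] = handler

    def invoke(self, name: str, *args: Any, **kwargs: Any) -> Any:
        # Record the arguments before delegating to the fake handler.
        self._calls.setdefault(name, []).append((args, kwargs))
        return self._handlers[name](*args, **kwargs)

    def was_called(self, name: str) -> bool:
        return bool(self._calls.get(name))

    def call_count(self, name: str) -> int:
        return len(self._calls.get(name, []))

    def last_call_args(self, name: str) -> Call:
        return self._calls[name][-1]


tools = RecordingRegistry()
tools.register("get_weather", lambda city: f"Sunny in {city}")
first = tools.invoke("get_weather", "NYC")
second = tools.invoke("get_weather", city="Tokyo")
```

Recording before delegating means a handler that raises still leaves a trace, which helps when a test needs to assert that a tool was attempted.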
Common Patterns
Testing Agent Responses
from agentflow.testing import TestAgent
from agentflow.graph import StateGraph
from agentflow.utils.constants import END
# Create test agent with multiple responses
agent = TestAgent(responses=["Response 1", "Response 2"])
graph = StateGraph()
graph.add_node("MAIN", agent)
graph.set_entry_point("MAIN")
graph.add_edge("MAIN", END)
compiled = graph.compile()
# First call
result1 = await compiled.ainvoke({"messages": [...]})
assert "Response 1" in result1["messages"][-1].text()
# Second call (cycles to next response)
result2 = await compiled.ainvoke({"messages": [...]})
assert "Response 2" in result2["messages"][-1].text()
# Verify call count
agent.assert_called_times(2)
Testing Tool Integration
from agentflow.testing import QuickTest
result = await QuickTest.with_tools(
    query="What's the weather in Tokyo?",
    response="It's sunny in Tokyo, 72°F",
    tools=["get_weather"],
    tool_responses={"get_weather": "Sunny, 72°F"},
)
result.assert_tool_called("get_weather")
result.assert_contains("sunny")
Testing Multi-Agent Systems
from agentflow.testing import TestAgent, TestContext
from agentflow.graph import StateGraph
from agentflow.utils.constants import END
with TestContext() as ctx:
    graph = ctx.create_graph()
    # Create multiple test agents
    agent1 = ctx.create_test_agent(responses=["Response from Agent 1"])
    agent2 = ctx.create_test_agent(responses=["Response from Agent 2"])
    graph.add_node("AGENT1", agent1)
    graph.add_node("AGENT2", agent2)
    graph.set_entry_point("AGENT1")
    graph.add_edge("AGENT1", "AGENT2")
    graph.add_edge("AGENT2", END)
    compiled = graph.compile()
    result = await compiled.ainvoke({"messages": [...]})
    # Verify both agents were called
    agent1.assert_called()
    agent2.assert_called()
Testing vs Evaluation
| Feature | Testing Module | Evaluation Module |
|---|---|---|
| Purpose | Unit/integration tests | Quality assurance |
| Speed | Fast (mocked LLMs) | Slower (real LLM calls) |
| Use Case | Development, CI/CD | Regression testing, validation |
| Tools | TestAgent, QuickTest | AgentEvaluator, QuickEval |
| Output | Pass/fail assertions | Detailed reports with scores |
Use testing for:
- Fast unit tests during development
- CI/CD pipelines
- Testing code logic and graph structure
- Mocking external dependencies

Use evaluation for:
- Testing actual LLM behavior
- Regression testing with real APIs
- Quality benchmarking
- Multi-criteria assessment
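One way to keep the two kinds of checks apart in a single repository is to gate the slow, real-LLM ones behind an opt-in switch so that CI runs only the mocked tests by default. The sketch below uses the standard library's `unittest.skipUnless`; the `RUN_LLM_EVALS` environment variable and both test classes are hypothetical placeholders, not part of AgentFlow.

```python
import os
import unittest

# Hypothetical environment variable; use whatever name your CI configuration defines.
RUN_EVALS = os.environ.get("RUN_LLM_EVALS") == "1"


class FastMockedTests(unittest.TestCase):
    def test_graph_logic(self) -> None:
        # Stands in for a mocked, deterministic test.
        self.assertEqual("Response 1".split()[0], "Response")


@unittest.skipUnless(RUN_EVALS, "set RUN_LLM_EVALS=1 to run real-LLM evaluations")
class SlowEvaluationTests(unittest.TestCase):
    def test_quality(self) -> None:
        # Placeholder for an evaluation that would call a real model.
        self.assertTrue(True)


def run_all() -> unittest.TestResult:
    """Load both test classes and run them, returning the result object."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite(
        [
            loader.loadTestsFromTestCase(FastMockedTests),
            loader.loadTestsFromTestCase(SlowEvaluationTests),
        ]
    )
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_all()` without the variable set reports the evaluation class as skipped rather than failed, so the default run stays fast and free of API calls.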
Installation
Testing utilities are included in the core AgentFlow package, so no separate installation is needed.
Documentation Guide
| Topic | Description |
|---|---|
| TestAgent | Mock agent for predictable testing |
| QuickTest | One-liner test patterns |
| TestContext | Isolated test environments |
| TestResult | Chainable assertions |
| Mock Tools | Mocking tool integrations |
Next Steps
- Start with TestAgent Guide
- Learn QuickTest Patterns
- Combine with Agent Evaluation for comprehensive testing