Run the API server
The agentflow api command starts a FastAPI-based REST server that loads your compiled graph and exposes it over HTTP. This guide covers common scenarios from quick local testing to production deployment.
Prerequisites
You must have agentflow.json and a valid graph module in your project:
# Verify the config file exists and is valid
cat agentflow.json
# Verify your graph module can be imported
python -c "from graph.react import app; print(app)"
Both commands should succeed without errors.
Quick start (development)
From the folder that contains agentflow.json:
agentflow api --host 127.0.0.1 --port 8000
This starts the server on http://127.0.0.1:8000. Auto-reload is enabled by default. The server restarts automatically when you edit any Python file, which is useful during development.
What the flags mean:
--host 127.0.0.1— Bind only to localhost (only accessible from your machine). Use instead--host 0.0.0.0to accept all network interfaces.--port 8000— Listen on port 8000. Change to any available port (8001, 8080, etc.).
Verify it is running
From another terminal, ping the server to confirm it is reachable:
curl http://127.0.0.1:8000/ping
Expected successful response:
{"success": true, "data": "pong"}
This endpoint requires no authentication and is commonly used for load balancer health checks.
Interactive API documentation
When running locally, the server exposes interactive API docs:
- Swagger UI (recommended):
http://127.0.0.1:8000/docs - ReDoc (alternative):
http://127.0.0.1:8000/redoc
You can test endpoints directly from these interfaces without writing curl commands. This is the fastest way to understand the API surface.
Example: Invoking the graph
- Open
http://127.0.0.1:8000/docs - Find the
POST /v1/graph/invokeendpoint - Click "Try it out"
- Provide sample input:
{"messages": [{"role": "user", "content": "Hello"}], "config": {"thread_id": "test"}} - Click "Execute" and see the response
Port already in use
If you get Address already in use, choose a different port:
agentflow api --port 8001
Or find and kill the process holding the port:
lsof -ti :8000 | xargs kill -9
Environment variables
Your graph may need environment variables (API keys, database URLs, etc.). Load them from a .env file via agentflow.json:
{
"env": ".env"
}
Or pass them directly to the server:
export GOOGLE_API_KEY=your_key
export REDIS_URL=redis://localhost:6379
agentflow api
Use a different config file
For multiple environments (dev, staging, prod), keep separate config files:
config/
dev.json
staging.json
prod.json
Start the server with a specific config:
agentflow api --config config/staging.json
Each config can point to different checkpointers, stores, and authentication backends.
Development mode with auto-reload
Auto-reload (the default) is enabled for development:
agentflow api --host 127.0.0.1 --port 8000 --reload
This is useful when iterating on your graph. Every time you save a Python file, the server restarts. Disable auto-reload with:
agentflow api --host 127.0.0.1 --port 8000 --no-reload
Warning: Auto-reload in containers or over network file systems is unreliable. Always use --no-reload in production and in Docker.
Production mode
For a production deployment, disable auto-reload and use a process supervisor:
MODE=production agentflow api --no-reload --host 0.0.0.0 --port 8000
What this does:
MODE=production— Sets the environment to production mode (disables verbose output, enables certain optimizations).--no-reload— Disables file-watching, which is expensive and unreliable in production.--host 0.0.0.0— Binds to all network interfaces so the server is accessible from outside the host.
Process management
Use one of:
systemd:
[Unit]
Description=AgentFlow API
After=network.target
[Service]
Type=simple
User=agentflow
WorkingDirectory=/opt/agentflow
ExecStart=/usr/bin/agentflow api --no-reload
Restart=on-failure
RestartSec=5s
[Install]
WantedBy=multi-user.target
Enable and start:
sudo systemctl enable agentflow
sudo systemctl start agentflow
sudo systemctl status agentflow
Docker: See Generate Docker Files.
PM2 (Node process manager, works with Python via shell):
npm install -g pm2
pm2 start "agentflow api --no-reload" --name "agentflow-api"
pm2 save # Save to auto-start on reboot
Multi-worker deployment
For high throughput, run multiple API instances behind a load balancer:
# Terminal 1
agentflow api --port 8001 --no-reload
# Terminal 2
agentflow api --port 8002 --no-reload
# Terminal 3
agentflow api --port 8003 --no-reload
Use a reverse proxy (nginx, HAProxy) or Docker Compose to load-balance across these ports. Important: Ensure all instances use the same checkpointer (e.g., PgCheckpointer backed by a shared database) so state is consistent.
Verbose logging
For debugging, enable verbose output:
agentflow api --verbose
This prints:
- Request details (path, method, headers)
- Graph node execution times
- Checkpointer read/write operations
- State transitions
Useful for troubleshooting slow requests or state issues.
Quiet mode
Suppress all output except errors:
agentflow api --quiet
Useful in Docker containers where reducing log volume is important.
Disable API documentation in production
The interactive API docs expose your endpoint structure. Disable them in production:
export DOCS_PATH=""
export REDOC_PATH=""
agentflow api --no-reload
With these environment variables set, /docs and /redoc return 404.
Monitoring and metrics
The /ping endpoint is always available (without authentication). Use it for health checks:
# Liveness check (is the server responding?)
curl -f http://127.0.0.1:8000/ping || exit 1
Incorporate this into your monitoring (Prometheus, Datadog, etc.).
Performance tuning
Connection pooling: If your graph connects to databases or external APIs, configure connection pools in your graph initialization. Look for settings like pool_size, max_overflow, pool_timeout.
Request timeout: By default, graph invocations have a 30-second timeout. Increase if your graph does long-running tasks:
# In your graph module
app = graph.compile(recursion_limit=100) # Increase iteration limit if needed
Hardware: More CPU cores help if your graph does heavy computation. More memory helps if you store large objects in state.
Graceful shutdown
The server handles SIGTERM gracefully:
# In one terminal
agentflow api
# In another terminal, after a delay
kill -TERM <pid>
The server will:
- Stop accepting new requests
- Wait for in-flight requests to complete (up to a timeout)
- Close connections and exit
Common issues
Graph import fails: "ModuleNotFoundError"
- Verify the import path in
agentflow.jsonis correct. - Verify the module is installed or on the Python path.
- Try importing manually:
python -c "from graph.react import app"
Port 8000 is already in use
- Use a different port:
agentflow api --port 8001 - Or find and kill the process:
lsof -ti :8000 | xargs kill -9
"GOOGLE_API_KEY" environment variable not set
- Set it before starting the server:
export GOOGLE_API_KEY=... - Or add it to your
.envfile and ensureagentflow.jsonreferences it:"env": ".env"
Requests are very slow
- Check server logs with
--verbose - Verify your graph logic (does a tool call take a long time?)
- Check database/network connections if your graph connects externally
"Connection refused" when trying to reach the server
- Is the server running? Check the terminal where you started it.
- Is the host/port correct? Try
curl http://127.0.0.1:8000/ping - If using a VM or container, verify networking is properly configured.