GenAI Application Release Checklist

Use this checklist before deploying any GenAI application to production. It covers evaluation, safety, cost, monitoring, and documentation.


Pre-Release Evaluation

Functional Requirements

Item | Status | Notes
[ ] Golden dataset pass rate ≥90%
[ ] Schema compliance: 100%
[ ] Safety refusal rate: 100% on test cases that require a refusal
[ ] LLM-as-Judge average score ≥4/5
[ ] Latency p95 < target (e.g., 3 seconds)
[ ] Error rate < target (e.g., 1%)
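
The thresholds above can be checked automatically before release. The following is a minimal sketch, not a definitive implementation: `EvalResults`, `p95`, and `release_gate` are hypothetical names, and the 3-second and 1% defaults simply reuse the example targets from the checklist.

```python
import math
from dataclasses import dataclass


@dataclass
class EvalResults:
    """Hypothetical container for one evaluation run."""
    golden_passed: int          # golden dataset cases that passed
    golden_total: int           # golden dataset cases run
    schema_valid: int           # outputs that matched the expected schema
    schema_total: int
    refusals: int               # safety test cases that were refused
    safety_total: int
    judge_scores: list[float]   # LLM-as-Judge scores on a 1-5 scale
    latencies_s: list[float]    # per-request latency in seconds
    errors: int
    requests: int


def p95(values: list[float]) -> float:
    """Nearest-rank 95th percentile."""
    ordered = sorted(values)
    rank = math.ceil(95 * len(ordered) / 100)
    return ordered[rank - 1]


def release_gate(r: EvalResults, latency_target_s: float = 3.0,
                 error_target: float = 0.01) -> dict[str, bool]:
    """Return one pass/fail flag per functional requirement."""
    return {
        "golden_pass_rate": r.golden_passed / r.golden_total >= 0.90,
        "schema_compliance": r.schema_valid == r.schema_total,
        "safety_refusals": r.refusals == r.safety_total,
        "judge_score": sum(r.judge_scores) / len(r.judge_scores) >= 4.0,
        "latency_p95": p95(r.latencies_s) < latency_target_s,
        "error_rate": r.errors / r.requests < error_target,
    }
```

A release script could block deployment unless `all(release_gate(results).values())` is true, and record the failing keys in the Notes column.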

Regression Testing

Item | Status | Notes
[ ] Core functionality tests pass
[ ] Tool integrations work correctly
[ ] State/memory persistence works
[ ] Streaming responses work
[ ] Structured outputs validate correctly
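
As an illustration of the structured-output item, a regression test can parse each model response and compare it against the expected fields. This sketch uses only the standard library; `ANSWER_SCHEMA` and `validate_output` are hypothetical, and a real project may prefer a dedicated schema library.

```python
import json

# Hypothetical schema for illustration: field name -> required Python type.
ANSWER_SCHEMA = {"answer": str, "confidence": float, "sources": list}


def validate_output(raw: str, schema: dict[str, type] = ANSWER_SCHEMA) -> list[str]:
    """Return a list of problems; an empty list means the output is valid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]
    problems = []
    for field, expected in schema.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], expected):
            problems.append(f"{field}: expected {expected.__name__}")
    return problems
```

Returning every problem at once, rather than raising on the first, makes it easier to fill in the Notes column when a regression run fails.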

Safety and Security

Input Validation

Item | Status | Notes
[ ] Input length validation
[ ] Prompt injection detection
[ ] PII scrubbing (if required)
[ ] Rate limiting configured
[ ] File upload validation (type, size)
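
Three of the items above (input length, file upload validation, rate limiting) can be sketched with the standard library alone. All limits and names here are assumptions chosen for illustration. Prompt injection detection and PII scrubbing are not covered by this sketch.

```python
from pathlib import PurePath

MAX_INPUT_CHARS = 4_000                        # assumed limit
MAX_UPLOAD_BYTES = 5 * 1024 * 1024             # assumed limit: 5 MiB
ALLOWED_EXTENSIONS = {".txt", ".pdf", ".md"}   # assumed allowlist


def validate_prompt(text: str) -> None:
    """Reject empty or over-long user input."""
    if not text.strip():
        raise ValueError("input is empty")
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} characters")


def validate_upload(filename: str, size_bytes: int) -> None:
    """Check the file extension against an allowlist and enforce a size cap."""
    suffix = PurePath(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        raise ValueError(f"file type not allowed: {suffix or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("file too large")


class TokenBucket:
    """Rate limiter: `rate` requests per second, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int, now: float = 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = now

    def allow(self, now: float) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In a service, `now` would typically come from `time.monotonic()`, with one bucket per client or API key. Checking the extension alone does not prove a file's real type, so content inspection may still be needed.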

Output Safety

Item | Status | Notes
[ ] PII filtering implemented
[ ] Content moderation (if needed)
[ ] Output length limits enforced
[ ] Schema validation on all outputs
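
A rough sketch of the PII filtering and output length items follows. The two regular expressions are deliberately simple assumptions (email addresses and US-style phone numbers only) and will miss many real PII formats; `sanitize_output` and `MAX_OUTPUT_CHARS` are hypothetical names.

```python
import re

# Simplified patterns for illustration; production filters need broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
MAX_OUTPUT_CHARS = 2_000  # assumed limit


def sanitize_output(text: str, max_chars: int = MAX_OUTPUT_CHARS) -> str:
    """Redact matched PII, then enforce the output length limit."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text[:max_chars]
```

Redaction runs before truncation so that a cut-off string cannot leave part of an address or number unmatched.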

Access Control

Item | Status | Notes
[ ] Authentication required
[ ] Authorization levels configured
[ ] API keys/secrets secured
[ ] Audit logging enabled

Cost Management

Cost Controls

Item | Status | Notes
[ ] Cost per request estimated
[ ] Daily budget alert set
[ ] Monthly budget limit configured
[ ] Token usage logging
[ ] Model routing configured (if applicable)

Cost Estimates

Metric | Estimate | Target
Cost per 1000 requests
Daily request volume
Monthly projected cost
p95 latency
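
The estimates in this table follow from simple arithmetic on token counts and prices. The sketch below assumes per-1,000-token pricing with separate input and output rates and a 30-day month; the function names are hypothetical, and the prices in the usage note are placeholders, not any provider's actual rates.

```python
def cost_per_request(input_tokens: int, output_tokens: int,
                     input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Cost of one request, given average token counts and per-1k-token prices."""
    return (input_tokens / 1000 * input_price_per_1k
            + output_tokens / 1000 * output_price_per_1k)


def project_costs(per_request: float, daily_requests: int, days: int = 30) -> dict[str, float]:
    """Fill in the cost rows of the estimates table."""
    daily = per_request * daily_requests
    return {
        "cost_per_1000_requests": per_request * 1000,
        "daily_cost": daily,
        "monthly_projected_cost": daily * days,
    }
```

For example, with placeholder prices of 0.002 per 1k input tokens and 0.004 per 1k output tokens, a request averaging 1,000 input and 500 output tokens costs about 0.004. At 10,000 requests per day that is roughly 40 per day and 1,200 per month.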

Monitoring and Observability

Logging

Item | Status | Notes
[ ] Request logging enabled
[ ] Error logging with stack traces
[ ] Tool call logging
[ ] Latency tracking

Metrics

Item | Status | Notes
[ ] Request count
[ ] Error rate
[ ] Latency p50, p95, p99
[ ] Token usage
[ ] Quality metrics (if tracked)

Alerts

Item | Status | Notes
[ ] Error rate alert (> threshold)
[ ] Latency alert (p95 > threshold)
[ ] Cost alert (daily budget)
[ ] Health check endpoint
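
The error rate, latency, and cost alerts can be expressed as plain threshold checks over a window of recent requests. This is a sketch under assumed thresholds; `percentile` and `fired_alerts` are hypothetical helpers, and a real deployment would usually define these rules in its monitoring system instead.

```python
import math


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile, e.g. pct=50, 95, or 99."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def fired_alerts(latencies_s: list[float], errors: int, requests: int,
                 spend_today: float, *,
                 p95_threshold_s: float = 3.0,        # assumed threshold
                 error_rate_threshold: float = 0.01,  # assumed threshold
                 daily_budget: float = 50.0) -> list[str]:  # assumed budget
    """Return the names of alerts whose thresholds are exceeded."""
    alerts = []
    if requests and errors / requests > error_rate_threshold:
        alerts.append("error_rate")
    if latencies_s and percentile(latencies_s, 95) > p95_threshold_s:
        alerts.append("latency_p95")
    if spend_today > daily_budget:
        alerts.append("daily_cost")
    return alerts
```

Sending synthetic data through such a check is one way to confirm, before release, that each alert actually fires.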

Documentation

API Documentation

Item | Status | Notes
[ ] API endpoints documented
[ ] Request/response schemas documented
[ ] Error codes documented
[ ] Authentication documented

User Documentation

Item | Status | Notes
[ ] User guide created
[ ] Examples provided
[ ] Limitations documented
[ ] Support contacts provided

Operations Documentation

Item | Status | Notes
[ ] Runbook created
[ ] Deployment instructions
[ ] Rollback procedures tested
[ ] On-call guide created

Deployment

Infrastructure

Item | Status | Notes
[ ] Environment variables configured
[ ] Secrets secured (not in code)
[ ] Health checks configured
[ ] Graceful shutdown implemented
[ ] Containerization (if applicable)

Testing in Staging

Item | Status | Notes
[ ] Staging deployment successful
[ ] Smoke tests pass
[ ] Load tests pass
[ ] Integration tests pass

Sign-off

Role | Name | Date | Signature
Engineering Lead
Product Owner
Security Review
QA Sign-off

Quick Summary

Before shipping, verify:

  1. Eval Pass — Quality tests green
  2. Safety — No PII leaks, no harmful outputs
  3. Cost — Budget alerts configured
  4. Monitoring — Logs, metrics, alerts working
  5. Docs — API docs, user guide, runbook complete