ID Generation Guide¶
This guide covers how to use the Snowflake ID generator to produce unique, time-sortable IDs across distributed nodes in your AgentFlow application.
Overview¶
AgentFlow includes a Snowflake ID generator based on Twitter's Snowflake algorithm for generating unique, distributed, time-sortable 64-bit IDs.
Key Features¶
- ✅ Unique: No collisions across distributed systems, provided each generator uses a distinct node/worker ID pair
- ✅ Time-sortable: IDs are roughly chronological
- ✅ High performance: Up to 2,048 IDs per millisecond per worker with the default 11 sequence bits
- ✅ Distributed: Works across multiple nodes and workers
- ✅ 64-bit integers: Efficient storage and indexing
- ✅ Configurable: Adjust bit allocation for your needs
What is Snowflake ID?¶
A Snowflake ID is a 64-bit integer composed of:
| Sign | Time | Node | Worker | Sequence |
|---|---|---|---|---|
| 1 bit | 39 bits | 5 bits | 8 bits | 11 bits |
Default Bit Allocation¶
| Component | Bits | Range | Description |
|---|---|---|---|
| Sign | 1 | 0 | Always 0 (positive) |
| Time | 39 | 0 - 17.4 years | Milliseconds since epoch |
| Node | 5 | 0 - 31 | Node/datacenter ID |
| Worker | 8 | 0 - 255 | Worker/process ID |
| Sequence | 11 | 0 - 2047 | Per-millisecond counter |
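Each range follows from its bit width: an n-bit field holds values from 0 to 2^n - 1. The following is a small sketch in plain Python (it does not use any AgentFlow API) that derives the limits for the default widths:

```python
# Default bit widths; the sequence field takes whatever is left of 64 bits
TIME_BITS, NODE_BITS, WORKER_BITS = 39, 5, 8
SEQUENCE_BITS = 64 - 1 - TIME_BITS - NODE_BITS - WORKER_BITS

MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000

max_node = (1 << NODE_BITS) - 1          # largest node ID
max_worker = (1 << WORKER_BITS) - 1      # largest worker ID
max_sequence = (1 << SEQUENCE_BITS) - 1  # largest per-millisecond counter value
time_range_years = (1 << TIME_BITS) / MS_PER_YEAR

print(SEQUENCE_BITS, max_node, max_worker, max_sequence, round(time_range_years, 1))
# → 11 31 255 2047 17.4
```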
Example ID¶
ID: 1234567890123456789
Breakdown (illustrative field values):
- Time: epoch 1609459200000 (Jan 1, 2021 00:00:00 UTC) plus the millisecond offset stored in the ID
- Node ID: 5
- Worker ID: 3
- Sequence: 42
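To make the breakdown concrete, here is a minimal sketch of packing and unpacking the four fields with bit shifts. It assumes the fields are laid out from most to least significant in the order shown above (time, node, worker, sequence); the function names `pack` and `unpack` are hypothetical and are not part of snowflakekit or agentflow-cli.

```python
TIME_BITS, NODE_BITS, WORKER_BITS, SEQUENCE_BITS = 39, 5, 8, 11

# Shift amounts under the assumed field order: time | node | worker | sequence
WORKER_SHIFT = SEQUENCE_BITS
NODE_SHIFT = WORKER_SHIFT + WORKER_BITS
TIME_SHIFT = NODE_SHIFT + NODE_BITS


def pack(time_offset_ms, node_id, worker_id, sequence):
    """Combine the four fields into one integer."""
    return (
        (time_offset_ms << TIME_SHIFT)
        | (node_id << NODE_SHIFT)
        | (worker_id << WORKER_SHIFT)
        | sequence
    )


def unpack(snowflake_id):
    """Split an integer back into its four fields."""
    return {
        "time_offset_ms": snowflake_id >> TIME_SHIFT,
        "node_id": (snowflake_id >> NODE_SHIFT) & ((1 << NODE_BITS) - 1),
        "worker_id": (snowflake_id >> WORKER_SHIFT) & ((1 << WORKER_BITS) - 1),
        "sequence": snowflake_id & ((1 << SEQUENCE_BITS) - 1),
    }


example_id = pack(time_offset_ms=1000, node_id=5, worker_id=3, sequence=42)
print(unpack(example_id))
```

The time field holds the offset from the configured epoch, so the wall-clock time of an ID is the epoch plus `time_offset_ms`.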
Advantages¶
- Distributed Generation: No coordination needed between nodes
- Time Ordering: IDs generated later by the same worker have higher values; across workers, ordering is accurate to within clock differences
- Database Friendly: 64-bit integers are efficiently indexed
- High Throughput: Up to 2,048 IDs per millisecond per worker with the default 11 sequence bits
- No Lookups: No need to query a database or service
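The following sketch shows how a generator can achieve these properties with only local state: the last timestamp and a sequence counter. It is an illustration of the general Snowflake approach, not the snowflakekit implementation; the class name `SketchSnowflake` and its behavior on clock rollback or sequence exhaustion are assumptions.

```python
import threading
import time


class SketchSnowflake:
    """Illustrative generator; not the snowflakekit implementation."""

    def __init__(self, epoch_ms, node_id, worker_id,
                 node_bits=5, worker_bits=8, sequence_bits=11):
        self.epoch_ms = epoch_ms
        self.node_id = node_id
        self.worker_id = worker_id
        self.max_sequence = (1 << sequence_bits) - 1
        self.worker_shift = sequence_bits
        self.node_shift = sequence_bits + worker_bits
        self.time_shift = sequence_bits + worker_bits + node_bits
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    @staticmethod
    def _now_ms():
        return time.time_ns() // 1_000_000

    def next_id(self):
        with self.lock:
            now = self._now_ms()
            if now < self.last_ms:
                # Reusing an earlier millisecond could repeat an ID
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & self.max_sequence
                if self.sequence == 0:
                    # Counter exhausted: spin until the next millisecond
                    while now <= self.last_ms:
                        now = self._now_ms()
            else:
                self.sequence = 0
            self.last_ms = now
            return (
                ((now - self.epoch_ms) << self.time_shift)
                | (self.node_id << self.node_shift)
                | (self.worker_id << self.worker_shift)
                | self.sequence
            )


generator = SketchSnowflake(epoch_ms=1609459200000, node_id=1, worker_id=1)
ids = [generator.next_id() for _ in range(5)]
print(ids)
```

No network call or shared store is involved; uniqueness across machines depends entirely on each instance having its own node/worker ID pair.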
Installation¶
Required Package¶
The generator depends on the snowflakekit package. Install it directly, or install it together with agentflow-cli.
Verify Installation¶
from agentflow_cli import SnowFlakeIdGenerator
# This will raise ImportError if snowflakekit is not installed
generator = SnowFlakeIdGenerator()
Basic Usage¶
Import¶
from agentflow_cli import SnowFlakeIdGenerator
Create Generator and Use with StateGraph¶
# Create generator (reads configuration from environment variables)
id_generator = SnowFlakeIdGenerator()
# Use with StateGraph
graph = StateGraph[MyAgentState](MyAgentState(), id_generator=id_generator)
The generator will automatically read configuration from environment variables (recommended for production).
Configuration¶
Environment Variables¶
Set these in your .env file:
# Required
SNOWFLAKE_EPOCH=1609459200000 # Milliseconds since Unix epoch
# Node and Worker IDs (required)
SNOWFLAKE_NODE_ID=1 # 0-31 (with 5 bits)
SNOWFLAKE_WORKER_ID=1 # 0-255 (with 8 bits)
# Optional (defaults shown)
SNOWFLAKE_TOTAL_BITS=64
SNOWFLAKE_TIME_BITS=39
SNOWFLAKE_NODE_BITS=5
SNOWFLAKE_WORKER_BITS=8
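SnowFlakeIdGenerator reads these variables itself. If you want to inspect or log the effective settings, a sketch like the following can help. The helper `load_snowflake_config` is hypothetical (it is not part of agentflow-cli) and simply applies the defaults listed above.

```python
import os


def load_snowflake_config(env=os.environ):
    """Hypothetical helper: collect the Snowflake settings into a dict."""
    return {
        # Required: a missing variable raises KeyError
        "epoch": int(env["SNOWFLAKE_EPOCH"]),
        "node_id": int(env["SNOWFLAKE_NODE_ID"]),
        "worker_id": int(env["SNOWFLAKE_WORKER_ID"]),
        # Optional: fall back to the documented defaults
        "total_bits": int(env.get("SNOWFLAKE_TOTAL_BITS", 64)),
        "time_bits": int(env.get("SNOWFLAKE_TIME_BITS", 39)),
        "node_bits": int(env.get("SNOWFLAKE_NODE_BITS", 5)),
        "worker_bits": int(env.get("SNOWFLAKE_WORKER_BITS", 8)),
    }


# Pass a plain dict instead of os.environ to try it without touching the environment
sample_env = {
    "SNOWFLAKE_EPOCH": "1609459200000",
    "SNOWFLAKE_NODE_ID": "1",
    "SNOWFLAKE_WORKER_ID": "1",
}
print(load_snowflake_config(sample_env))
```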
Choosing an Epoch¶
The epoch is the starting point for time measurement. Choose a date close to your service launch:
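from datetime import datetime, timezone
# Calculate the epoch in milliseconds. Pin the date to UTC: a naive datetime
# would be interpreted in the machine's local timezone.
epoch_date = datetime(2024, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
epoch_ms = int(epoch_date.timestamp() * 1000)
print(f"SNOWFLAKE_EPOCH={epoch_ms}")
# Output: SNOWFLAKE_EPOCH=1704067200000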
Why choose a custom epoch?
- Extends the time range (default 39 bits = ~17.4 years from epoch)
- If epoch = Jan 1, 2024, you can generate IDs until ~2041
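To check how long a given epoch will last, add the largest time offset the time field can hold to the epoch. This is a plain-Python sketch; the function name `last_usable_date` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_usable_date(epoch_ms, time_bits=39):
    """Return the UTC datetime at which the time field overflows."""
    max_offset_ms = (1 << time_bits) - 1
    return UNIX_EPOCH + timedelta(milliseconds=epoch_ms + max_offset_ms)


# Epoch of Jan 1, 2024 UTC with the default 39 time bits
print(last_usable_date(1704067200000).year)  # → 2041
```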
Node and Worker IDs¶
Assign unique IDs across your infrastructure:
# Production setup
# Server 1
SNOWFLAKE_NODE_ID=1
SNOWFLAKE_WORKER_ID=1
# Server 2
SNOWFLAKE_NODE_ID=1
SNOWFLAKE_WORKER_ID=2
# Server 3 (different datacenter)
SNOWFLAKE_NODE_ID=2
SNOWFLAKE_WORKER_ID=1
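Because uniqueness depends on every instance having its own node/worker pair, it can be worth checking the planned assignments before deploying. A minimal sketch, assuming you keep the assignments in a dictionary keyed by instance name (the server names and the helper `find_duplicate_assignments` are hypothetical):

```python
from collections import Counter


def find_duplicate_assignments(assignments):
    """Return the (node_id, worker_id) pairs used by more than one instance."""
    counts = Counter(assignments.values())
    return sorted(pair for pair, count in counts.items() if count > 1)


assignments = {
    "server-1": (1, 1),
    "server-2": (1, 2),
    "server-3": (2, 1),
}
print(find_duplicate_assignments(assignments))  # → []
```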
Bit Allocation¶
Customize bit allocation for your use case:
Default (total 64 bits):
SNOWFLAKE_TIME_BITS=39 # ~17 years
SNOWFLAKE_NODE_BITS=5 # 32 nodes
SNOWFLAKE_WORKER_BITS=8 # 256 workers per node
# Sequence bits = 64 - 1 - 39 - 5 - 8 = 11 bits = 2048 IDs/ms
High concurrency (fewer nodes, more throughput):
SNOWFLAKE_TIME_BITS=39 # ~17 years
SNOWFLAKE_NODE_BITS=3 # 8 nodes
SNOWFLAKE_WORKER_BITS=6 # 64 workers per node
# Sequence bits = 15 bits = 32768 IDs/ms
Many nodes (distributed):
SNOWFLAKE_TIME_BITS=39 # ~17 years
SNOWFLAKE_NODE_BITS=8 # 256 nodes
SNOWFLAKE_WORKER_BITS=5 # 32 workers per node
# Sequence bits = 64 - 1 - 39 - 8 - 5 = 11 bits = 2048 IDs/ms
Long time range:
SNOWFLAKE_TIME_BITS=41 # ~69 years
SNOWFLAKE_NODE_BITS=4 # 16 nodes
SNOWFLAKE_WORKER_BITS=7 # 128 workers per node
# Sequence bits = 64 - 1 - 41 - 4 - 7 = 11 bits = 2048 IDs/ms
Validation¶
Bit allocation must follow these rules:
- Total must equal 64: 1 + time + node + worker + sequence = 64
- All components must be positive
- Node ID must be < 2^node_bits
- Worker ID must be < 2^worker_bits
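These rules can be checked before the generator is created. The sketch below is an assumption about how such a check might look, not the validation code that snowflakekit runs; `sequence_bits_for` and `check_ids` are hypothetical names.

```python
def sequence_bits_for(time_bits, node_bits, worker_bits, total_bits=64):
    """Return the sequence width left after the sign bit and the other fields."""
    sequence_bits = total_bits - 1 - time_bits - node_bits - worker_bits
    widths = {
        "time": time_bits,
        "node": node_bits,
        "worker": worker_bits,
        "sequence": sequence_bits,
    }
    for name, bits in widths.items():
        if bits <= 0:
            raise ValueError(f"{name} bits must be positive, got {bits}")
    return sequence_bits


def check_ids(node_id, worker_id, node_bits, worker_bits):
    """Raise ValueError if an ID does not fit in its field."""
    if not 0 <= node_id < (1 << node_bits):
        raise ValueError(f"node ID {node_id} does not fit in {node_bits} bits")
    if not 0 <= worker_id < (1 << worker_bits):
        raise ValueError(f"worker ID {worker_id} does not fit in {worker_bits} bits")


# Sequence widths for the four profiles above: default, high concurrency,
# many nodes, long time range
print(
    sequence_bits_for(39, 5, 8),
    sequence_bits_for(39, 3, 6),
    sequence_bits_for(39, 8, 5),
    sequence_bits_for(41, 4, 7),
)  # → 11 15 11 11
```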
Troubleshooting¶
ImportError: No module named 'snowflakekit'¶
Solution: Install the snowflakekit package as described under Installation.
ValueError: All configuration parameters must be provided¶
Solution: Set all required environment variables (SNOWFLAKE_EPOCH, SNOWFLAKE_NODE_ID, and SNOWFLAKE_WORKER_ID) before creating the generator.
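To find out which variable is missing, a quick check along these lines can be run at startup. The helper `missing_snowflake_vars` is hypothetical and only looks at the three variables this guide marks as required.

```python
import os

REQUIRED_VARS = ("SNOWFLAKE_EPOCH", "SNOWFLAKE_NODE_ID", "SNOWFLAKE_WORKER_ID")


def missing_snowflake_vars(env=os.environ):
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with a plain dict standing in for the process environment
print(missing_snowflake_vars({"SNOWFLAKE_EPOCH": "1609459200000"}))
# → ['SNOWFLAKE_NODE_ID', 'SNOWFLAKE_WORKER_ID']
```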
Duplicate IDs Generated¶
Possible causes:
1. Same NODE_ID and WORKER_ID on multiple servers
2. System clock went backwards
3. Generating IDs faster than supported (more than 2,048 per millisecond per worker with the default 11 sequence bits)

Solutions:
- Ensure unique NODE_ID/WORKER_ID combinations per server instance
- Use NTP to keep clocks synchronized
- Increase sequence bits if higher throughput is needed
Additional Resources¶
- Twitter Snowflake - Original Snowflake algorithm
- Configuration Guide - Complete configuration reference
- Deployment Guide - Production deployment strategies