ID Generation Guide¶

This guide explains how to use the Snowflake ID generator to produce unique, distributed, time-sortable IDs in your AgentFlow application.

Table of Contents¶

Overview
What is Snowflake ID?
Installation
Basic Usage
Configuration
Best Practices

Overview¶

AgentFlow includes a Snowflake ID generator based on Twitter's Snowflake algorithm for generating unique, distributed, time-sortable 64-bit IDs.

Key Features¶

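✅ Unique: No collisions across distributed systems, provided each generator runs with a distinct node ID and worker ID pair
✅ Time-sortable: IDs are roughly chronological
✅ High performance: Up to 2048 IDs per millisecond per worker with the default 11 sequence bits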
✅ Distributed: Works across multiple nodes and workers
✅ 64-bit integers: Efficient storage and indexing
✅ Configurable: Adjust bit allocation for your needs

What is Snowflake ID?¶

A Snowflake ID is a 64-bit integer composed of:

|-------|-----------|--------|--------|----------|
|  Sign |   Time    |  Node  | Worker | Sequence |
|   1   |    39     |   5    |   8    |    11    |
|-------|-----------|--------|--------|----------|

Default Bit Allocation¶

Component	Bits	Range	Description
Sign	1	0	Always 0 (positive)
Time	39	0 - 2^39 - 1 ms (~17.4 years)	Milliseconds since epoch
Node	5	0 - 31	Node/datacenter ID
Worker	8	0 - 255	Worker/process ID
Sequence	11	0 - 2047	Per-millisecond counter
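The ranges in the table follow directly from the bit widths. A minimal sketch of that arithmetic, using only the widths listed above (the names `LAYOUT` and `max_value` are illustrative and not part of any library):

```python
# Default layout from the table above; widths are in bits.
LAYOUT = {"sign": 1, "time": 39, "node": 5, "worker": 8, "sequence": 11}

MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365.25


def max_value(bits: int) -> int:
    """Largest unsigned value that fits in `bits` bits."""
    return (1 << bits) - 1


# The five fields must fill the 64-bit integer exactly.
assert sum(LAYOUT.values()) == 64

# Number of distinct millisecond ticks, converted to years.
time_span_years = (1 << LAYOUT["time"]) / MS_PER_YEAR

print(f"time span: about {time_span_years:.1f} years")
print(f"node IDs: 0-{max_value(LAYOUT['node'])}")
print(f"worker IDs: 0-{max_value(LAYOUT['worker'])}")
print(f"sequence: 0-{max_value(LAYOUT['sequence'])} per millisecond")
```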

Example ID¶

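ID: 1234567890123456789

Breakdown (illustrative field values, not a literal decoding of the number above):
- Time: offset in milliseconds from the configured epoch (for example, an epoch of 1609459200000 is Jan 1, 2021 00:00:00 UTC)
- Node ID: 5
- Worker ID: 3
- Sequence: 42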
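To make the composition concrete, here is a minimal sketch of packing and unpacking an ID by bit shifting. It assumes the field order shown in the layout diagram, with the sequence in the lowest bits and the time offset in the highest; the actual bit order used by snowflakekit is not confirmed here, and `pack` and `unpack` are hypothetical helper names.

```python
# Default widths from this guide; the field order is an assumption.
TIME_BITS, NODE_BITS, WORKER_BITS, SEQ_BITS = 39, 5, 8, 11

WORKER_SHIFT = SEQ_BITS
NODE_SHIFT = SEQ_BITS + WORKER_BITS
TIME_SHIFT = SEQ_BITS + WORKER_BITS + NODE_BITS


def pack(time_offset_ms: int, node_id: int, worker_id: int, sequence: int) -> int:
    """Combine the four fields into one integer (sign bit stays 0)."""
    return (
        (time_offset_ms << TIME_SHIFT)
        | (node_id << NODE_SHIFT)
        | (worker_id << WORKER_SHIFT)
        | sequence
    )


def unpack(snowflake_id: int) -> dict:
    """Split an ID back into its fields by shifting and masking."""
    return {
        "time_offset_ms": snowflake_id >> TIME_SHIFT,
        "node_id": (snowflake_id >> NODE_SHIFT) & ((1 << NODE_BITS) - 1),
        "worker_id": (snowflake_id >> WORKER_SHIFT) & ((1 << WORKER_BITS) - 1),
        "sequence": snowflake_id & ((1 << SEQ_BITS) - 1),
    }


# Round trip with the illustrative field values from the breakdown.
example = pack(time_offset_ms=1000, node_id=5, worker_id=3, sequence=42)
print(example, unpack(example))
```

The time field holds the offset from the configured epoch, so adding the epoch back to `time_offset_ms` recovers the wall-clock time at which the ID was created.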

Advantages¶

Distributed Generation: No coordination needed between nodes
Time Ordering: IDs generated later by the same worker have higher values; ordering across workers is approximate
Database Friendly: 64-bit integers are efficiently indexed
High Throughput: Up to 2048 IDs per millisecond per worker (default 11 sequence bits)
No Lookups: No need to query a database or service

Installation¶

Required Package¶

pip install snowflakekit

Or install with agentflow-cli:

pip install "10xscale-agentflow-cli[snowflakekit]"

Verify Installation¶

from agentflow_cli import SnowFlakeIdGenerator

# Raises ImportError if snowflakekit is not installed; the generator also
# expects the SNOWFLAKE_* environment variables described under Configuration
generator = SnowFlakeIdGenerator()

Basic Usage¶

Import¶

from agentflow_cli import SnowFlakeIdGenerator
from agentflow.graph import StateGraph

Create Generator and Use with StateGraph¶

# Create generator (reads configuration from environment variables)
id_generator = SnowFlakeIdGenerator()

# Use with StateGraph (MyAgentState is your application's state class, defined elsewhere)
graph = StateGraph[MyAgentState](MyAgentState(), id_generator=id_generator)

The generator will automatically read configuration from environment variables (recommended for production).

Configuration¶

Environment Variables¶

Set these in your .env file:

# Required
SNOWFLAKE_EPOCH=1609459200000  # Milliseconds since Unix epoch

# Node and Worker IDs (required)
SNOWFLAKE_NODE_ID=1            # 0-31 (with 5 bits)
SNOWFLAKE_WORKER_ID=1          # 0-255 (with 8 bits)

# Optional (defaults shown)
SNOWFLAKE_TOTAL_BITS=64
SNOWFLAKE_TIME_BITS=39
SNOWFLAKE_NODE_BITS=5
SNOWFLAKE_WORKER_BITS=8

Choosing an Epoch¶

The epoch is the starting point for time measurement. Choose a date close to your service launch:

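from datetime import datetime, timezone

# Calculate epoch in milliseconds (pinned to UTC so the result
# does not depend on the machine's local timezone)
epoch_date = datetime(2024, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
epoch_ms = int(epoch_date.timestamp() * 1000)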
print(f"SNOWFLAKE_EPOCH={epoch_ms}")

# Output: SNOWFLAKE_EPOCH=1704067200000

Why choose a custom epoch?

- Extends the usable time range (the default 39 time bits cover ~17.4 years from the epoch)
- If epoch = Jan 1, 2024, you can generate IDs until ~2041
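The end of the usable range can be computed from the epoch and the number of time bits. The following is a small sketch of that calculation; `last_usable_instant` is a hypothetical helper written for this guide, not a function provided by snowflakekit or AgentFlow.

```python
from datetime import datetime, timedelta, timezone


def last_usable_instant(epoch_ms: int, time_bits: int = 39) -> datetime:
    """Return the UTC instant at which the time field reaches its maximum."""
    max_offset_ms = (1 << time_bits) - 1
    epoch = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    return epoch + timedelta(milliseconds=max_offset_ms)


# Epoch of Jan 1, 2024 00:00:00 UTC with the default 39 time bits.
print(last_usable_instant(1704067200000).date())

# The same epoch with 41 time bits lasts roughly four times as long.
print(last_usable_instant(1704067200000, time_bits=41).date())
```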

Node and Worker IDs¶

Assign unique IDs across your infrastructure:

# Production setup
# Server 1
SNOWFLAKE_NODE_ID=1
SNOWFLAKE_WORKER_ID=1

# Server 2
SNOWFLAKE_NODE_ID=1
SNOWFLAKE_WORKER_ID=2

# Server 3 (different datacenter)
SNOWFLAKE_NODE_ID=2
SNOWFLAKE_WORKER_ID=1
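Because uniqueness depends on every instance having a distinct node and worker pair, it can help to check an assignment plan before deploying it. This is a hedged sketch of such a check; the `fleet` mapping and the function name are hypothetical and stand in for however your infrastructure records these values.

```python
from collections import Counter


def find_duplicate_assignments(assignments: dict) -> dict:
    """Return the instances whose (node_id, worker_id) pair is shared.

    `assignments` maps an instance name to its (node_id, worker_id) tuple.
    """
    counts = Counter(assignments.values())
    return {name: pair for name, pair in assignments.items() if counts[pair] > 1}


# The three-server setup shown above: every pair is distinct.
fleet = {
    "server-1": (1, 1),
    "server-2": (1, 2),
    "server-3": (2, 1),
}
print(find_duplicate_assignments(fleet))

# A misconfigured plan: two servers share node 1, worker 1.
broken = dict(fleet, **{"server-4": (1, 1)})
print(find_duplicate_assignments(broken))
```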

Bit Allocation¶

Customize bit allocation for your use case:

Default (total 64 bits):

SNOWFLAKE_TIME_BITS=39     # ~17 years
SNOWFLAKE_NODE_BITS=5      # 32 nodes
SNOWFLAKE_WORKER_BITS=8    # 256 workers per node
# Sequence bits = 64 - 1 - 39 - 5 - 8 = 11 bits = 2048 IDs/ms

High concurrency (fewer nodes, more throughput):

SNOWFLAKE_TIME_BITS=39     # ~17 years
SNOWFLAKE_NODE_BITS=3      # 8 nodes
SNOWFLAKE_WORKER_BITS=6    # 64 workers per node
# Sequence bits = 15 bits = 32768 IDs/ms

Many nodes (distributed):

SNOWFLAKE_TIME_BITS=39     # ~17 years
SNOWFLAKE_NODE_BITS=8      # 256 nodes
SNOWFLAKE_WORKER_BITS=5    # 32 workers per node
# Sequence bits = 64 - 1 - 39 - 8 - 5 = 11 bits = 2048 IDs/ms

Long time range:

SNOWFLAKE_TIME_BITS=41     # ~69 years
SNOWFLAKE_NODE_BITS=4      # 16 nodes
SNOWFLAKE_WORKER_BITS=7    # 128 workers per node
# Sequence bits = 64 - 1 - 41 - 4 - 7 = 11 bits = 2048 IDs/ms
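The sequence width is whatever remains after the sign, time, node, and worker bits are subtracted from the total, and the per-millisecond capacity is 2 raised to that width. A short sketch that reproduces the arithmetic for the four profiles above (the `PROFILES` table and `sequence_bits` helper are illustrative names only):

```python
# (time_bits, node_bits, worker_bits) for each profile in this section.
PROFILES = {
    "default": (39, 5, 8),
    "high concurrency": (39, 3, 6),
    "many nodes": (39, 8, 5),
    "long time range": (41, 4, 7),
}


def sequence_bits(time_bits: int, node_bits: int, worker_bits: int,
                  total_bits: int = 64) -> int:
    """Bits left for the sequence after the sign bit and the other fields."""
    return total_bits - 1 - time_bits - node_bits - worker_bits


for name, (t, n, w) in PROFILES.items():
    seq = sequence_bits(t, n, w)
    # 1 << seq is the number of IDs one worker can issue per millisecond.
    print(f"{name}: {seq} sequence bits, {1 << seq} IDs/ms per worker")
```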

Validation¶

Bit allocation must follow these rules:

Total must equal 64: 1 + time + node + worker + sequence = 64
Every bit width (time, node, worker, sequence) must be positive
Node ID must be in the range 0 to 2^node_bits - 1
Worker ID must be in the range 0 to 2^worker_bits - 1
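These rules can be checked before starting a service. The sketch below mirrors them one by one; it is an illustration written for this guide, and `validate_config` is a hypothetical function rather than the validation that snowflakekit itself performs.

```python
def validate_config(time_bits: int, node_bits: int, worker_bits: int,
                    node_id: int, worker_id: int, total_bits: int = 64) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []

    # The sequence gets whatever is left after the sign bit and other fields,
    # so the total equals `total_bits` by construction.
    seq_bits = total_bits - 1 - time_bits - node_bits - worker_bits

    widths = {"time": time_bits, "node": node_bits,
              "worker": worker_bits, "sequence": seq_bits}
    for name, bits in widths.items():
        if bits <= 0:
            errors.append(f"{name} bits must be positive, got {bits}")

    # Only range-check IDs when the corresponding width is itself valid.
    if node_bits > 0 and not 0 <= node_id < (1 << node_bits):
        errors.append(f"node ID {node_id} does not fit in {node_bits} bits")
    if worker_bits > 0 and not 0 <= worker_id < (1 << worker_bits):
        errors.append(f"worker ID {worker_id} does not fit in {worker_bits} bits")

    return errors


print(validate_config(39, 5, 8, node_id=1, worker_id=1))
print(validate_config(39, 5, 8, node_id=32, worker_id=1))
```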

Troubleshooting¶

ImportError: No module named 'snowflakekit'¶

Solution:

pip install snowflakekit

ValueError: All configuration parameters must be provided¶

Solution: Set all required environment variables:

# .env
SNOWFLAKE_EPOCH=1704067200000
SNOWFLAKE_NODE_ID=1
SNOWFLAKE_WORKER_ID=1
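A pre-flight check can surface missing variables with a clearer message before the generator is constructed. This is a minimal sketch using only the standard library; the variable names come from this guide, while `missing_snowflake_vars` is a hypothetical helper.

```python
import os

# Variables this guide lists as required.
REQUIRED_VARS = ("SNOWFLAKE_EPOCH", "SNOWFLAKE_NODE_ID", "SNOWFLAKE_WORKER_ID")


def missing_snowflake_vars(env=None) -> list:
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_snowflake_vars()
if missing:
    print("Missing Snowflake configuration:", ", ".join(missing))
else:
    print("All required Snowflake variables are set")
```

If the values live in a .env file, make sure that file is loaded into the process environment before this check runs.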

Duplicate IDs Generated¶

Possible causes:

1. Same NODE_ID and WORKER_ID on multiple servers
2. System clock went backwards
3. Generating IDs faster than supported (more than 2048 per millisecond per worker with the default 11 sequence bits)

Solutions:

- Ensure unique NODE_ID/WORKER_ID combinations per server instance
- Use NTP to keep clocks synchronized
- Increase sequence bits if higher throughput is needed
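To see why a backwards clock or an exhausted sequence can lead to duplicates, it helps to look at how a Snowflake-style generator typically guards against both. The class below is a simplified, single-threaded sketch of the general algorithm under the default bit widths; it is not the snowflakekit implementation, and `SketchSnowflake` and its `clock` parameter are names invented for this example.

```python
import time


class SketchSnowflake:
    """Illustrative Snowflake-style generator (not thread-safe)."""

    def __init__(self, epoch_ms, node_id, worker_id,
                 node_bits=5, worker_bits=8, seq_bits=11, clock=None):
        self.epoch_ms = epoch_ms
        self.seq_mask = (1 << seq_bits) - 1
        self.time_shift = node_bits + worker_bits + seq_bits
        # Node and worker never change, so precompute their bits once.
        self.prefix = ((node_id << worker_bits) | worker_id) << seq_bits
        # `clock` returns the current time in ms; injectable for testing.
        self.clock = clock if clock is not None else (lambda: int(time.time() * 1000))
        self.last_ms = -1
        self.sequence = 0

    def next_id(self):
        now = self.clock()
        if now < self.last_ms:
            # Reusing an earlier millisecond could repeat an old ID.
            raise RuntimeError("clock moved backwards; refusing to generate IDs")
        if now == self.last_ms:
            self.sequence = (self.sequence + 1) & self.seq_mask
            if self.sequence == 0:
                # Sequence exhausted: wait for the next millisecond.
                while now <= self.last_ms:
                    now = self.clock()
        else:
            self.sequence = 0
        self.last_ms = now
        return ((now - self.epoch_ms) << self.time_shift) | self.prefix | self.sequence


gen = SketchSnowflake(epoch_ms=1704067200000, node_id=1, worker_id=1)
first, second = gen.next_id(), gen.next_id()
print(first < second)
```

A generator without these two guards would either reuse a past timestamp or wrap the sequence within the same millisecond, and both produce repeated IDs. Two instances sharing a node and worker pair bypass the guards entirely, since each keeps its own independent state.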

Additional Resources¶

Twitter Snowflake - Original Snowflake algorithm
Configuration Guide - Complete configuration reference
Deployment Guide - Production deployment strategies