Memory

Source example: agentflow/examples/memory/simple_personalized_agent.py

What you will build

A simple personalized agent that remembers user preferences across turns by:

retrieving relevant memories before response generation
injecting those memories into the system prompt
storing the new interaction after the response is produced

Prerequisites

Python 3.11 or later
10xscale-agentflow installed
mem0ai installed
a Google model key such as GOOGLE_API_KEY
MEM0_API_KEY
Qdrant connection settings if you use the same vector-store setup as the example

Install:

pip install 10xscale-agentflow mem0ai python-dotenv

Recommended environment variables:

export GOOGLE_API_KEY=your_google_key
export MEM0_API_KEY=your_mem0_key
export QDRANT_URL=https://your-qdrant-cluster
export QDRANT_API_KEY=your_qdrant_key

Memory architecture

This example uses a three-step graph:

retrieve relevant memory
generate a response
persist the new interaction

Step 1 — Extend state with memory fields

The example adds user_id and memory_context to AgentState:

class MemoryAgentState(AgentState):
    user_id: str = ""
    memory_context: str = ""

memory_context becomes prompt-ready text after the retrieval step.

Step 2 — Configure Mem0

The example initializes Mem0 with:

Qdrant as the vector store
Gemini as the LLM
Gemini embeddings

config = {
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "collection_name": "simple_agent_memory",
            "url": os.getenv("QDRANT_URL"),
            "api_key": os.getenv("QDRANT_API_KEY"),
            "embedding_model_dims": 768,
        },
    },
    "llm": {
        "provider": "gemini",
        "config": {"model": "gemini-2.0-flash-exp", "temperature": 0.1},
    },
    "embedder": {"provider": "gemini", "config": {"model": "models/text-embedding-004"}},
}

self.memory = Memory.from_config(config)

Step 3 — Build a graph around memory retrieval and storage

The graph uses three nodes:

graph = StateGraph[MemoryAgentState](MemoryAgentState())

graph.add_node("memory_retrieval", self._memory_retrieval_node)
graph.add_node("chat", self.response_agent)
graph.add_node("memory_storage", self._memory_storage_node)

graph.set_entry_point("memory_retrieval")
graph.add_edge("memory_retrieval", "chat")
graph.add_edge("chat", "memory_storage")
graph.add_edge("memory_storage", END)

Unlike a ReAct loop, this graph is linear. Each turn always:

fetches context
responds
stores the interaction

Memory retrieval and storage flow

Step 4 — Retrieve relevant memories

The retrieval node searches memory by semantic similarity:

memory_results = self.memory.search(
    query=user_message,
    user_id=user_id,
    limit=3,
)

If results exist, it turns them into prompt text:

state.memory_context = "Relevant memories:\n" + "\n".join([f"- {m}" for m in memories])

That text is then interpolated into the system prompt.

Step 5 — Use memory in the system prompt

The Agent node includes {memory_context}:

system_prompt=[
    {
        "role": "system",
        "content": """You are a helpful AI assistant with memory of past conversations.

{memory_context}

Be conversational, helpful, and reference past interactions when relevant.""",
    },
]

This is the bridge between retrieved vector memory and model behavior.

Step 6 — Store the interaction

After the response is generated, the graph stores the conversation:

interaction = [
    {"role": "user", "content": user_message.content},
    {"role": "assistant", "content": ai_message.content},
]

self.memory.add(
    messages=interaction,
    user_id=state.user_id,
    metadata={"app_id": self.app_id},
)

Step 7 — Chat with a stable user ID

The chat method ties memory to a user:

config = {
    "thread_id": user_id,
    "recursion_limit": 10,
    "user_id": user_id,
}

The same user_id across turns allows later retrieval to find the prior memory.

Run the example

python agentflow/examples/memory/simple_personalized_agent.py

Expected behavior:

the first turn stores user preferences
later turns retrieve them and mention them naturally

Common mistakes

Changing user_id every turn and expecting shared memory.
Forgetting required external keys for Mem0 or Qdrant.
Confusing checkpointed short-term thread state with long-term semantic memory.
Expecting memory retrieval to be exact keyword matching rather than semantic search.

Key concepts

Concept	Details
`memory_retrieval` node	Reads relevant long-term memory before the model runs
`memory_storage` node	Persists the finished interaction after the model responds
`memory_context`	Prompt-ready text derived from semantic search results
`user_id`	Stable identity used for memory partitioning

What you learned

How to add long-term memory to an AgentFlow graph.
How to separate retrieval, response, and storage into different nodes.
How semantic memory becomes prompt context.

Next step

→ Qdrant Memory for a more advanced personalized agent with richer state and session handling.

What you will build​

Prerequisites​

Memory architecture​

Step 1 — Extend state with memory fields​

Step 2 — Configure Mem0​

Step 3 — Build a graph around memory retrieval and storage​

Memory retrieval and storage flow​

Step 4 — Retrieve relevant memories​

Step 5 — Use memory in the system prompt​

Step 6 — Store the interaction​

Step 7 — Chat with a stable user ID​

Run the example​

Common mistakes​

Key concepts​

What you learned​

Next step​