Vector databases have become almost synonymous with LLM memory. That association is understandable. Semantic search is powerful, and embeddings underpin modern retrieval augmented generation. However, once we move beyond chatbots and into agentic systems that operate real workflows, the assumption that all memory must live in a vector store starts to unravel.
The underlying issue is simple but often overlooked. Memory and knowledge are different problems, and they benefit from very different storage and retrieval strategies.
Memory and Knowledge Solve Different Jobs
Knowledge retrieval focuses on finding relevant information by meaning, typically from unstructured sources such as policies, research papers, or historical conversations. Memory, in contrast, is about preserving facts, state, preferences, and outcomes over time so that future decisions are consistent, safe, and correct.
Most of what operational agents need to remember is highly structured. Examples include whether consent has been captured, which workflow step is currently active, which actions have already been taken, or which constraints apply to a specific user, patient, or case. These are not fuzzy concepts. They are deterministic facts.
That distinction becomes clearer when viewed side by side.
| Aspect | Knowledge Retrieval | Agent Memory |
|---|---|---|
| Primary goal | Find relevant text by meaning | Preserve facts and state over time |
| Data type | Unstructured or semi structured text | Structured or strongly typed data |
| Typical queries | “Find similar guidance” | “What is the current status?” |
| Tolerance for approximation | High | Very low |
| Compliance suitability | Limited | High |
When an agent needs to know whether a patient has been transferred to theatre, or whether a customer has already been contacted, cosine similarity is not just unnecessary. It introduces risk. In these cases the correct operation is a primary key lookup or a filtered query, not a probabilistic match.
Where Vector Stores Actually Add Value
Vector databases excel when an agent needs to reason over large volumes of narrative content. Common examples include searching clinical notes, locating relevant policies, recalling earlier conversations by topic, or discovering related incidents across a corpus.
Problems emerge when teams attempt to use vector stores as a general purpose memory layer. This often leads to higher cost, increased latency, weaker explainability, and more complex data governance, without improving decision quality for most operational tasks.
In mature agent architectures, vector memory is therefore complementary rather than foundational. The dominant pattern is hybrid memory, where structured systems hold authoritative state and vector systems provide contextual enrichment when semantic reasoning is genuinely required.
What Usually Works Better Than Embeddings for Memory
For most long term agent memory, conventional data stores are not only sufficient, they are preferable.
| Storage type | Why it works well for memory | Typical uses in agents |
|---|---|---|
| Relational databases (Postgres, MySQL) | Strong consistency, transactions, joins, auditability | Profiles, workflow state, approvals, patient records |
| Document databases (MongoDB, DynamoDB) | Flexible schemas, JSON native storage | Session memory, evolving agent state models |
| Key value stores (Redis) | Extremely fast access, ideal for checkpoints | Conversation state, resumable workflows |
| Vector databases | Semantic similarity over text | Notes, policies, summaries, long narratives |
Relational databases are particularly well suited to operational agents because they align with how enterprises already manage critical data. They support transactional safety, deterministic queries, and established governance models, which are essential in regulated environments such as healthcare and finance.
Document databases are useful where agent memory structures evolve rapidly or differ by agent type. Key value stores are commonly used for checkpointing long running workflows, allowing execution to pause and resume safely without reconstructing state.
None of these require embeddings. In many cases, embeddings actively complicate the design.
How Agents Actually Retrieve Memory
Modern agent frameworks do not treat memory as passive context that is blindly injected into prompts. Instead, memory is typically accessed through explicit retrieval steps in the agent graph, often implemented as tools or dedicated nodes.
This allows the agent to reason about whether memory is required, which memory is relevant, and how that information should influence the next action. Retrieval may occur before execution begins to hydrate the graph with known state, or during execution when a decision requires additional context.
From the agent’s perspective, retrieving memory via a SQL query, an API call, or a key value lookup is no different from calling a vector search endpoint. What matters is that the retrieval mechanism matches the nature of the data and the decision being made.
For most operational decisions, deterministic queries are faster, cheaper, easier to explain, and easier to govern.
Why This Matters in Regulated and Operational Domains
In healthcare, public sector, and financial services, memory must be traceable, correct, and explainable. If an agent recommends an action, it must be possible to demonstrate which facts were used and why.
Vector similarity can suggest that two documents are related, but it cannot establish causality or guarantee factual correctness. That makes it unsuitable as the primary store of authoritative memory in environments where auditability is mandatory.
This is why production agent architectures increasingly separate concerns. Structured systems manage truth and state, while vector systems provide supporting context when narrative reasoning is required. This approach also aligns with emerging interoperability patterns such as the Model Context Protocol, where memory and tools are externalised as services rather than embedded inside prompts.
TLDR;
Vector databases are excellent for semantic knowledge retrieval, but they are not a universal memory layer for agents.
If an agent needs to remember facts, preferences, workflow state, or prior decisions, conventional databases are almost always the right starting point. They are cheaper, faster, easier to govern, and better aligned with enterprise and clinical operating models.
Vector memory should be added deliberately, and only where semantic recall over unstructured text genuinely improves reasoning. In well designed agent systems, memory is treated as data first and embeddings second.
References
LangGraph Persistence and Checkpointing Concepts — LangChain
https://langchain-ai.github.io/langgraph/concepts/persistence/
LangChain RAG and Vector Retrieval Patterns
https://python.langchain.com/docs/use_cases/question_answering/
Model Context Protocol (MCP) — Externalised Context and Tools
https://modelcontextprotocol.io/
Anthropic Research: Agent Architectures and Tool Use
https://www.anthropic.com/engineering/building-effective-agents




