Vector databases have become almost synonymous with LLM memory. That association is understandable. Semantic search is powerful, and embeddings underpin modern retrieval augmented generation. However, once we move beyond chatbots and into agentic systems that operate real workflows, the assumption that all memory must live in a vector store starts to unravel.

The underlying issue is simple but often overlooked. Memory and knowledge are different problems, and they benefit from very different storage and retrieval strategies.

Memory and Knowledge Solve Different Jobs

Knowledge retrieval focuses on finding relevant information by meaning, typically from unstructured sources such as policies, research papers, or historical conversations. Memory, in contrast, is about preserving facts, state, preferences, and outcomes over time so that future decisions are consistent, safe, and correct.

“We reached for embeddings out of habit, not because the agent actually needed them.”

Most of what operational agents need to remember is highly structured. Examples include whether consent has been captured, which workflow step is currently active, which actions have already been taken, or which constraints apply to a specific user, patient, or case. These are not fuzzy concepts. They are deterministic facts.

That distinction becomes clearer when viewed side by side.

Aspect	Knowledge Retrieval	Agent Memory
Primary goal	Find relevant text by meaning	Preserve facts and state over time
Data type	Unstructured or semi structured text	Structured or strongly typed data
Typical queries	“Find similar guidance”	“What is the current status?”
Tolerance for approximation	High	Very low
Compliance suitability	Limited	High

When an agent needs to know whether a patient has been transferred to theatre, or whether a customer has already been contacted, cosine similarity is not just unnecessary. It introduces risk. In these cases the correct operation is a primary key lookup or a filtered query, not a probabilistic match.

Where Vector Stores Actually Add Value

Vector databases excel when an agent needs to reason over large volumes of narrative content. Common examples include searching clinical notes, locating relevant policies, recalling earlier conversations by topic, or discovering related incidents across a corpus.

Problems emerge when teams attempt to use vector stores as a general purpose memory layer. This often leads to higher cost, increased latency, weaker explainability, and more complex data governance, without improving decision quality for most operational tasks.

In mature agent architectures, vector memory is therefore complementary rather than foundational. The dominant pattern is hybrid memory, where structured systems hold authoritative state and vector systems provide contextual enrichment when semantic reasoning is genuinely required.

What Usually Works Better Than Embeddings for Memory

For most long term agent memory, conventional data stores are not only sufficient, they are preferable.

Storage type	Why it works well for memory	Typical uses in agents
Relational databases (Postgres, MySQL)	Strong consistency, transactions, joins, auditability	Profiles, workflow state, approvals, patient records
Document databases (MongoDB, DynamoDB)	Flexible schemas, JSON native storage	Session memory, evolving agent state models
Key value stores (Redis)	Extremely fast access, ideal for checkpoints	Conversation state, resumable workflows
Vector databases	Semantic similarity over text	Notes, policies, summaries, long narratives

Relational databases are particularly well suited to operational agents because they align with how enterprises already manage critical data. They support transactional safety, deterministic queries, and established governance models, which are essential in regulated environments such as healthcare and finance.

Document databases are useful where agent memory structures evolve rapidly or differ by agent type. Key value stores are commonly used for checkpointing long running workflows, allowing execution to pause and resume safely without reconstructing state.

None of these require embeddings. In many cases, embeddings actively complicate the design.

How Agents Actually Retrieve Memory

Modern agent frameworks do not treat memory as passive context that is blindly injected into prompts. Instead, memory is typically accessed through explicit retrieval steps in the agent graph, often implemented as tools or dedicated nodes.

This allows the agent to reason about whether memory is required, which memory is relevant, and how that information should influence the next action. Retrieval may occur before execution begins to hydrate the graph with known state, or during execution when a decision requires additional context.

From the agent’s perspective, retrieving memory via a SQL query, an API call, or a key value lookup is no different from calling a vector search endpoint. What matters is that the retrieval mechanism matches the nature of the data and the decision being made.

For most operational decisions, deterministic queries are faster, cheaper, easier to explain, and easier to govern.

Why This Matters in Regulated and Operational Domains

In healthcare, public sector, and financial services, memory must be traceable, correct, and explainable. If an agent recommends an action, it must be possible to demonstrate which facts were used and why.

Vector similarity can suggest that two documents are related, but it cannot establish causality or guarantee factual correctness. That makes it unsuitable as the primary store of authoritative memory in environments where auditability is mandatory.

This is why production agent architectures increasingly separate concerns. Structured systems manage truth and state, while vector systems provide supporting context when narrative reasoning is required. This approach also aligns with emerging interoperability patterns such as the Model Context Protocol, where memory and tools are externalised as services rather than embedded inside prompts.

TLDR;

Vector databases are excellent for semantic knowledge retrieval, but they are not a universal memory layer for agents.

If an agent needs to remember facts, preferences, workflow state, or prior decisions, conventional databases are almost always the right starting point. They are cheaper, faster, easier to govern, and better aligned with enterprise and clinical operating models.

Vector memory should be added deliberately, and only where semantic recall over unstructured text genuinely improves reasoning. In well designed agent systems, memory is treated as data first and embeddings second.

References

LangGraph Persistence and Checkpointing Concepts — LangChain
https://langchain-ai.github.io/langgraph/concepts/persistence/

LangChain RAG and Vector Retrieval Patterns
https://python.langchain.com/docs/use_cases/question_answering/

Model Context Protocol (MCP) — Externalised Context and Tools
https://modelcontextprotocol.io/

Anthropic Research: Agent Architectures and Tool Use
https://www.anthropic.com/engineering/building-effective-agents

About the Author

Peter Wood

Healthcare technology leader specialising in data platforms, operational intelligence, and agent-driven automation. Peter has led large-scale digital transformation programmes with major hospital groups and global technology partners, translating advanced analytics and AI into measurable improvements in clinical operations, capacity, and patient flow.

Share on LinkedIn ↗← Back to All Articles

You Don’t Need a Vector Store for Agent Memory in Most Agent Architectures