Agents Are the New Database Operator
Why agentic systems need a new database category.
Ragnor Comerford

The operator defines the category
Database categories form around their dominant operator.
OLTP databases were built around trusted application code mutating current state. Warehouses were built around analysts and batch jobs scanning historical state. Search engines were built around ranked retrieval. Logs were built around append and replay.
Agents introduce a different operator profile. An agent reads, reasons, waits, resumes, delegates, writes, and sometimes gets things wrong. It may run for minutes, hours, or days. It may work in parallel with other agents on the same customer, incident, account, codebase, contract, or research question.
That changes three parts of the database workload.
The read side becomes context assembly. The write side becomes reviewable mutation. The shared-state side becomes coordination.
Context assembly
An agent's context read is rarely a single lookup. It often starts from an entity or task, follows relationships, searches exact text, retrieves semantically similar examples, filters by type and permission, ranks evidence into a context window, and needs to know which version of the world it is reading from.
That workload has different failure modes from human search. A human can skim past a weak result. An agent may act on it. Missing a key policy is a recall failure. Including irrelevant evidence is a precision failure. Returning stale indexed data is a freshness failure. Hiding provenance is an audit failure. Returning data outside the actor's authority is a policy failure.
So retrieval quality becomes a database concern. The database has to compose graph traversal, full-text search, vector retrieval, scalar filters, permissions, provenance, and time. It also has to do this with interactive latency, because context reads sit inside agent loops. Heavy index maintenance can be asynchronous, but the freshness contract has to be explicit.
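A toy in-memory sketch of what such a composed read might look like (illustrative names, not Omnigraph's API): permission filtering happens before ranking so unauthorized evidence never enters the candidate set, text and vector scores are blended, and every result carries its snapshot version as provenance.

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    node_id: str
    text_score: float   # from full-text search
    vec_score: float    # from vector retrieval
    allowed: bool       # result of the actor's permission check
    snapshot: str       # version of the world this evidence was read from

def assemble_context(candidates, k, w_text=0.5, w_vec=0.5):
    """Drop unauthorized evidence, blend retrieval scores, return top-k."""
    visible = [e for e in candidates if e.allowed]
    ranked = sorted(
        visible,
        key=lambda e: w_text * e.text_score + w_vec * e.vec_score,
        reverse=True,
    )
    return ranked[:k]

candidates = [
    Evidence("policy:42", 0.90, 0.40, True, "snap-17"),
    Evidence("ticket:7", 0.20, 0.80, True, "snap-17"),
    Evidence("contract:3", 0.95, 0.90, False, "snap-17"),  # outside the actor's authority
]
top = assemble_context(candidates, k=2)
```

The ordering here (permissions, then ranking, then truncation) is the point: each stage maps to one of the failure modes above, and the snapshot field is what makes the read auditable later.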
Reviewable mutation
The write side is harder.
A traditional application usually knows what it wants to write. An agent often derives a hypothesis, proposes a change, and needs review before that proposal becomes accepted state. Treating every agent output as a direct mutation of shared truth is the wrong default for many workflows.
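The propose-then-review default can be illustrated with plain dictionaries standing in for graph state (the names `diff`, `main`, and `branch` are illustrative, not a real API): the agent mutates a branch, the diff is what gets reviewed, and only accepted fields reach shared truth.

```python
def diff(base: dict, branch: dict) -> dict:
    """Field-level diff: key -> (accepted value, proposed value)."""
    return {k: (base.get(k), v) for k, v in branch.items() if base.get(k) != v}

main = {"risk": "low", "owner": "dana"}

branch = dict(main)                      # branch off accepted state
branch["risk"] = "high"                  # agent's proposed mutation
branch["mitigation"] = "call customer"   # agent's proposed addition

proposed = diff(main, branch)
# A reviewer accepts the risk change and rejects the mitigation.
accepted = {"risk"}
main.update({k: new for k, (old, new) in proposed.items() if k in accepted})
```

Note that the agent never writes to `main` directly; the only path from proposal to accepted state runs through the diff.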
Coordination
Coordination is the third pressure.
Multi-agent systems need more than private memory and tool calls. Agents need to see what has already been accepted, what is being proposed, who is working on which part of the graph, which state changed, and where their proposed work conflicts with someone else's. Some of that is control flow. But much of it is state. The database should coordinate agents at the state layer: branch heads, commits, events, diffs, actor identity, and merge status.
Old database questions, new answers
That brings the old database questions back into the center of the problem.
What commits atomically? What can run concurrently? What isolation level does a long-running task read from? What happens when two agents propose conflicting changes to the same account graph? How fresh are the vector and text indexes behind a context read? What state survives a crash? Can we reconstruct the exact state an agent saw when it acted? Can another agent subscribe to the change, inspect the diff, and continue from the same state?
These are database primitives: latency, concurrency, consistency, isolation, durability, recovery, cost, and observability. The agentic workload changes the answers.
A fallible writer needs proposed state. That points to branches. Parallel agents need isolation and conflict detection. That points to diffs and merges. Long-running agents need stable worlds to reason over. That points to snapshot reads and point-in-time queries. Multi-entity actions need atomicity across the graph, because half a graph mutation can be worse than a failed one. Coordinating agents need observable state transitions. That points to commits, change feeds, and evented graph state. Accountable systems need actor identity, provenance, and commit history.
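Several of these answers can be shown with one toy structure: an immutable-snapshot store where a commit applies a multi-entity change atomically and a reader can pin any past version. This is illustrative only; a real system adds branches, indexes, and conflict handling on top.

```python
class VersionedGraph:
    """Each commit writes a new immutable snapshot; reads can pin a version."""

    def __init__(self):
        self.snapshots = [{}]  # version 0: empty graph

    def commit(self, changes: dict) -> int:
        """Apply a multi-entity change atomically: all nodes land, or none do."""
        new = {**self.snapshots[-1], **changes}
        self.snapshots.append(new)
        return len(self.snapshots) - 1

    def read(self, node, version=None):
        snap = self.snapshots[-1 if version is None else version]
        return snap.get(node)

g = VersionedGraph()
v1 = g.commit({"cust:1": {"risk": "low"}, "task:5": {"status": "open"}})
v2 = g.commit({"cust:1": {"risk": "high"}})
```

A long-running agent that pinned `v1` keeps reasoning over a stable world even after the `v2` commit, which is exactly the snapshot-read guarantee the paragraph above asks for.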
A branchable context graph
The shape that falls out is a context graph — but a specific kind. The category of context graphs is real and is being argued for from several angles, mostly read-side: decision traces, process capture, semantic retrieval over enterprise reality. Our position is that the durable shape of the category is set by how it handles writes and coordination, not only how it serves reads. We call that shape a branchable operational graph: a context graph designed for the agent operator that needs to read, propose, coordinate, review, and merge changes to shared business state.
The graph matters because company context is relational: customers, people, tickets, documents, code, meetings, decisions, tasks, systems, policies, evidence, events, and proposed changes. The branch matters because agent work often starts as a proposal. The operational part matters because this state is used to coordinate action, not only to answer analytical questions after the fact.
Existing database categories cover important projections of this workload.
Vector databases solve semantic nearest-neighbor retrieval. Search engines solve ranked text retrieval. Graph databases solve relationship-local queries. OLTP databases solve trusted current-state mutation. Warehouses and lakehouses solve durable analytical history. Event logs solve append, replay, and fanout.
Agentic teams need pieces of those systems under one governance, consistency, and history model. Reads inform writes. Writes become future context. Commits become coordination events. Review needs diffs. Audit needs time.
Why this can exist now
This category also became practical because the substrate changed.
Object storage is now cheap durable shared memory for organizations. Open table formats made data ownership and engine plurality normal. Lakehouse systems made snapshots, history, schema evolution, and compute/storage separation familiar. Lance adds a useful base for multi-modal columnar data: structured fields, vectors, text, blobs, versions, and object-store-native layout. DataFusion gives us serious query execution machinery in Rust without rebuilding the relational engine from scratch.
The workload explains why the category is needed. The substrate explains why it can exist now.
Omnigraph
Omnigraph is our implementation of this shape.
It is a lakehouse-native, versioned knowledge substrate for agentic teams: a typed operational graph where humans and agents assemble context, coordinate through shared state and events, and safely merge proposed changes into accepted operational truth.
Concretely:
| Requirement | Omnigraph mechanic |
|---|---|
| Shared meaning | Typed schema / ontology |
| Relationship-local context | Graph model |
| Context assembly | Graph + vector + full-text in one runtime |
| Evidence beyond rows | Native blobs for files, images, and video |
| Safe agent writes | Branches |
| Reviewable mutation | Diff and merge |
| State-layer coordination | Commits, events, branch heads, merge status |
| Stable reasoning | Point-in-time queries |
| Multi-entity updates | Graph-level commits |
| Durable history | Lance / object-store-native storage |
| Query execution | DataFusion + Rust |
A typical workflow
A typical workflow looks like this.
An agent investigates a customer at risk. It starts from the customer node, walks the account graph, retrieves tickets, transcripts, product events, docs, prior decisions, and similar historical cases. Some evidence comes from exact filters. Some comes from full-text search. Some comes from vector retrieval. The useful result is a ranked, permission-safe, provenance-bearing context set, tied to a specific snapshot.
The agent then writes proposed changes on a branch: a new risk assessment, evidence links, suggested actions, affected owners, and follow-up tasks. That branch becomes visible state. Another agent can subscribe to the change event, inspect the proposed graph, add contract terms and recent usage patterns, or flag a conflict with an existing mitigation plan. A human reviews the diff, accepts part of it, rejects part of it, and merges the accepted state to main.
Later, the team can reconstruct what the agent saw, what it proposed, who reviewed it, what changed, which events fired, and which state became accepted. That is the difference between an agent transcript and an operational system of record.
Hard problems and the category boundary
There are hard problems here.
Semantic merge is harder than row merge. Index freshness needs explicit contracts. Provenance has to be modeled into the data, not added as a comment field after the fact. Policy should push into query planning so unauthorized context never reaches the model. Branches can create review overload if the diff quality is poor. Context quality needs measurement: recall, precision, freshness, permission safety, and evidence coverage. Coordination needs clear boundaries between database state and workflow control. Hot entities and graph partitioning become real systems problems as agent count grows.
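Even the easy part of semantic merge, detecting that two branches touch the same field with different values, only scratches the surface; a real merge needs domain knowledge that no key-level check can supply. A sketch of that baseline check (`conflicts` is a hypothetical helper, not a real API):

```python
def conflicts(base: dict, a: dict, b: dict) -> set:
    """Keys where both branches diverge from base and disagree with each other."""
    return {
        k
        for k in set(a) | set(b)
        if a.get(k) != base.get(k)      # branch a changed it
        and b.get(k) != base.get(k)     # branch b changed it too
        and a.get(k) != b.get(k)        # and they disagree
    }

base = {"plan": "monitor"}
branch_a = {"plan": "discount"}   # agent A's mitigation
branch_b = {"plan": "escalate"}   # agent B's mitigation
clashes = conflicts(base, branch_a, branch_b)
```

This flags that the two plans collide, but deciding whether a discount and an escalation are actually compatible is a semantic question, which is why merge quality sits inside the category boundary rather than outside it.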
Those problems are the category boundary.
A database starts to belong in this category when it makes the following native: hybrid context retrieval, typed operational graph state, branch-local proposed writes, graph-level commits, state-layer coordination, diff and merge, point-in-time reconstruction, provenance, actor-aware governance, and durable object-store history.
Reading, writing, and coordination as one loop
Agents need a database that treats reading, writing, and coordination as one loop. Context becomes action. Action becomes reviewed state. Reviewed state becomes coordination. Coordination produces future context.