# Why the SOC graph database keeps losing

**Category:** Architecture · **Author:** D. Halevy · **Date:** 2026-05-17 · **Reading time:** 6 min · **Tags:** architecture, graph db, investigation, agents

> *Hidden draft. Not yet published.*

Every five years, a security startup decides the answer is a graph database. The pitch is always the same: alerts are entities, the entities have relationships, the relationships matter more than the entities, ergo, graph. Uplevel did it. Graphistry did it. Hunters did it. Anvilogic ships one. There's a long tail of seed-stage companies on the AI SOC map doing it again right now.

The graph database is not winning. It's been arriving for a decade and the SOC has consistently rejected it. I want to write down why, because the same pattern is about to repeat with agentic SOC products, and the temptation to reach for the graph is going to be strongest exactly when you should resist it.

## What the graph promises

A persistent enterprise graph stores every entity (user, host, process, IP, file, hash, account, role, asset) and every observed relationship between them, accumulated over time. Run a detection by traversing it. Run an investigation by querying it. Run hunting by pattern-matching subgraphs.

It's beautiful on a whiteboard. The relationships *do* matter. Lateral movement is a graph traversal. Privilege escalation is a graph traversal. Living-off-the-land detection is a subgraph match.

Everyone who has tried to operate one in production has hit the same three walls.

## Wall one: the schema is wrong by Tuesday

The graph's value depends on a stable schema for entities and relationships. SOC data has no stable schema. EDR vendors rename fields between minor releases. Identity providers add scopes. New SaaS apps show up with new asset types twice a quarter. Cloud providers refactor the relationship between IAM principals and resources every time they ship a new product.

A SIEM survives this because it doesn't model anything — it stores logs and the model lives in the queries. A graph database can't survive it because the model *is* the database. Migrating relationships across a billion-edge graph every time Okta ships a new scope type is not a maintenance burden anyone budgeted for.
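To make "the model lives in the queries" concrete, here is a minimal schema-on-read sketch. The field names, the alias table, and `read_field` are all hypothetical and not taken from any real vendor schema. The point is only that a rename is absorbed by adding one alias at read time, with nothing stored being rewritten.

```python
# Hypothetical alias table: one logical field, several raw names seen over time.
# The stored events are never rewritten; only this read-side mapping changes.
FIELD_ALIASES = {
    "process_name": ("process_name", "proc_name", "image_file_name"),
}


def read_field(event: dict, logical_name: str):
    """Return the value of the first known alias present in a raw event."""
    for key in FIELD_ALIASES.get(logical_name, (logical_name,)):
        if key in event:
            return event[key]
    return None


old_event = {"proc_name": "powershell.exe"}        # before the vendor rename
new_event = {"image_file_name": "powershell.exe"}  # after the vendor rename
```

Both events resolve to the same logical `process_name` without touching stored data. A persistent graph has no equivalent move, because the renamed field is already baked into node and edge types.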

## Wall two: relationships rot

The graph treats edges as facts. Most SOC edges are not facts. They are observations with a half-life.

"User X accessed Server Y" is true on Tuesday. It is misleading on Friday and dangerous on the following Tuesday, because the access was a one-off and the user's role has since changed. The graph doesn't know that. It carries the edge forward indefinitely, and the detections built on top of the graph fire on stale relationships forever.

The honest fix is to time-decay every edge, which means edges are no longer simple booleans; they're probabilities with timestamps. Once you've done that, you've reinvented a time-series store with a join, which is what a SIEM is.
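A rough sketch of what "probabilities with timestamps" could look like, assuming simple exponential decay. The three-day half-life, the `ObservedEdge` type, and `edge_confidence` are illustrative assumptions, not a recommendation for any particular relationship type.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ObservedEdge:
    src: str
    relation: str
    dst: str
    observed_at: datetime  # when the relationship was last seen


def edge_confidence(edge: ObservedEdge, now: datetime,
                    half_life: timedelta = timedelta(days=3)) -> float:
    """Exponential decay: confidence halves every `half_life` since observation."""
    age_seconds = max((now - edge.observed_at).total_seconds(), 0.0)
    return 0.5 ** (age_seconds / half_life.total_seconds())


tuesday = datetime(2026, 5, 12, tzinfo=timezone.utc)
edge = ObservedEdge("user-x", "accessed", "server-y", observed_at=tuesday)
```

Under these assumptions the edge scores 1.0 at observation time, 0.5 three days later, and 0.25 after six days. Every traversal now has to carry a timestamp and a threshold, which is the point at which the graph starts behaving like a time-series query.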

## Wall three: agents don't want the graph

This is the new wall, and it's the one nobody saw coming when the graph pitch was being refined in 2019.

An LLM investigating an alert doesn't need the entire enterprise graph. It needs the small evidence DAG relevant to *this case*, materialised at investigation time, scoped to the entities the alert touches and the two or three hops the agent decides are useful. The agent constructs that DAG by issuing targeted queries against the underlying data sources directly, lazily, only as it needs them.

The persistent graph is a precomputed answer to the wrong question. The agent doesn't want "all the relationships we've ever observed." It wants "the relationships that are true *right now*, for the five entities I'm looking at, with the freshness of a query I just issued."

The graph database is optimised for read-heavy traversal across long-lived edges. The agent's access pattern is the opposite: write-rare, read-once, with the in-memory evidence thrown away when the case closes. The two architectures are not compatible.
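The lazy, hop-bounded expansion described in this section can be sketched as a breadth-first walk from the alert's entities. This is an illustration under stated assumptions: `fetch` is a hypothetical callback standing in for live EDR, IdP, or cloud API queries, and `build_case_dag` is not any product's actual interface. Edges are kept only when they point further from the alert, which keeps the result acyclic.

```python
from collections import deque
from typing import Callable, Iterable

# Hypothetical callback: given an entity, query live sources and return
# (relation, neighbour) pairs. Nothing is read from a precomputed graph.
Fetch = Callable[[str], Iterable[tuple[str, str]]]


def build_case_dag(seeds: list[str], fetch: Fetch,
                   max_hops: int = 2) -> list[tuple[str, str, str]]:
    """Expand outward from the alert's entities, querying only as needed."""
    depth = {seed: 0 for seed in seeds}
    edges: list[tuple[str, str, str]] = []
    queue = deque(seeds)
    while queue:
        entity = queue.popleft()
        if depth[entity] >= max_hops:
            continue  # hop budget spent: no query is issued for this entity
        for relation, neighbour in fetch(entity):
            if neighbour not in depth:
                depth[neighbour] = depth[entity] + 1
                queue.append(neighbour)
            # Keep only edges that move away from the alert, so no cycles form.
            if depth[neighbour] > depth[entity]:
                edges.append((entity, relation, neighbour))
    return edges
```

The hop budget is what keeps the query count proportional to the case rather than to the enterprise: entities at the boundary are recorded but never expanded.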

## What replaces it

The pattern that's working is unglamorous. For each case, the agent constructs an **ephemeral evidence DAG**: a small, case-scoped, in-memory graph built from live queries against the source telemetry. It carries the entities the alert touches, the immediate relationships pulled from EDR/IdP/cloud APIs, and the pivots the agent has explored. When the case closes, the DAG is serialised as part of the handoff packet (with its `ruled_out` and `failed_pivots` blocks intact) and the in-memory representation is discarded.
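One possible shape for that structure, as a sketch rather than a spec. Only `ruled_out` and `failed_pivots` come from the description above; `EvidenceEdge`, `EvidenceDag`, `case_id`, `source`, `observed_at`, and `to_packet` are hypothetical names chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class EvidenceEdge:
    src: str
    relation: str
    dst: str
    source: str       # which live API produced this edge (provenance)
    observed_at: str  # ISO 8601 timestamp of the query that observed it


@dataclass
class EvidenceDag:
    case_id: str
    edges: list[EvidenceEdge] = field(default_factory=list)
    ruled_out: list[str] = field(default_factory=list)      # hypotheses eliminated
    failed_pivots: list[str] = field(default_factory=list)  # queries that led nowhere

    def to_packet(self) -> str:
        """Serialise the whole DAG for handoff; the caller then drops the object."""
        return json.dumps(asdict(self), sort_keys=True)


dag = EvidenceDag(case_id="case-001")
dag.edges.append(EvidenceEdge("user-x", "accessed", "server-y",
                              source="edr", observed_at="2026-05-12T09:30:00Z"))
dag.ruled_out.append("credential stuffing")
dag.failed_pivots.append("dns lookups for server-y")
packet = dag.to_packet()
```

Keeping the dead ends in the packet matters as much as keeping the edges: the next analyst or agent can see what was already checked without re-running it.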

The persistent layer is the source data, queried in place through federated queries. The temporary layer is the DAG. There is no global graph in between.

This works because it inverts the staleness problem. Edges in an ephemeral DAG are by definition fresh — they were observed during this investigation, against the live data, with timestamps. The DAG knows what it doesn't know. The enterprise graph doesn't.

## When the graph still works

In two cases, narrowly. Asset inventory is a graph problem, and it's a small, slow-moving one with a hand-written schema; the graph databases that survive in security tend to be these. Identity governance — who has access to what, granted by whom, expiring when — is also a small enough graph to maintain manually. These are the only two places I'd reach for one in 2026.

Anywhere else, the answer is a SIEM (or columnar log store) for retention, and an agent that builds the graph it needs at investigation time.

## What this means for vendor evaluation

Three quick tests when someone pitches you a graph-backed AI SOC platform:

1. Ask what happens when the EDR vendor renames a field next quarter. If the answer involves a migration job, the architecture is fragile by design.
2. Ask how edges decay. If there isn't a per-edge TTL or a time-weighted relationship score, the graph will be lying to you within a month.
3. Ask to see an example investigation packet. If the graph is the artifact, the agent is reading the wrong thing. If the artifact is a case-scoped DAG with provenance, they figured it out.

The graph database is a beautiful idea that has been rejected by reality three times. It's about to be rejected a fourth. Build your agents around the shape of the actual investigation, not the shape of the data on a whiteboard.

---

*Disagree? Send the counter-argument: hello@tandemtrace.ai.*
