The handoff packet: a proposed schema for AI→human escalations

Every autonomous SOC vendor has a triage agent. Almost none of them have agreed on what the agent should hand the human when it decides to escalate. The product UIs all look different, the JSON looks different, the field names look different, and the analyst on the receiving end pays the cost in lost context every time they switch tools.

This is the thing that should be standard. We're proposing a schema for it.

Why the packet matters more than the UI

Most of the agentic-SOC pitches focus on the agent's decision. The harder problem is what the agent says when it can't decide. An escalation that arrives as a paragraph of summary plus a link to the original alert is, structurally, a worse experience than the original alert was — the human now has to read the agent's reasoning and re-derive the underlying evidence.

A correctly-shaped escalation is the opposite. It gives the human the verdict first, the confidence calibration second, the evidence graph third, and the agent's failed pivots fourth — so that the first 30 seconds of human attention go to the conclusion, the next 30 go to verifying it, and the next 60 go to extending the investigation rather than restarting it.

The packet is the contract.

Required fields

{
  "packet_version": "0.1",
  "case_id": "tt-2026-05-17-00481",
  "verdict": {
    "decision": "escalate",
    "label": "suspicious",
    "confidence": 0.62,
    "confidence_floor": 0.85,
    "rationale": "Identity baseline mismatch + atypical geo + sensitive scope.",
    "reversibility": "soft"
  },
  "evidence": {
    "primary_signals": [...],
    "supporting_signals": [...],
    "ruled_out": [...]
  },
  "pivots": {
    "attempted": [...],
    "succeeded": [...],
    "failed": [...]
  },
  "recommended_next_action": {
    "action": "review_with_user",
    "owner_role": "tier_2_analyst",
    "expected_duration_seconds": 180
  },
  "audit": {
    "agent_version": "trace-7.4.1",
    "model": "claude-opus-4-7",
    "tool_calls": 11,
    "total_runtime_ms": 4380,
    "cost_usd": 0.043
  }
}

Five blocks, six if you count the wrapper. Everything below is a description of why each block exists, not how it should be rendered.

`verdict`

The single block with the most leverage. `decision` is one of `close`, `suspend`, or `escalate`, and yes, three is enough. `suspend` is the case where the agent has a verdict it isn't allowed to act on (because the policy threshold isn't met) but it would be a mistake to bury it in the closed-cases archive.

`confidence` is the calibrated probability the agent assigns to its own decision. `confidence_floor` is the policy threshold for the action. If `confidence < confidence_floor`, the agent must escalate; this is non-negotiable and should be enforced at the schema level, not the prompt level.
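One way to enforce the floor outside the prompt is a validation step that every packet passes through before it is accepted. The sketch below is illustrative only: `check_verdict` is a hypothetical name, and because `suspend` is described above as a below-threshold outcome, the sketch reads "must escalate" as "must not close", which is an assumption rather than part of the schema.

```python
ALLOWED_DECISIONS = {"close", "suspend", "escalate"}


def check_verdict(verdict: dict) -> list[str]:
    """Return a list of violations; an empty list means the verdict passes."""
    problems = []
    decision = verdict.get("decision")
    confidence = verdict.get("confidence")
    floor = verdict.get("confidence_floor")

    if decision not in ALLOWED_DECISIONS:
        problems.append(f"unknown decision: {decision!r}")

    for name, value in (("confidence", confidence), ("confidence_floor", floor)):
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not 0.0 <= value <= 1.0:
            problems.append(f"{name} must be a number in [0, 1]")

    # Assumption: below the floor the agent may escalate or suspend, never close.
    if not problems and confidence < floor and decision == "close":
        problems.append("confidence below confidence_floor: close is not allowed")
    return problems
```

Running this in the ingestion path rather than inside the agent means a packet that violates the floor is rejected regardless of what the prompt told the model to do.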

`reversibility` is `hard`, `soft`, or `none`. A hard-reversibility action (disabling a user, isolating an endpoint) requires a much higher confidence floor than a soft one (closing an alert as benign). The packet carries this so the receiver can audit it.
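As a rough illustration of what that audit could look like on the receiving side, the sketch below compares the packet's floor with a policy table. The table, its numbers, and the function name are hypothetical placeholders. `none` is left out because the text does not say where its floor should sit.

```python
# Hypothetical policy table: minimum confidence floor per reversibility class.
# The numbers are placeholders, not recommendations.
EXAMPLE_POLICY = {"hard": 0.95, "soft": 0.85}


def floor_meets_policy(verdict: dict, policy: dict[str, float]) -> bool:
    """True if the packet's confidence_floor is at least the policy minimum."""
    required = policy[verdict["reversibility"]]  # KeyError for unlisted classes
    return verdict["confidence_floor"] >= required
```

With this table, the sample verdict above (`soft`, floor 0.85) passes, while the same floor on a `hard` action would be flagged.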

`evidence`

Three buckets, not one. `primary_signals` are the artifacts the verdict was built on. `supporting_signals` are corroborating artifacts the agent pulled but the verdict doesn't depend on. `ruled_out` is the most valuable block and the one nobody publishes: the hypotheses the agent considered and discarded, with the reason.

A human reviewing an escalation needs to know what the agent already eliminated; otherwise they'll redo the same work. This is the single biggest source of wasted analyst time in human-AI SOC handoffs today.
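The schema above leaves the contents of each bucket open. As one possible shape, the sketch below assumes every `ruled_out` entry carries a `hypothesis` and a `reason` (both key names are hypothetical) and rejects entries that omit either, since a discarded hypothesis without its reason does not save the reviewer any work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuledOut:
    hypothesis: str  # what the agent considered
    reason: str      # why the agent discarded it


def already_eliminated(evidence: dict) -> list[RuledOut]:
    """Parse evidence["ruled_out"], rejecting entries that lack a reason."""
    entries = []
    for item in evidence.get("ruled_out", []):
        if not item.get("hypothesis") or not item.get("reason"):
            raise ValueError(f"ruled_out entry needs hypothesis and reason: {item!r}")
        entries.append(RuledOut(item["hypothesis"], item["reason"]))
    return entries
```

A receiving tool could show this list ahead of the raw signals so the analyst sees what is already off the table.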

`pivots`

The agent's trace through the data. `attempted` is the chronological list of queries, integrations called, and lookups performed. `succeeded` is the subset that returned non-empty answers; `failed` is the subset that timed out, returned errors, or hit empty results.

Failed pivots are diagnostic gold. An agent that escalated because the EDR API was rate-limited is a fundamentally different escalation than one that escalated because the EDR returned data that didn't support a verdict. The first is an infrastructure problem; the second is a real ambiguity. The packet has to distinguish them.
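Since the failed list mixes timeouts, errors, and empty results, one way to keep the two kinds of escalation apart is a per-entry failure code. The sketch below assumes each failed pivot has a `failure` key with one of a few hypothetical values. Neither the key nor the values are defined by the schema.

```python
# Hypothetical failure codes; the packet schema does not define them.
INFRASTRUCTURE_FAILURES = {"timeout", "error", "rate_limited"}


def split_failed_pivots(pivots: dict) -> tuple[list[dict], list[dict]]:
    """Separate failed pivots into (infrastructure problems, empty results)."""
    infrastructure, empty = [], []
    for pivot in pivots.get("failed", []):
        kind = pivot.get("failure")
        if kind in INFRASTRUCTURE_FAILURES:
            infrastructure.append(pivot)
        elif kind == "empty":
            empty.append(pivot)
        else:
            raise ValueError(f"failed pivot has no recognised failure code: {pivot!r}")
    return infrastructure, empty
```

An escalation whose failed pivots all land in the first list points at the integration layer. Entries in the second list mean the query ran and found nothing.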

`recommended_next_action`

Optional, but high-value. The agent has more state than the human; it should commit to a specific next step rather than hand off an open-ended "please review." `owner_role` is a role, not a person; the routing layer assigns the actual analyst.

`expected_duration_seconds` is the agent's estimate of how long the human work will take, which is the input to capacity planning. Without it, SOC managers are guessing at how many escalations per hour the human queue can absorb.
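The capacity arithmetic can be sketched as follows. This is a deliberately simple estimate that assumes analysts spend all of their time on escalations and that the agent's duration estimates are accurate. The function name is hypothetical.

```python
def escalations_per_hour(expected_durations_s: list[float], analysts: int) -> float:
    """Estimate queue throughput from the packets' expected_duration_seconds."""
    if not expected_durations_s:
        raise ValueError("need at least one duration estimate")
    mean_seconds = sum(expected_durations_s) / len(expected_durations_s)
    # Assumption: every analyst is fully available for escalation work.
    return analysts * 3600 / mean_seconds


print(escalations_per_hour([180], analysts=1))  # 20.0
```

At the 180 seconds used in the sample packet, one analyst absorbs about 20 escalations per hour under these assumptions.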

`audit`

The fields a regulator, a customer's security team, or a future incident reviewer will want. `agent_version` and `model` are required for reproducibility. `tool_calls`, `total_runtime_ms`, and `cost_usd` are how a CISO budgets the autonomous layer; without them, the agent is a black box with a flat invoice.

What we're not including

We deliberately omitted three things people will ask for.

No free-text narrative summary. The summary is a rendering of the packet, not a field in it. If you put narrative text in the schema, every downstream tool gets a different summary and the packet stops being a contract.

No screenshots or rendered HTML. Same reason. The packet is data; rendering belongs in the tool that receives it.

No "agent's reasoning chain." Chain-of-thought is a property of the model, not the case. If it's useful to ship, it goes under audit as a debug field, not as evidence.
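To illustrate the first two points, a receiving tool can derive its own summary from the packet fields instead of reading one out of the schema. The sketch below is one hypothetical rendering. The function name and the layout are assumptions, and another tool is free to render the same packet differently.

```python
def render_summary(packet: dict) -> str:
    """Build a short plain-text summary from the packet's verdict block."""
    verdict = packet["verdict"]
    lines = [
        f"Case {packet['case_id']}: {verdict['decision']} ({verdict['label']})",
        f"Confidence {verdict['confidence']:.2f} against a floor of "
        f"{verdict['confidence_floor']:.2f}",
        f"Rationale: {verdict['rationale']}",
    ]
    return "\n".join(lines)
```

Because the text is computed from the data, two tools that disagree on wording still agree on the underlying packet.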

How to use this

If you're building an AI SOC product, you can adopt this schema directly. The fields are deliberately small in number and high in leverage — six top-level blocks, none of them controversial, all of them recoverable from a well-structured agent. If you implement it and find a block we missed, tell us.

If you're a SOC lead evaluating vendors, ask each one for an example escalation packet. If it doesn't have `ruled_out` and `pivots.failed`, the agent's investigation isn't visible to your team. You will pay for that in re-derivation cost on every escalation, forever.
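That vendor check is easy to script against a sample packet. A minimal sketch, assuming the vendor's packet follows the block layout proposed here. The function names are hypothetical.

```python
def _has_list(packet: dict, block: str, field: str) -> bool:
    """True if packet[block][field] exists and is a list (possibly empty)."""
    section = packet.get(block)
    return isinstance(section, dict) and isinstance(section.get(field), list)


def investigation_is_visible(packet: dict) -> bool:
    """True if the packet exposes ruled-out hypotheses and failed pivots."""
    return _has_list(packet, "evidence", "ruled_out") and _has_list(
        packet, "pivots", "failed"
    )
```

An empty list still passes, because "nothing was ruled out" is itself information. A missing key is what signals an invisible investigation.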

A schema is not a product. But a missing schema is a tax — and the SOC has paid that tax for a decade.

Comments and counter-proposals welcome. We'll version the schema in public.
