Teaching AI Agents When to Act: Building a Policy Store for the SOC

Every SOC team dreams of automation that acts without hesitation — but never without permission. We built our Policy Store and Policy Engine to do exactly that: a version-controlled framework that teaches AI agents when to act, when to escalate, and when to roll back. It’s now part of our weekly battle rhythm.


The Core Idea

Instead of hardcoded SOAR playbooks, we maintain a living repository of policies. Each policy defines four things (sketched in the example after this list):

  • Trigger conditions — when action is allowed (for example, “ransomware detection with confidence > 0.9”)
  • Permitted actions — what the agent may do (isolate host, disable account, block IP)
  • Escalation thresholds — when to defer to a human
  • Rollback logic — how to safely reverse a change
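
Here is a minimal sketch of what one of these files might look like. The field names and values are illustrative assumptions, not our production schema:

```yaml
# ransomware-isolate.yaml: illustrative policy sketch (field names are assumptions)
name: ransomware-host-isolation
trigger:
  detection: ransomware
  min_confidence: 0.9          # act only on high-confidence detections
actions:
  - isolate_host
  - disable_account
escalation:
  below_confidence: 0.9        # anything weaker goes to a human
  notify: soc-oncall
rollback:
  steps:
    - restore_network_access
    - re_enable_account
```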

Policies are versioned, peer-reviewed, and validated before an agent ever executes them. This keeps autonomy transparent and auditable.

How this differs from a SOAR: Traditional SOAR systems rely on static if-this-then-that playbooks. They execute fixed sequences of actions tied to specific alerts, often without context beyond the detection rule. A policy-driven AI agent, by contrast, evaluates the alert’s confidence, context, and history before deciding. It queries the policy store in real time, applies conditional logic dynamically, and adapts its choice — permit, escalate, or deny — based on defined thresholds. Where SOARs automate tasks, policy agents automate judgment.


How We Built It

1. The Policy Store

We started with a simple Git repository that holds every approved policy in YAML format. Each file is validated against a JSON schema through pre-commit hooks. The schema enforces structure: no malformed logic, no missing rollback paths.
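
A minimal sketch of what such a validation hook might look like, assuming jsonschema and PyYAML; the schema shown is a toy stand-in for ours, matching the policy sketch above:

```python
# validate_policy.py: illustrative pre-commit hook (schema is a toy stand-in)
import sys

import yaml  # PyYAML
from jsonschema import ValidationError, validate

# Minimal schema: every policy needs a trigger, at least one action,
# an escalation rule, and a rollback path.
POLICY_SCHEMA = {
    "type": "object",
    "required": ["name", "trigger", "actions", "escalation", "rollback"],
    "properties": {
        "trigger": {
            "type": "object",
            "required": ["detection", "min_confidence"],
            "properties": {
                "min_confidence": {"type": "number", "minimum": 0, "maximum": 1},
            },
        },
        "actions": {"type": "array", "items": {"type": "string"}, "minItems": 1},
        "rollback": {"type": "object", "required": ["steps"]},
    },
}

def main() -> int:
    failed = False
    for path in sys.argv[1:]:  # pre-commit passes changed files as arguments
        with open(path) as f:
            policy = yaml.safe_load(f)
        try:
            validate(instance=policy, schema=POLICY_SCHEMA)
        except ValidationError as err:
            print(f"{path}: {err.message}")
            failed = True
    return 1 if failed else 0

if __name__ == "__main__":
    sys.exit(main())
```

Wired into a pre-commit config, a check like this rejects a malformed policy before it can even be committed.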

Every pull request goes through code review. The repository becomes our doctrine: the written law of automation.


2. The Policy Engine

Next, we built a microservice that evaluates alerts against stored policies. We used FastAPI for the API layer and Open Policy Agent (OPA) for deterministic decision logic.

When an alert arrives, the engine matches it against stored triggers and runs the OPA evaluation, then returns one of three decisions:

  • Permit — the agent can act.
  • Escalate — the event requires human review.
  • Deny — no match or insufficient confidence.

Every decision is logged with a policy hash, timestamp, and input summary for later audit or replay testing.
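
A minimal sketch of that evaluation path, assuming OPA runs as a sidecar on localhost:8181 with a decision rule published at soc/decision; the endpoint shape, field names, and POLICY_BUNDLE_HASH variable are illustrative assumptions:

```python
# engine.py: illustrative evaluation endpoint (rule path and fields are assumptions)
import hashlib
import json
import logging
import os
from datetime import datetime, timezone

import httpx
from fastapi import FastAPI
from pydantic import BaseModel

OPA_URL = "http://localhost:8181/v1/data/soc/decision"  # assumed OPA sidecar + rule path
POLICY_HASH = os.environ.get("POLICY_BUNDLE_HASH", "dev")  # assumed: set from the Git commit at deploy

app = FastAPI()
log = logging.getLogger("policy_engine")

class Alert(BaseModel):
    detection: str     # e.g. "ransomware"
    confidence: float  # detector confidence, 0..1
    host: str

@app.post("/evaluate")
async def evaluate(alert: Alert) -> dict:
    # Ask OPA for a deterministic decision on this alert.
    async with httpx.AsyncClient() as client:
        resp = await client.post(OPA_URL, json={"input": alert.model_dump()})
        resp.raise_for_status()
        decision = resp.json().get("result", "deny")  # undefined result means deny

    # Audit record: policy hash, timestamp, and input summary for replay.
    record = {
        "decision": decision,
        "policy_hash": POLICY_HASH,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "input": alert.model_dump(),
    }
    log.info("policy decision: %s", json.dumps(record))
    return record
```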


3. Our Battle Rhythm

We treat policy management as a disciplined cycle:

  • Daily: review policy pull requests and CI test results.
  • Weekly: run OPA tests on recent alert data and assess escalation accuracy (see the replay sketch after this list).
  • Monthly: threat-model the policy logic itself, rotate engine images, and verify reproducibility.
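
A minimal sketch of that weekly replay check, assuming recent alerts are recorded as JSON Lines with analyst labels; the file layout and ENGINE_URL are assumptions:

```python
# replay_test.py: illustrative weekly replay check (file layout is an assumption)
import json

import httpx

ENGINE_URL = "http://localhost:8000/evaluate"  # assumed engine endpoint

def replay(alerts_path: str = "recent_alerts.jsonl") -> None:
    """Re-run recorded alerts and compare engine decisions to analyst labels."""
    total = agree = 0
    with open(alerts_path) as f:
        for line in f:
            # Each record: {"alert": {...}, "analyst_decision": "escalate"}
            record = json.loads(line)
            resp = httpx.post(ENGINE_URL, json=record["alert"])
            resp.raise_for_status()
            total += 1
            agree += resp.json()["decision"] == record["analyst_decision"]
    print(f"escalation agreement: {agree}/{total} ({agree / total:.0%})")

if __name__ == "__main__":
    replay()
```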

The rhythm keeps automation aligned with analyst judgment. Every update strengthens — not loosens — control.


What This Changes

With the Policy Store in place, agents act from the same source of truth as analysts. We no longer debate whether an action is “safe.” If it’s encoded and approved, it’s executable. If it isn’t, the engine says no.

This isn’t about full autonomy. It’s about codified judgment — a way for machines to operate inside human intent.


Takeaway

If your SOC is exploring AI-driven response, don’t start with models. Start with policy — make it versioned, explainable, and reviewable. That’s how you teach an agent when to act.
