Zero-Trust for Agents (Paper): Capability Grants, Tripwires, Immutable Logs
New preprint (engrXiv DOI): https://doi.org/10.31224/5792
Agentic AI is powerful—and risky—once tools and data are in reach. This preprint lays out a Zero-Trust architecture for AI agents so you can move fast with guardrails: scoped capability grants, runtime tripwires, and immutable audit logs. It maps controls directly to EU AI Act Article 14 (human oversight) and the NIST AI RMF (Govern / Map / Measure / Manage), and includes a practical threat model, a control↔requirement matrix, KPI/SLOs, and a micro-evaluation harness based on OWASP LLM01/LLM06 and Salesforce-style prompt-injection patterns.
What’s the blueprint?
- Capability grants (least privilege): short-lived, scoped tokens; deny-by-default tools/data; allowlists and ABAC/FGA for precision.
- Tripwires (runtime control): rules + anomaly detection to gate or block actions; human cosign for sensitive ops; kill-switch with p95 override latency SLO.
- Immutable logs (accountability): append-only evidence of prompts, tool calls, outputs, overrides; replay/rollback for fast incident recovery.
Why it matters (regulatory fit)
- EU AI Act, Art. 14: effective human oversight, the ability to interrupt or stop, and documentation of oversight activity.
- NIST AI RMF: continuous risk measurement and mitigation across the lifecycle.
This architecture operationalizes both—without slowing delivery.
What’s inside the paper
- Threat Model (½ page): attacker goals, vectors (LLM01/LLM06), trust boundaries, controls, residual risk.
- Control↔Requirement Matrix: how capability tokens, tripwires, logs, and overrides satisfy Art. 14 and AI RMF functions.
- KPI/SLOs: p95 override latency, % actions gated, audit-log completeness, incident MTTR, token hygiene.
- Micro-evaluation harness: public, reproducible prompts (OWASP + Salesforce-style) to test the control plane’s block rate, FP rate, and latency.
Who this helps
Security architects, platform owners, SRE/ML Ops leads, and compliance/assurance teams who need deployable guardrails for agentic AI—now, not later.
Read the preprint (engrXiv DOI): https://doi.org/10.31224/5792