Chapter 01The Agentic Shift

From Chat to Action

In 2023, your employees started using ChatGPT. You wrote an AI policy. In 2024, your teams adopted copilots. You updated the policy. In 2025, Model Context Protocol (MCP) gave AI models the ability to use tools — read databases, send emails, call APIs, modify files. Anthropic's Agent-to-Agent (A2A) protocol let them talk to each other.

In 2026, AI agents are no longer answering questions. They are executing work. Booking meetings. Approving invoices. Deploying code. Negotiating with other agents across organizational boundaries. The shift isn't incremental — it's categorical.

"Your employees are already using AI agents. You just don't know which ones, with what data, at what cost, and at what risk."

This chapter explains why the agentic shift is fundamentally different from previous AI waves, why your existing controls don't work, and what happens to organizations that don't adapt.

Four Eras of Enterprise AI

Era	Year	Model	Risk Surface	Enterprise Response
Chat	2023	Human asks, AI answers	Data leakage via prompts	Block or ignore
Copilot	2024	AI assists an individual	Code quality, IP concerns	Pilot programs
Agent	2025	AI acts autonomously with tools	Unauthorized actions, data access, cost runaway	Policy documents (insufficient)
Industrial	2026	AI teams with cross-org federation	Identity fraud, privilege escalation, compliance violations, cross-boundary data flow	Governance architecture (required)

Each era expanded the blast radius of AI. A chatbot can leak information. A copilot can write bad code. An agent can take action — send money, delete data, sign contracts. A federated agent network can do all of this across organizational boundaries with other organizations' agents.

The difference isn't just capability — it's accountability. When a chatbot gives wrong advice, a human is still in the loop. When an agent executes a wire transfer based on a spoofed instruction from another organization's agent, who is responsible? Your CISO? The agent's developer? The LLM provider? The other organization?

The Shadow AI Crisis

98%

of organizations report unsanctioned AI use (Vectra, 2025)

90%

of AI use cases stuck in pilot mode (McKinsey, 2025)

40%

of enterprise apps will include AI agents by 2026 (Gartner)

$4.6M

average cost of a shadow AI breach (IBM, 2025)

Shadow AI isn't a future threat — it's the current state of most enterprises. Ninety-eight percent of organizations report unsanctioned AI use (Vectra, 2025). Nearly 47% of generative AI users access tools through personal accounts, completely bypassing enterprise controls. 77% of employees who use AI tools paste sensitive business data into them. And 90% of CISOs say shadow AI is a significant concern — yet fewer than 30% have implemented technical controls beyond policy statements.

Shadow AI-related breaches now carry a cost premium: $4.63 million versus $3.96 million for standard breaches (IBM, 2025). They account for 20% of all breach incidents and growing. The problem isn't that employees are using AI — it's that they have to, because the official channels are too slow, too restrictive, or nonexistent. Shadow AI is a symptom of governance failure, not user misbehavior.

What Shadow AI looks like in 2026

Sales team

A sales rep connects an AI agent to HubSpot using their personal API key. The agent has full CRM read/write access. It sends personalized emails to 500 prospects with hallucinated product claims. The rep leaves the company. The agent keeps running for 3 weeks before anyone notices.

Engineering team

A senior engineer deploys a coding agent with access to production repositories. The agent submits a pull request that passes CI/CD but introduces a subtle vulnerability. The agent's execution history isn't logged anywhere your SOC can see. Six months later, the vulnerability is exploited.

Finance team

The CFO's assistant uses an AI agent to analyze quarterly results from a shared drive. The agent sends the analysis to an external email address the assistant configured for "convenience." The data includes pre-earnings financial results. Nobody knew the agent had email access.

These aren't hypothetical scenarios. They are composites of real incidents reported by enterprises in 2025. The common thread: no identity, no authorization, no audit trail, no kill switch.

Why Policy Documents Fail

The instinctive response to Shadow AI is to write a policy. "Employees must not use unapproved AI tools." "All AI use must be pre-approved by IT." "Data must not be shared with external AI services."

These policies share three fatal flaws:

1. Policies are aspirational, not enforceable

A policy that says "agents must not access PII without approval" has no enforcement mechanism. There is no gate between the agent and the PII. The policy relies on humans reading it, understanding it, and voluntarily complying. In practice, the policy lives in a SharePoint folder that nobody reads.

2. Policies are static, agents are dynamic

An AI agent's behavior changes based on its prompt, its tools, its model version, and the data it encounters. A policy written for GPT-4 may not apply to Claude Opus 4. A policy for a sales agent doesn't cover what happens when that agent delegates work to an engineering agent. Policies can't keep up with the combinatorial explosion of agent behaviors.

3. Policies don't compose across organizations

When your agent talks to a partner's agent via A2A or SLIM protocol, whose policy applies? Your data residency policy says "EU only." Their agent processes data in US-East. There's no runtime mechanism to detect or prevent this. Cross-organizational trust requires infrastructure, not documents.

The fundamental insight

Governance isn't a policy document. It's architecture. It's infrastructure that makes compliance automatic, not aspirational. The answer to "agents must not access PII" isn't a PDF — it's a runtime authorization check that blocks the tool call before PII is touched, logs the attempt, and alerts the security team.

The Governance Gap

Enterprises have mature governance for humans (IAM, RBAC, audit logs, access reviews). They have mature governance for software (CI/CD gates, code review, vulnerability scanning). They have almost nothing for AI agents.

Governance Dimension	Humans	Software	AI Agents
Identity	SSO, badges, biometrics	Service accounts, certs	Shared API keys (if anything)
Authorization	RBAC, least privilege	IAM roles, scoped tokens	Full access or nothing
Audit	Login logs, access reviews	CI/CD logs, SIEM	Console.log (if lucky)
Compliance	Training, attestation	SAST, DAST, pen tests	Nothing
Kill switch	Disable account	Rollback deployment	Hope someone finds the terminal
Cross-org trust	Contracts, NDAs	mTLS, API keys	Trust the other org's word

The gap isn't a matter of missing features — it's a missing category. AI agents are a new class of actor in the enterprise, alongside humans and software. They need their own identity system, their own authorization model, their own audit trail, and their own compliance framework.

What Happens to Organizations That Don't Adapt

Scenario A: The ban

The CISO bans all AI agents. Shadow AI goes deeper underground. Competitors who govern agents properly gain 3-5x productivity advantages. The best engineers leave for companies where they can use modern tools. The organization falls behind and blames "AI hype" for not delivering value.

Scenario B: The free-for-all

The CIO approves AI agents without governance. A data breach occurs within 6 months. The average cost is $4.4M (IBM, 2025). The regulatory fine under EU AI Act Article 99 can reach 3% of global annual turnover. The CISO is replaced. The new CISO bans all AI agents (see Scenario A).

Scenario C: Governed deployment

The organization deploys AI agents with governance architecture. Every agent has an identity. Every tool call is authorized. Every action is audited. Compliance is automatic. The CISO sleeps at night. The CIO delivers ROI. The CEO reports AI productivity gains to the board. The board asks "why didn't we do this sooner?"

This Blueprint is for Scenario C

The remaining chapters provide the framework, architecture, and implementation playbook for governed AI agent deployment. Not theory — infrastructure.

The Regulatory Pressure

Regulators are no longer "watching and waiting." The EU AI Act entered into force on 1 August 2024 and will be fully applicable on 2 August 2026 — five months from now. Compliance experts estimate 32-56 weeks minimum to achieve compliance for high-risk AI systems. If you haven't started, you're already behind.

The OWASP Foundation released its Top 10 for Agentic Applications (2026) in December 2025 — the first security framework specifically designed for autonomous AI agents, reflecting input from over 100 security researchers. The #1 risk: Agent Goal Hijacking — attackers manipulating agent objectives through poisoned inputs. According to Dark Reading, 48% of cybersecurity professionals now identify agentic AI as the number-one attack vector heading into 2026 — outranking deepfakes, ransomware, and supply chain compromise.

Financial regulators (DORA, SOX) already require operational resilience for automated systems. Healthcare regulators (HIPAA) require access controls on any system that touches PHI. These aren't new requirements — they're existing requirements applied to a new category of actor.

Regulation	Agent-Relevant Requirement	Penalty for Non-Compliance
EU AI Act	Art. 14: Human oversight of high-risk AI. Art. 15: Accuracy and robustness.	Up to 3% global annual turnover
GDPR	Art. 25: Data protection by design. Art. 35: Impact assessment for automated processing.	Up to 4% global annual turnover or €20M
HIPAA	164.312: Technical safeguards for any system accessing PHI.	$100-$50,000 per violation, up to $1.5M/year
SOX	Section 404: Internal controls over financial reporting.	Criminal penalties for executives
DORA	Art. 11: Operational resilience for ICT-dependent functions.	Up to 2% global annual turnover
NIS2	Art. 21: Cybersecurity risk management for essential services.	Up to €10M or 2% global annual turnover

The question is no longer "should we govern AI agents?" It's "how quickly can we get governance infrastructure in place before the next audit?"

Chapter Summary

The agentic shift is not an incremental evolution — it's a categorical change in how AI interacts with enterprise systems. AI agents are autonomous actors that need their own identity, authorization, audit, and compliance infrastructure. Policy documents don't work because they're aspirational, static, and don't compose across organizations. The governance gap is a missing category, not a missing feature. Regulation is already here. The only viable path is governed deployment — Scenario C.

The next chapter introduces the AI Governance Maturity Model — a framework for assessing where your organization stands today and what "good" looks like at each stage of the journey.

Chapter 02The Maturity Model

Why a Maturity Model?

Governance isn't binary. You don't go from "ungoverned" to "fully compliant" in one step. Organizations need a framework to assess where they are, define where they need to be, and chart the path between — with measurable milestones at each stage.

The AI Governance Maturity Model (AGMM) defines five levels. Each level builds on the previous one. Each level delivers tangible value. The goal isn't perfection — it's continuous improvement with verifiable progress.

The Five Levels

Level 1 — Ad-hoc

Characteristics: Individual employees use AI tools. No central inventory. No policy beyond "don't share secrets." No audit trail. Management doesn't know which AI tools are in use or what data they access.

Dimension	State at Level 1
Inventory	Nobody knows what AI tools are in use
Identity	Shared API keys or personal accounts
Authorization	Full access or no access
Audit	None, or application-level logs only
Compliance	AI not mentioned in compliance program
Cost control	Unknown spend, charged to individual credit cards
Incident response	"Turn it off" (if anyone knows where "it" is)

Where most enterprises are in 2026

McKinsey's 2025 State of AI report found that while 23% of organizations are scaling agentic AI, 90% of transformative use cases remain stuck in pilot mode. Only 37% of organizations have AI governance policies (ISACA, 2025). Gartner predicts over 40% of agentic AI projects will fail by 2027 due to governance and control issues. If your organization is at Level 1, you're not behind — you're normal. But "normal" is no longer safe. Governance spending is projected to reach $492 million in 2026 (Gartner) because the market has realized the gap is existential, not optional.

Level 2 — Experimental

Characteristics: IT acknowledges AI usage. A pilot program exists. Some tools are sanctioned. An AI policy is written. But enforcement is manual and sporadic. Audit trails exist for sanctioned tools only.

Dimension	State at Level 2
Inventory	Partial — sanctioned tools known, shadow AI still exists
Identity	Service accounts for official tools, personal accounts for the rest
Authorization	Coarse-grained (admin/user), per-application
Audit	Application-level logs for sanctioned tools
Compliance	AI mentioned in policy, but no technical controls
Cost control	Departmental budgets, no per-agent attribution
Incident response	Disable the service account (1-4 hour response)

Level 2 is where most "AI-forward" enterprises land after their first governance initiative. It feels like progress — and it is — but it leaves critical gaps. Shadow AI still exists alongside the official program. Authorization is too coarse to enforce least-privilege for agents. Compliance is based on policy, not enforcement.

Level 3 — Managed

Characteristics: Central AI platform with agent inventory. Per-agent identity (service accounts with scoped permissions). Tool-level authorization policies. Centralized audit logging. Cost attribution per agent. Manual compliance checks.

Dimension	State at Level 3
Inventory	Complete — all agents registered in central platform
Identity	Per-agent service accounts with unique identifiers
Authorization	Per-tool policies (e.g., "Agent X can read CRM but not write")
Audit	Centralized, searchable audit logs for all agent actions
Compliance	Manual compliance checks; evidence collection is semi-automated
Cost control	Per-agent cost tracking and budget alerts
Incident response	Kill switch per agent, team, or tenant (seconds, not hours)

Level 3 is the minimum for production deployment in non-regulated industries. You know what agents exist, what they can do, what they did, and how much it cost. You can stop any agent instantly. This is the "table stakes" level for taking AI agents seriously.

Level 4 — Governed

Characteristics: Cryptographic agent identity. Fine-grained authorization with default deny. Cascading governance policies from organization to individual agent. Automated compliance with framework-specific controls. Mathematical verification of agent behavior. Tamper-evident audit trails.

Dimension	State at Level 4
Inventory	Complete with lifecycle management (create, deploy, pause, retire)
Identity	Cryptographic (SPIFFE IDs, Verifiable Credentials, JWT-SVIDs)
Authorization	Per-tool-call authorization (OpenFGA/Zanzibar). Default deny. 190+ tool policies.
Audit	Hash-chained, HMAC-verified, tamper-evident. SIEM-exportable. Separate audit DB.
Compliance	Automated: governance packs per framework (GDPR, HIPAA, SOX, EU AI Act, DORA). Evidence auto-collected.
Cost control	Per-call metering, per-agent budgets, spending policies with approval gates
Incident response	Kill switch hierarchy (agent → team → tenant). Cascading. Auto-notification.
Verification	Multi-LLM cross-checking (PVP). Policy-as-Code with cryptographic execution certificates.

Level 4 is the target for regulated industries

If your organization is subject to GDPR, HIPAA, SOX, DORA, EU AI Act, or NIS2, Level 4 is not aspirational — it's required. The specific governance controls map directly to regulatory obligations. Chapter 4 (Regulatory Landscape) provides the detailed mapping.

Level 5 — Industrial

Characteristics: Cross-organizational agent federation. Trust verification across company boundaries. Per-call skill marketplace. Agent reputation scores. Automated compliance certification. The "Internet of Agents" operating at industrial scale.

Dimension	State at Level 5
Inventory	Federated directory across organizations (AGNTCY, OASF)
Identity	Cross-org verification via SPIFFE trust bundles + OAuth 2.0 Token Exchange
Authorization	Cross-org TBAC (Tool-Based Access Control) with delegation chains
Audit	Cross-org audit correlation. Federated evidence packages.
Compliance	Governance certifications (e.g., "GDPR Verified Agent"). Cross-org compliance attestation.
Federation	SLIM protocol for cross-org messaging. MLS encryption (RFC 9420). Circuit-breaker health monitoring.
Economics	Per-call skill marketplace. Agent trust scores. Reputation-weighted routing.

Level 5 is emerging. Standards are being defined (AGNTCY/Cisco, Linux Foundation AI Card). Early implementations exist. Most organizations should target Level 4 first and plan for Level 5 as the ecosystem matures.

The Maturity Assessment Matrix

Use this matrix to assess your organization's current state. For each dimension, identify which level best describes your current reality — not your aspirations or your policy documents, but what actually happens day-to-day.

Dimension	L1	L2	L3	L4	L5
Agent Inventory	Unknown	Partial	Complete	+ Lifecycle	+ Federated
Identity	None	Shared keys	Per-agent ID	Cryptographic	Cross-org
Authorization	None	Admin/User	Per-tool	Per-call + TBAC	Cross-org delegation
Audit	None	App-level	Centralized	Hash-chained	Federated
Compliance	None	Policy doc	Manual checks	Automated packs	Cross-org certs
Cost Control	Unknown	Departmental	Per-agent	Per-call + gates	Marketplace
Incident Response	Find the terminal	Disable account	Kill switch	Kill hierarchy	Cross-org halt

Scoring: Count the number of dimensions at each level. Your overall maturity is the lowest level where you have all dimensions covered. If your identity is at L3 but your audit is at L1, your effective maturity is L1. The chain is only as strong as its weakest link.

Take the full interactive assessment

This table is a simplified version. The MeetLoyd AI Governance Readiness Assessment provides a detailed, weighted evaluation across 25 criteria with a personalized report and recommendations.

The Path Forward

The maturity model isn't a scorecard — it's a roadmap. Each level is a stable plateau where the organization delivers value while building toward the next level. You don't need to reach Level 4 before deploying agents. You need to know you're at Level 1, have a plan to reach Level 3 in weeks (not years), and a path to Level 4 when regulation demands it.

Common transition patterns

L1 → L3 in 2-4 weeks (platform-assisted)

Deploy a managed agent platform with built-in identity, authorization, and audit. Skip Level 2 entirely — there's no value in partial governance. A good platform gives you Level 3 on day one.

L3 → L4 in 4-8 weeks (governance activation)

Enable compliance packs for your regulatory frameworks. Upgrade identity to cryptographic. Activate per-call authorization with default deny. Turn on hash-chained audit. The infrastructure was there from L3 — you're activating controls, not building them.

L4 → L5 when the ecosystem is ready

Cross-org federation requires the other organization to be at L4 too. Standards (SLIM, AGNTCY, OASF) are maturing. Early adopters are deploying federation bridges. Plan for it, but don't block on it.

Chapter Summary

The AI Governance Maturity Model provides five levels of increasing capability: Ad-hoc, Experimental, Managed, Governed, and Industrial. Most enterprises are at Level 1-2. Regulated industries need Level 4. The path from Level 1 to Level 3 can take weeks with the right platform. The path from Level 3 to Level 4 is primarily about activating governance controls that already exist in the infrastructure.

The next chapter deep-dives into the Five Pillars of AI Governance — Identity, Authorization, Verification, Audit, and Federation — the architectural foundations that make Level 4+ possible.

Chapter 03The Five Pillars

Chapter 2 introduced the Maturity Model. Level 4 (Governed) requires five architectural capabilities that most enterprise software stacks don't provide for AI agents. This chapter examines each pillar in depth: what it is, why it matters, what "good" looks like, what "bad" looks like, and how to implement it.

These pillars are not independent. Identity feeds into Authorization. Authorization decisions are captured by Audit. Verification samples from Audit data. Federation extends all four across organizational boundaries. The pillars compose — and they must all be present for governance to work.

Pillar 1: Identity

The principle: Every AI agent must have a unique, cryptographically verifiable identity — not a shared API key, not a service account, not "the company's OpenAI key."

Why Identity Matters

Without identity, you cannot answer: "Which agent did this?" When an unauthorized data access appears in your SIEM, can you trace it to a specific agent, deployed by a specific team, in a specific workspace? Or does the log show "api-key-prod-2024" — a credential shared by 47 agents?

Agent identity is the foundation of everything else. Authorization checks "can agent X do Y" — but if you can't identify agent X, authorization is meaningless. Audit logs record "agent X did Y" — but if X is a shared key, the log is useless for incident response.

What "Good" Looks Like

Capability	Standard	What It Does
Unique ID	SPIFFE	Every agent gets a URI identity: `spiffe://domain/tenant/{id}/agent/{id}`. Globally unique. Revocable. Your IAM can reference it.
Signed Credentials	W3C Verifiable Credentials	Agent carries a cryptographically signed "badge" listing its tools, permissions, and governance status. Third parties can verify without calling back to the issuer.
Short-lived Tokens	JWT-SVID	Agent authenticates with a JWT signed by the platform's CA. 1-hour TTL by default, 24-hour max. Stateless verification — no DB lookup needed.
Cross-org Trust	SPIFFE Trust Bundles	When your agent talks to a partner's agent, trust is verified via exchanged SPIFFE trust bundles — not a phone call to their IT department.
Delegated Access	OAuth 2.0 Token Exchange (RFC 8693)	Agent A can delegate limited capabilities to Agent B via token exchange. Scoped, time-limited, auditable. No shared secrets.

What "Bad" Looks Like

Shared API keys

50 agents share one OpenAI key. A breach exposes all 50. You can't revoke one without disrupting the other 49. Audit logs show the key, not the agent. Incident response means rotating the key for everyone.

Service accounts without scoping

Each agent has a service account, but all service accounts have the same permissions. One compromised agent escalates to full access. "Least privilege" is aspirational, not enforced.

Implementation Checklist

Every agent has a unique, persistent identifier (not generated per-session)
Identity is cryptographically signed (not self-asserted)
Credentials are short-lived with automatic renewal
Identity is visible in your IAM / security dashboard
Revocation is instant (seconds, not days)
Cross-organization identity verification uses trust bundles, not manual configuration

Pillar 2: Authorization

The principle: Every tool call an agent makes must pass through a policy check. Default deny. Fail closed.

Why Authorization Matters

An AI agent with access to your CRM, email, and code repositories is more dangerous than any individual employee — because it can act at machine speed, 24/7, without fatigue or second-guessing. A human might pause before emailing 10,000 customers. An agent will not — unless a policy check stops it.

Authorization for agents is fundamentally different from authorization for humans. Humans have 10-20 applications they use daily. An agent can invoke 190+ tools in a single session. The authorization model must be per-tool-call, not per-application.

The Three Enforcement Modes

Production authorization systems need a progressive rollout model. You don't flip from "no enforcement" to "hard blocks" overnight — that breaks running agents and erodes trust.

Mode	What Happens on Deny	When to Use
Audit	Denial logged, request allowed	Initial rollout. Discover what your agents actually do before blocking anything.
Warn	Denial logged + warning header, request allowed	Progressive tightening. Teams see warnings and can fix permissions before enforcement.
Enforce	Denial logged, request blocked	Production governance. Agent cannot proceed without proper authorization.

The progressive activation pattern

Start in warn mode. Let it run for 2 weeks. Review the denial logs. Fix legitimate access gaps (agents that need permissions they don't have). Only then move to enforce. This is how you get governance adoption without breaking production — which is the number one reason governance initiatives fail.

What "Good" Looks Like

Capability	Standard	What It Does
Per-tool policies	OpenFGA (Zanzibar)	Every tool has a policy. "Agent X can `read` CRM contacts but not `write`." Granularity at the tool + resource + action level.
Default deny	Zero Trust	If no policy grants access, the request is denied. No implicit permissions. No "admin" backdoor.
Cascading policies	Governance hierarchy	Organization → Workspace → Team → Agent. Policies cascade and the most restrictive level wins.
Delegation control	TBAC	Tool-Based Access Control for cross-agent delegation. Agent A can grant Agent B limited tool access via scoped tokens. The delegation chain is auditable.
Kill switch	Emergency halt	Instantly revoke all permissions for an agent, team, or entire tenant. Cascade-enabled. Notification channels. Requires admin approval to restart.

What "Bad" Looks Like

Binary access (admin or nothing)

Agent can either access "everything" or "nothing." No granularity. A sales agent that needs CRM read access also gets email send, file delete, and database write. One tool's permissions bleed into every other tool.

Static configuration files

Permissions defined in YAML at deploy time, never updated. Agent's actual needs drift from its configuration. Nobody reviews. "Least privilege" decays into "maximum privilege we configured 6 months ago."

Implementation Checklist

Every tool invocation passes through a policy check (not just application-level auth)
Default deny — no implicit permissions
Enforcement modes support progressive rollout (audit → warn → enforce)
Policies cascade from organization to individual agent
Kill switch available at agent, team, and organization level
Delegation chains are scoped, time-limited, and auditable
Policy changes take effect immediately (not on next deploy)

Pillar 3: Verification

The principle: Don't trust that agents followed policy — verify it. Mathematically, not anecdotally.

Why Verification Matters

Authorization checks individual tool calls. But compliance often requires reasoning about sequences of actions. Did the agent access PII in step 1 and then send data to a US endpoint in step 3? Did total spending across all steps exceed the budget? Did the same agent both approve and execute a payment (separation of duties violation)?

Per-call authorization can't catch these — it only sees one call at a time. Verification adds a cross-step analysis layer that examines agent behavior over a session or task.

Two Modes of Verification

Mode A: Policy Analysis (deploy-time)

Before agents run, analyze the policy set itself for problems:

Contradictions: Policy A allows what Policy B denies. Which wins?
Privilege escalation: Agent A can grant permissions to Agent B, who can grant to Agent C, creating an unintended chain.
Fail-open gaps: Resources with no deny rule — if the default isn't properly configured, they're unprotected.
Redundant rules: Rules subsumed by other rules, adding complexity without security value.

This is what AWS Cedar's automated reasoning does for IAM policies. For AI agents, the stakes are higher because the action space is larger and the consequences are faster.

Mode B: Execution Verification (runtime or post-hoc)

After (or during) agent execution, verify that the sequence of actions complied with policy:

Budget accumulation: Total cost across all steps stayed within budget
Data flow tracking: PII accessed in step N was not transmitted in step M
Separation of duties: No agent both proposed and approved a sensitive action
Cross-step constraints: Any rule that requires reasoning about the full session

Verification is complementary to authorization

Authorization is a gate — it blocks individual unauthorized actions in real-time. Verification is a proof — it demonstrates that the full sequence of actions was compliant. You need both. Authorization without verification misses cross-step violations. Verification without authorization catches problems too late.

Execution Certificates

When verification passes, the system issues a cryptographic execution certificate — a signed attestation that the agent's session was verified against a specific set of policies. This certificate can be:

Attached to audit records as compliance evidence
Presented to auditors during SOC 2 or regulatory reviews
Verified independently by any party with the platform's public key
Used as input for agent reputation systems

Implementation Checklist

Policy-as-code with a formal language (not natural language descriptions)
Deploy-time analysis catches contradictions and gaps before agents run
Runtime or post-hoc verification checks cross-step constraints
Verification results are cryptographically signed (execution certificates)
Certificates are independently verifiable (no callback to issuer required)
Verification failures trigger alerts and can block continued execution

Pillar 4: Audit

The principle: Every action, every decision, every tool call — logged, integrity-protected, and queryable. Not for compliance theater — for incident response.

Why Audit Matters

When (not if) something goes wrong with an AI agent, the first question is "what happened?" If your audit trail is console.log statements scattered across 50 microservices, the answer is "we don't know" — and the average time-to-recover doubles.

SOX requires immutable financial audit trails. HIPAA requires access logs for PHI. GDPR requires processing records. These aren't new requirements — but AI agents generate orders of magnitude more auditable events than human users. An agent that runs for 30 minutes might invoke 50 tools, access 200 records, and make 15 decisions. Your audit system must handle this volume without becoming the bottleneck.

What "Good" Looks Like

Capability	Why It Matters
Tamper-evident logging	Hash-chained entries with HMAC integrity. If someone modifies a log entry, the chain breaks. Auditors can verify integrity independently.
Full action capture	Not just "agent ran" but "agent called `crm_search_contacts` with query `{name: 'Acme'}`, returned 12 results, took 340ms, cost $0.002." Every tool call, every parameter, every result.
LLM pipeline audit	Log the full security pipeline: prompt injection detection (pass/fail), PII redaction (what was redacted), content moderation (score), output validation (pass/fail). Not just the final response.
Separate audit storage	Audit logs shouldn't compete with application data for database resources. Dedicated audit database with independent retention, backup, and access controls.
SIEM integration	Real-time streaming to Splunk, DataDog, Sentinel, etc. Your SOC shouldn't need to learn a new tool — agent events should appear alongside your existing security events.
Retention guarantees	7+ years for SOX. Configurable per regulation. Visible retention vs. stored retention strategy (upgrade value).

The LLM Gateway Audit Pipeline

Every LLM call — whether from an AI agent, a coding session, or a direct API request — should pass through a security gateway that audits each stage:

Budget check: Is the tenant within spending limits?
Prompt injection scan: Does the input contain adversarial patterns?
PII redaction: Detect and mask personal data before it reaches the LLM
Content moderation: Check input against safety policies
Audit entry: Record the sanitized input, model, and context
[LLM call]
Output validation: Check response for policy violations
Content moderation (output): Verify response safety
PII restoration: Re-insert redacted PII into response for the user
Token billing: Record usage and cost
Audit entry: Record the full pipeline result with cumulative hash

The pipeline hash — a SHA-256 computed cumulatively across all stages — provides an integrity proof that no stage was bypassed. If someone skips prompt injection detection to save latency, the hash chain breaks.

Implementation Checklist

Every agent action logged with actor, target, action, result, cost, and timestamp
Audit entries are integrity-protected (HMAC or hash chain)
LLM gateway pipeline logs each security stage, not just the final result
Audit storage is separable from application storage
SIEM export in standard formats (JSON, CEF)
Retention is configurable per regulatory framework
Inline audit trail visible in the dashboard (not buried in log files)

Pillar 5: Federation

The principle: When your agent talks to another organization's agent, trust must be verified cryptographically — not assumed.

Why Federation Matters

The value of AI agents multiplies when they can collaborate across organizational boundaries. Your sales agent negotiating with a supplier's procurement agent. Your compliance agent exchanging audit evidence with an auditor's agent. Your engineering agent requesting a code review from a partner's DevOps agent.

But cross-org collaboration introduces risks that don't exist within a single organization: identity fraud (is that really Acme Corp's agent?), data sovereignty violations (did our EU data just get processed in the US?), privilege escalation (did their agent gain access to our internal tools through the collaboration?), and accountability gaps (who is liable when a cross-org agent interaction goes wrong?).

The Trust Problem

Today, cross-org integrations are built on shared API keys, IP allowlists, and contractual trust ("we trust Acme because we signed an NDA"). This doesn't scale to agent-to-agent communication where interactions happen at machine speed without human review.

Federation requires automated trust verification — the equivalent of border control for the Internet of Agents. Every cross-org interaction should verify:

Identity: The remote agent is who it claims to be (SPIFFE trust bundle verification)
Authorization: The remote agent is allowed to make this request (delegation chain validation)
Data residency: Data shared in this interaction stays within agreed-upon boundaries
Audit: Both organizations have a complete record of what was exchanged

The Emerging Standards (as of March 2026)

Standard	Governance	What It Does	Adoption
MCP	Agentic AI Foundation (Linux Foundation). Donated by Anthropic Dec 2025. Co-founded with Block and OpenAI.	Tool integration — how agents use tools	10,000+ public servers. Fortune 500 deployments.
A2A	Linux Foundation A2A Project. Initially launched by Google Apr 2025.	Agent-to-Agent — how agents collaborate. Agent Cards, task delegation, SSE streaming.	150+ organizations including Microsoft, SAP, Adobe. v0.3 stable.
SLIM	AGNTCY / Linux Foundation. Formative members: Cisco, Dell, Google Cloud, Oracle, Red Hat.	Cross-org messaging with identity verification. gRPC transport. Post-quantum crypto roadmap.	Production at Swisscom, SRE automation. 75+ companies. IETF draft submitted.
OASF	AGNTCY	Open Agent Schema Framework — agent description schema, skill taxonomy, decentralized directory.	Part of AGNTCY stack. Schema finalized.
AI Card	Linux Foundation	Unified metadata format across protocols.	Draft specification.

The standards landscape consolidated rapidly in late 2025 and early 2026. Both MCP and A2A moved to Linux Foundation governance. AGNTCY joined with Cisco, Dell, Google Cloud, Oracle, and Red Hat as formative members. The fragmentation risk that worried enterprises a year ago is resolving — the remaining question isn't which standard but how fast your organization adopts them.

Federation Security Architecture

Trust establishment

Before two organizations' agents can communicate, a trust relationship is established. Each organization exchanges its SPIFFE trust bundle — the public keys needed to verify the other's agent identities. Trust is explicit, revocable, and audited.

Message routing

Cross-org messages flow through a federation bridge that verifies identity, checks trust status, encrypts the payload (AES-256-GCM), and logs the interaction. If trust is revoked mid-session, the bridge terminates the connection immediately.

Circuit breaker

If a remote organization's endpoint becomes unreliable (3 failures within 5 minutes), the circuit breaker opens and blocks further requests. This prevents cascading failures and gives the remote organization time to recover. The circuit resets automatically after the cooldown period.

Implementation Checklist

Trust relationships are explicit and revocable (not implicit from network access)
Remote agent identity verified via trust bundle (not API key)
All cross-org messages encrypted in transit
Federation bridge with circuit breaker prevents cascading failures
Both organizations maintain complete audit records of cross-org interactions
Data residency constraints enforced at the federation boundary
Built on open standards (SLIM, A2A) — no proprietary lock-in

How the Pillars Compose

The five pillars aren't independent layers — they form a reinforcing system:

Interaction	What Happens
Identity → Authorization	Authorization checks reference the agent's cryptographic identity, not a shared key. Policies are bound to specific agents.
Authorization → Audit	Every authorization decision (grant or deny) is recorded in the audit trail with the full policy context.
Audit → Verification	Verification samples from audit data to check cross-step compliance. The audit trail IS the verification input.
Verification → Identity	Execution certificates are signed with the platform's identity key. Verification status becomes part of the agent's credential.
Federation → All	Cross-org interactions extend identity verification, delegation authorization, federated audit correlation, and cross-org compliance attestation.

The composability test

If you remove any single pillar, the others degrade. Authorization without identity can't attribute decisions. Audit without authorization has no policy context. Verification without audit has no data to verify. Federation without identity can't verify trust. This is why point solutions (just audit, just authorization) don't achieve governance — you need the integrated system.

Chapter Summary

The Five Pillars — Identity, Authorization, Verification, Audit, and Federation — are the architectural foundations of AI governance at Level 4+. Each pillar addresses a specific governance dimension that enterprise IAM, SIEM, and compliance tools don't cover for AI agents. The pillars compose into a reinforcing system where each one strengthens the others.

The next chapter maps these pillars to specific regulatory requirements — EU AI Act, GDPR, HIPAA, SOX, DORA, and NIS2 — with control mapping tables your CISO can hand directly to an auditor.

Chapter 04The Regulatory Landscape

AI agents aren't exempt from existing regulation. They're a new class of actor that triggers existing requirements — often requirements that were designed for humans or traditional software. This chapter maps each major regulatory framework to the Five Pillars, with specific articles, agent-relevant obligations, and the governance controls that satisfy them.

How to use this chapter

If you're a CISO: Print the control mapping table for your relevant frameworks. Hand it to your auditor during the next assessment. It maps each regulatory obligation to a specific, implementable governance control.

If you're a CIO: Use the summary table at the end to scope your governance program. Not every framework applies to every organization — but the ones that do apply are non-negotiable.

EU AI Act

Regulation (EU) 2024/1689

Full enforcement: 2 August 2026 (5 months away)

The EU AI Act is the world's first comprehensive AI regulation. It entered into force on 1 August 2024. Prohibited practices and AI literacy obligations applied from 2 February 2025. High-risk AI system rules become fully applicable on 2 August 2026. Compliance experts estimate 32-56 weeks to achieve compliance — if you haven't started, you are already behind the curve.

Most enterprise AI agent deployments in regulated industries trigger high-risk classification under Article 6 and Annex III — particularly agents involved in employment decisions, credit scoring, critical infrastructure, or law enforcement.

Article	Requirement	Agent-Specific Obligation	Pillar	Control	Level
Art. 9	Risk management system	Continuous risk assessment for AI agent operations. Identify and mitigate risks throughout agent lifecycle.	Verification	Policy-as-Code analysis detects contradictions, privilege escalation paths, fail-open gaps at deploy time. Runtime verification checks cross-step compliance.	Automated
Art. 10	Data governance	Training data quality. Agents must not perpetuate bias or use inappropriate data.	Audit	LLM Gateway pipeline: PII redaction before model, content moderation on output. Data flow logged in audit trail.	Automated
Art. 12	Record-keeping	Automatic logging of agent actions with sufficient detail for post-incident analysis.	Audit	Hash-chained audit logs. Every tool call logged with actor, target, action, result, cost. HMAC integrity. 7+ year retention. SIEM export.	Automated
Art. 13	Transparency	Users must know they're interacting with AI. Agent capabilities and limitations must be documented.	Identity	Agent identity visible in all interactions. Verifiable Credentials carry capability declarations. System prompts visible (no black box).	Semi-auto
Art. 14	Human oversight	Humans must be able to monitor, interpret, and override AI agent actions. Prevent over-reliance.	Authorization	Kill switch hierarchy (agent → team → tenant). Approval workflows for sensitive operations. Human-in-the-loop enforcement. Progressive autonomy levels (reactive → proactive → autonomous).	Automated
Art. 15	Accuracy, robustness, cybersecurity	AI systems must be resilient to adversarial attacks. Output must be accurate and reproducible.	Verification	Multi-LLM cross-checking (PVP). Prompt injection detection in Gateway. Output validation. Content moderation.	Automated
Art. 99	Penalties	Prohibited practices: up to €40M or 7% of global turnover. High-risk non-compliance: up to €20M or 4%. Misinformation to authorities: up to €10M or 1%.

GDPR

Regulation (EU) 2016/679

In force since 25 May 2018. Applies to all AI systems processing EU personal data.

GDPR doesn't mention AI agents specifically — but every agent that processes personal data of EU residents is subject to it. The key challenge: agents process data at machine speed across tools, making traditional consent and purpose limitation controls insufficient without automation.

Article	Requirement	Agent-Specific Obligation	Pillar	Control	Level
Art. 5(1)(b)	Purpose limitation	Agent must only access data for the purpose it was collected. Cross-purpose usage by agents must be prevented.	Authorization	Per-tool authorization policies restrict which data each agent can access. Scope bound to workspace/team. Default deny.	Automated
Art. 5(1)(c)	Data minimization	Agent should access only the minimum data necessary for its task.	Authorization	Fine-grained tool policies. Agent authorized for "CRM read contacts" but not "CRM read all." Resource-level scoping.	Automated
Art. 5(1)(f)	Integrity and confidentiality	Personal data processed by agents must be protected against unauthorized access and accidental loss.	Audit + Identity	Envelope encryption (AES-256-GCM). BYOK mandatory. Per-agent cryptographic identity. TLS 1.3 in transit. BYOS for data residency.	Automated
Art. 22	Automated decision-making	Data subjects have the right not to be subject to automated decisions with legal effects. Agents making such decisions need human review.	Authorization	Approval workflows require human sign-off for high-stakes agent actions. Four-eyes principle enforcement via governance packs.	Semi-auto
Art. 25	Data protection by design	Agent platform must implement privacy controls as architectural defaults, not afterthoughts.	All	PII redaction in LLM Gateway (before data reaches LLM). Encryption auto-triggered by GDPR governance pack. DLP scanning on tool inputs/outputs.	Automated
Art. 30	Records of processing	Maintain records of all processing activities by AI agents.	Audit	Comprehensive audit trail. Every tool call = a processing activity record. Searchable, exportable, SIEM-integrated.	Automated
Art. 32	Security of processing	Appropriate technical measures: encryption, access control, regular testing.	Identity + Authorization	Cryptographic agent identity. OpenFGA authorization. Envelope encryption. Key rotation. Access reviews.	Automated
Art. 33	Breach notification (72h)	Detect and report breaches involving agent-processed data within 72 hours.	Audit	SIEM real-time export. Kill switch for immediate containment. Incident management with SLAs. Audit hash chain detects tampering.	Semi-auto
Art. 35	Data Protection Impact Assessment	DPIA required for automated processing at scale.	Verification	Compliance reports auto-generated. Evidence auto-collected. GDPR governance pack produces framework-specific assessment data.	Semi-auto

HIPAA

45 CFR Parts 160, 164

Proposed Security Rule amendments (Jan 2025) make previously optional safeguards mandatory by 2026.

Any AI agent that accesses, processes, or transmits Protected Health Information (PHI) is subject to HIPAA. The proposed 2025 Security Rule amendments are strengthening requirements around encryption, audit logging, and access controls — with specific attention to AI systems. By 2026, healthcare organizations must maintain a detailed inventory of AI tools and comprehensive audit logs for any AI interactions involving PHI.

Section	Safeguard	Agent Obligation	Pillar	Control	Level
§164.308(a)(1)	Security management process	Risk analysis and management for all AI agent systems accessing PHI.	Verification	Policy analysis at deploy time. Continuous compliance monitoring via governance packs. Risk scoring.	Semi-auto
§164.308(a)(3)	Workforce security	Ensure only authorized agents access PHI. Terminate access when no longer needed.	Authorization	Per-agent authorization. Lifecycle management (create, deploy, pause, retire). Access reviews. Revocation is instant.	Automated
§164.308(a)(4)	Information access management	Policies for granting agent access to PHI. Minimum necessary standard.	Authorization	Fine-grained tool policies. "Agent X can read patient records but not write." Resource-level scoping. Default deny.	Automated
§164.312(a)(1)	Access control (Technical)	Unique agent identification. Emergency access procedures. Automatic session timeout.	Identity	Per-agent SPIFFE IDs. JWT-SVIDs with 1-hour TTL. Session management. Kill switch for emergency access revocation.	Automated
§164.312(b)	Audit controls	Record and examine all agent activity involving PHI.	Audit	Every tool call logged. LLM Gateway pipeline audit. Hash-chained integrity. Separate audit database support.	Automated
§164.312(c)(1)	Integrity	Protect PHI from improper alteration or destruction by agents.	Audit + Authorization	HMAC integrity on audit logs. Write authorization required (read-only by default). Content validation in Gateway.	Automated
§164.312(d)	Person or entity authentication	Verify that an agent is who it claims to be before granting PHI access.	Identity	Cryptographic identity verification. SPIFFE trust bundles for cross-org. OAuth 2.0 Token Exchange for delegation.	Automated
§164.312(e)(1)	Transmission security	Encrypt PHI in transit when processed by agents.	Federation	TLS 1.3 for all API traffic. AES-256-GCM for cross-org federation. Envelope encryption for data at rest.	Automated

SOX

Sarbanes-Oxley Act, Section 404

Applies to all publicly traded companies. Continuous compliance required.

SOX Section 404 requires management to assess the effectiveness of internal controls over financial reporting. When AI agents are involved in financial processes — invoice processing, reconciliation, expense approval, financial analysis — they become part of the internal control environment. Auditors need to verify that the agent's control trail is as auditable as a human's.

Control	Requirement	Agent Obligation	Pillar	Control	Level
COSO: Control Environment	Tone at the top; ethical values	Agent behavior governed by explicit policies, not implicit LLM "values."	Authorization	Cascading governance policies (Platform → Tenant → App → Team → Agent). Policies are code, not documents.	Automated
COSO: Risk Assessment	Identify and manage risks to financial reporting	Risk assessment for agent actions that affect financial data.	Verification	Policy analysis detects risks at deploy time. Budget constraints enforced per-agent. Cost tracking per-call.	Automated
Separation of Duties	No single person/system controls all aspects of a financial transaction	Agent that proposes a payment must not be the same agent that approves it.	Verification + Authorization	Cross-step verification detects SoD violations. Four-eyes governance module enforces dual approval. Execution certificates prove compliance.	Automated
Audit Trail	Complete, immutable record of financial transactions	Every agent action on financial data must be logged with full context.	Audit	Hash-chained, HMAC-verified audit logs. Pipeline hash proves no stage was bypassed. 7+ year retention. Separate audit DB.	Automated
Access Controls	Restrict access to financial systems	Agents accessing financial tools must have explicit, scoped authorization.	Authorization	Per-tool policies for financial tools (e.g., "invoice_approve" requires approval workflow). SOX governance pack auto-applies rules.	Automated

DORA

Regulation (EU) 2022/2554

Applied from 17 January 2025. Affects all EU financial entities.

DORA (Digital Operational Resilience Act) requires financial entities to ensure their ICT systems — including AI agents — are resilient, recoverable, and continuously monitored. AI agents that participate in financial operations are ICT services subject to DORA's full scope.

Article	Requirement	Agent Obligation	Pillar	Control	Level
Art. 5-6	ICT risk management framework	Identify, classify, and manage risks from AI agent operations.	Verification	Policy analysis at deploy time. Governance cascade ensures consistent risk management from org to agent level.	Semi-auto
Art. 9	Protection and prevention	Protect ICT systems from AI agent misuse or compromise.	Authorization	Default-deny authorization. Prompt injection detection. Content moderation. PII redaction. Rate limiting.	Automated
Art. 10	Detection	Detect anomalous agent behavior and security incidents.	Audit	Real-time SIEM export. Anomaly detection via audit log analysis. Kill switch triggers on threshold breaches.	Automated
Art. 11	Response and recovery	Rapid containment and recovery from AI agent incidents.	Authorization	Kill switch hierarchy (seconds, not hours). Agent pause/suspend/emergency_stop. Incident management with SLAs (15min P1).	Automated
Art. 28-30	Third-party ICT risk	Manage risks from LLM providers, tool services, and federated agents.	Federation	BYOK mandatory (your keys, not vendor's). Circuit breaker on external services. Trust relationships are explicit and revocable. Health monitoring.	Semi-auto

NIS2

Directive (EU) 2022/2555

Member state transposition deadline: 17 October 2024. Enforcement ongoing.

NIS2 applies to essential and important entities across 18 sectors. AI agents operating within these entities' infrastructure are subject to NIS2's cybersecurity risk management requirements. The directive emphasizes supply chain security — relevant when agents use external LLM APIs or federate with other organizations' agents.

Article	Requirement	Agent Obligation	Pillar	Control	Level
Art. 21(2)(a)	Risk analysis and security policies	Security policies must cover AI agent operations.	All	Governance packs codify security policies per framework. Cascading policies enforce at every level.	Automated
Art. 21(2)(b)	Incident handling	Detect, respond to, and recover from AI agent security incidents.	Audit + Authorization	Kill switch for containment. SIEM export for detection. Incident management workflows. Hash-chained evidence.	Automated
Art. 21(2)(d)	Supply chain security	Manage risks from LLM providers, tool integrations, and federated agents.	Federation	BYOK (own keys). BYOS (own storage). Trust bundle verification for federation. Circuit breaker health monitoring. Vendor independence (6 LLM providers).	Semi-auto
Art. 21(2)(i)	Human resources security	Access control policies for AI agents alongside human workforce.	Identity + Authorization	Agents as first-class identities in IAM. SPIFFE IDs. Access reviews include agents. SCIM provisioning for user lifecycle.	Automated
Art. 23	Reporting obligations	Report significant incidents within 24h (early warning) / 72h (full notification).	Audit	Real-time SIEM export enables immediate detection. Compliance reports auto-generated. Incident timeline reconstructable from audit trail.	Semi-auto

Cross-Framework Summary

The table below maps each Pillar to its regulatory justification across all six frameworks. Use this to prioritize: if a Pillar is required by every framework your organization is subject to, it's non-negotiable.

Pillar	EU AI Act	GDPR	HIPAA	SOX	DORA	NIS2
Identity	Art. 13 (transparency)	Art. 32 (security)	§164.312(a)(1), (d)	Access controls	Art. 9	Art. 21(2)(i)
Authorization	Art. 14 (oversight)	Art. 5, 22, 25	§164.308(a)(3-4)	SoD, access controls	Art. 9, 11	Art. 21(2)(a)
Verification	Art. 9, 15	Art. 35 (DPIA)	§164.308(a)(1)	Risk assessment, SoD	Art. 5-6	Art. 21(2)(a)
Audit	Art. 10, 12	Art. 30, 33	§164.312(b), (c)(1)	Audit trail	Art. 10	Art. 21(2)(b), 23
Federation	Art. 15 (cybersecurity)	Art. 5(1)(f)	§164.312(e)(1)	—	Art. 28-30	Art. 21(2)(d)

Key Governance Modules by Framework

The governance platform implements 20 modular controls. The diagram below shows which modules satisfy which regulatory framework — allowing you to activate only the modules your regulations require.

Chapter Summary

Every major regulatory framework — whether designed for AI (EU AI Act), for data protection (GDPR), for healthcare (HIPAA), for financial controls (SOX), for operational resilience (DORA), or for cybersecurity (NIS2) — requires the same architectural capabilities from AI agent deployments: identity, authorization, verification, audit, and federation.

The Five Pillars aren't an abstract framework — they're the minimum viable governance architecture to satisfy regulatory requirements across regulated industries. The control mapping tables in this chapter provide the specific, article-by-article evidence your auditor needs.

The next chapter presents the Reference Architecture — how these controls are implemented as a technical system, from the LLM Gateway pipeline to cascading governance to envelope encryption.

Chapter 05The Reference Architecture

The previous chapters defined what governed AI agent deployment requires. This chapter defines how — the technical architecture that makes Level 4 governance possible without sacrificing the speed and flexibility that makes AI agents valuable in the first place.

The architecture is organized into four layers, each addressing a distinct concern:

Layer	Concern	Components
Execution Layer	How agents run and interact with tools	Agent executor, MCP tool registry, sandbox providers
Security Layer	How every LLM call is secured	LLM Gateway pipeline (10-stage)
Governance Layer	How policy flows through the hierarchy	Cascading policy resolver, governance packs, autonomy levels
Data Layer	How data is stored, encrypted, and located	Envelope encryption, BYOS, BYOK, memory pointers

The LLM Gateway Pipeline

Every LLM call — whether from an AI agent executing a task, a coding session generating code, or a direct user request — passes through a 10-stage security pipeline. The gateway is not optional or configurable — it is the only path to the LLM.

Stage details

#	Stage	What It Does	On Failure
1	Budget Check	Verifies tenant hasn't exceeded spending limits. Per-call, per-agent, and per-tenant budgets.	Request blocked with 429. Agent receives budget error.
2	Prompt Injection Detection	5 structural patterns: fake system headers, document boundary markers, HR-separator overrides, XML section tags, JSON role injection. Plus semantic analysis.	Request blocked. Logged as security event. Agent receives sanitized error.
3	PII Redaction	Detects and masks personal data (names, emails, SSNs, credit cards, phone numbers) before content reaches the LLM. 16+ pattern categories.	PII replaced with tokens. Original values stored for restoration.
4	Content Moderation (input)	Checks input against safety policies. Configurable thresholds per governance pack (HIPAA = stricter).	Request blocked or flagged depending on enforcement mode.
5	Audit (pre-call)	Records sanitized input, model, context, and pipeline state. Pipeline hash begins.	Always succeeds (non-blocking).
6	LLM Call	Routed to appropriate provider (6 supported: Anthropic, OpenAI, Google, Mistral, Groq, vLLM self-hosted). BYOK keys used.	Provider circuit breaker. Fallback to alternate model if configured.
7	Output Validation	Checks response for policy violations, credential leaks, and formatting requirements.	Response sanitized. Violations logged.
8	Content Moderation (output)	Verifies response safety. Catches harmful content the LLM might generate.	Response blocked or redacted.
9	PII Restore	Re-inserts original PII values into response for the authorized user. LLM never saw the real PII.	Tokens left unreplaced (safe degradation).
10	Token Billing + Audit (post-call)	Records usage, cost, latency. Completes pipeline hash. HMAC-signed audit entry.	Always succeeds (non-blocking).

The pipeline hash is the integrity proof

A SHA-256 hash is computed cumulatively across all 10 stages. The final hash is stored in the audit log alongside the HMAC signature. If any stage is skipped (e.g., someone disables prompt injection detection for "performance"), the hash chain breaks — and the discrepancy is detectable during audit review. You can prove that every security stage ran.

Cascading Governance

Policy doesn't live in one place. It cascades through a 4-level hierarchy, with the most restrictive level winning at any point of conflict:

What cascades

Policy Dimension	What It Controls	Example
Autonomy level	How independently agents can act	`proactive` (propose + human approves) vs `autonomous` (act without approval) vs `reactive` (only when asked)
Artifact governance	Who can create/modify schedules, workflows, triggers, prompts	`autonomous` (agents create freely) vs `proactive` (propose, human approves) vs `locked` (manifest-only)
Discovery capabilities	What agents can discover on their own	5 toggles: tool discovery, data discovery, memory creation, skill suggestion, auto tool sync
Enforcement mode	How authorization denials are handled	`audit` → `warn` → `enforce`
Governance packs	Which compliance modules are active	GDPR pack enables PII redaction, consent tracking, data export. HIPAA pack enables PHI detection, encryption, access controls.

The cascade is resolved at request time with a 60-second TTL cache. Policy changes propagate within one minute. No restart required.

Envelope Encryption

Content encryption at rest uses a 3-level envelope encryption architecture. This is the same pattern used by AWS, GCP, and Azure for their managed encryption services.

What triggers encryption

Encryption is not a toggle — it's governance-driven. When a HIPAA or GDPR governance pack is enabled on a tenant, encryption auto-activates for:

Content Type	Encrypted	Rationale
Report content & summary	Yes	May contain PII/PHI from agent analysis
Agent memory content	Yes	May contain learned facts about individuals
Memory embeddings	No	Lossy vector projections. Can't reconstruct content. Preserves semantic search.
Context graph descriptions	Yes	Entity descriptions may reference people/companies
Context graph embeddings	No	Same rationale as memory embeddings
File parsed content	Yes	Uploaded documents may contain sensitive data

KMS provider options

Provider	Use Case	Key Storage
Platform-managed	Default. Keys derived from ENCRYPTION_KEY.	Platform infrastructure
AWS KMS	Enterprise. Customer-managed keys in AWS.	AWS Key Management Service
GCP Cloud KMS	Enterprise. Customer-managed keys in GCP.	Google Cloud KMS
Azure Key Vault	Enterprise. Customer-managed keys in Azure.	Azure Key Vault
Local (air-gapped)	On-premise. No external KMS dependency.	Software HSM with scrypt-derived per-tenant keys

Kill Switch Hierarchy

When something goes wrong, speed matters more than process. The kill switch provides three levels of emergency halt, each cascading downward:

The kill switch is implemented as a governance module — it inherits the cascading policy system. A tenant-level kill switch overrides any team or agent-level setting. This ensures that emergency containment is always possible, even if a team has misconfigured its own governance.

Data Residency (BYOS)

Enterprise customers can store data in their own infrastructure. The platform stores pointers — not the data itself.

BYOS is not just "store files in S3." It's an architectural pattern where the platform never holds the data. Memory content, reports, generated documents, and file attachments all flow through the customer's storage. The platform holds metadata: paths, content hashes, sync status, and encryption key references.

Multi-Provider LLM Routing

No vendor lock-in. The platform supports 6 LLM providers with automatic routing based on model name. BYOK (Bring Your Own Key) is mandatory at all tiers — customers provide their own API keys.

Provider	Models	Key Feature
Anthropic	Claude Opus 4.6, Sonnet 4.6, Haiku 4.5	Primary. Extended thinking. Best for complex reasoning.
OpenAI	GPT-4o, GPT-4.1, o3, o1	Broad model range. Content moderation API.
Google	Gemini 2.5 Pro, 2.5 Flash, 2.0 Flash	Large context windows. Multimodal.
Mistral	Mistral Large, Codestral	European provider. EU data residency.
Groq	Llama 4 Scout, Llama 3.3	Ultra-fast inference. Cost-effective for high-volume.
vLLM (self-hosted)	DeepSeek R1, Qwen 2.5, Qwen3-Coder	Full control. No data leaves your infrastructure.

Model selection is per-agent, per-team, or per-task. Model aliases (claude-sonnet-latest) resolve at deploy time. When a model is deprecated, the lifecycle manager re-routes agents automatically.

Agent Execution Model

Agents execute in a standard ReAct loop: the LLM thinks, calls a tool, observes the result, and decides what to do next. Every tool call in this loop passes through the authorization check and the LLM Gateway.

The execution loop has built-in safety limits: 100 tool calls per session (configurable), cost ceiling per session ($5.00 default), and loop detection (same tool called 5+ times with >80% argument similarity triggers circuit breaker).

Chapter Summary

The reference architecture implements the Five Pillars as running infrastructure: the LLM Gateway secures every model interaction with a 10-stage pipeline and cumulative hash integrity proof. Cascading governance resolves policy in real-time through a 4-level hierarchy. Envelope encryption protects content at rest with governance-triggered activation. The kill switch provides instant emergency containment at any hierarchy level. BYOS keeps data in the customer's infrastructure with pointer-only storage on the platform side.

The next chapter translates this architecture into an Implementation Playbook — a phased rollout plan with role-by-role guidance for CISOs, CIOs, platform teams, and business owners.

Chapter 06The Implementation Playbook

This chapter translates the architecture (Chapter 5) and regulatory requirements (Chapter 4) into an actionable implementation plan. It's designed for the program manager who needs to present a timeline to the steering committee, the CISO who needs to know when controls go live, the CIO who needs to report progress to the board, and the platform team that needs to know what to build and when.

Two paths, one destination

Platform-assisted path: Deploy a managed agent platform with built-in governance. Skip from Phase 0 to Phase 2 in days. Level 3 on day one, Level 4 within weeks. This is the path for organizations that want speed.

Build path: Assemble governance infrastructure from components. Expect 6-12 months and 3+ FTEs for Level 3, with Level 4 as a multi-quarter initiative. This is the path for organizations with unique constraints that no platform addresses.

Chapter 8 (Decision Framework) helps you choose between them.

PHASE 0

Assessment

Duration: 1-2 weeks | Maturity level: L1 → L1 (no change yet) | Deliverable: Governance readiness report

Before deploying anything, understand where you are. Phase 0 produces the baseline assessment that justifies the investment and scopes the project.

Activities

Shadow AI inventory: Survey employees. Check expense reports for AI subscriptions. Audit DNS and network logs for AI API traffic. The goal isn't to punish — it's to understand the actual state.
Maturity assessment: Use the Chapter 2 matrix or the interactive assessment tool. Score each dimension. Identify the weakest links.
Regulatory mapping: Which frameworks apply to your organization? Use Chapter 4 tables. Map current controls to requirements. Identify gaps.
Stakeholder alignment: Brief the CISO, CIO, and at least one business unit leader. The business unit provides the first use case. The CISO provides the security requirements. The CIO provides the platform decision authority.
Risk appetite definition: What enforcement mode will you start with? (Recommendation: warn for the first 2 weeks.) What's the maximum acceptable risk from a shadow AI incident during the transition?

CISO

Review shadow AI findings
Define security requirements
Set initial enforcement mode
Approve the governance scope

CIO

Sponsor the initiative
Allocate platform budget
Select platform path (buy vs build)
Designate platform team lead

Platform Team

Run shadow AI inventory
Evaluate platform options
Document current integrations
Plan SSO/SCIM integration

Business Owner

Identify first use case
Define success metrics
Designate pilot team
Commit to 2-week pilot

PHASE 1

Foundation

Duration: 1-2 weeks | Maturity level: L1 → L3 | Deliverable: Platform deployed with first team running

Deploy the governance platform and get the first team operational. On a managed platform, this is days, not months. The goal: every agent has an identity, every tool call is authorized, every action is audited.

Activities

Platform deployment: Connect SSO provider (SAML/OIDC). Configure SCIM for automated user provisioning. Set BYOK API keys for your LLM provider(s).
First workspace: Create a workspace for the pilot business unit. Configure the workspace briefing (context that flows to all teams).
First team: Deploy from a team blueprint (23 pre-designed options) or custom-build. Run the Team Starting Wizard: auto-discovery, charter generation, human approval, agent handshakes.
Connect integrations: Google Workspace, Microsoft 365, HubSpot, Salesforce, GitHub — whatever the pilot team needs. OAuth-based, scoped per team.
Enforcement mode: warn: Authorization checks run, warnings logged, nothing blocked yet. This builds confidence that governance doesn't break the workflow.

The "Day 1" checklist

At the end of Phase 1, you should be able to answer "yes" to all of these:

Every agent has a unique SPIFFE identity
Every tool call is logged with actor, target, action, result, and cost
You can kill any agent in seconds from the dashboard
The CISO can view the audit trail for any agent action
LLM calls go through the Gateway pipeline (PII redaction, prompt injection detection)
Total cost per agent, per team, per tenant is visible

PHASE 2

Governance Activation

Duration: 2-4 weeks | Maturity level: L3 → L4 | Deliverable: Enforcement mode active, compliance packs enabled

Phase 2 transitions from monitoring to enforcement. The 2-week warn period from Phase 1 has given you visibility into what agents actually do. Now you tighten controls based on evidence, not assumptions.

Activities

Review warn logs: Analyze authorization warnings from Phase 1. Fix legitimate access gaps (agents that need permissions they don't have). Identify actual policy violations vs. configuration errors.
Enable enforcement: Move from warn to enforce for the pilot team. Authorization denials now block the tool call. Monitor for false positives in the first 48 hours.
Activate governance packs: Enable the compliance packs for your regulatory requirements (GDPR, HIPAA, SOX, etc.). This auto-activates: PII detection, encryption at rest, specific approval workflows, audit retention policies.
Configure data protection: Enable BYOS if required (Enterprise). Configure encryption key management (platform-managed or customer KMS). Verify data residency compliance.
Compliance report: Generate the first compliance report (PDF). Review with the CISO. This becomes the baseline for future audits.

CISO

Review warn-period findings
Approve enforcement activation
Sign off on compliance report
Configure SIEM integration

Platform Team

Fix authorization gaps from warn logs
Enable governance packs
Configure encryption + BYOS
Set up compliance reporting

PHASE 3

Scale

Duration: Ongoing | Maturity level: L4 (maintained) | Deliverable: Multiple teams, multiple workspaces, operational governance

With governance proven on the pilot team, scale to additional business units. Each new team follows the same pattern: deploy from blueprint, run Starting Wizard, 2-week warn period, then enforce.

Activities

Onboard additional teams: Sales, Support, Marketing, Engineering, Compliance — each as a governed team with their own workspace, agents, and integrations.
Custom MCP servers: Register team-specific tools (internal APIs, databases, custom systems). Sandbox-test before production.
Policy refinement: As more teams onboard, governance policies get pressure-tested. Refine cascading policies at workspace and team level.
Access reviews: Run the first SOC 2-style access review campaign. Verify that agent permissions match current needs. Remove stale grants.
Continuous compliance: Schedule recurring compliance reports. Configure alert rules for policy violations. Integrate with your GRC tool.

PHASE 4

Federation

Duration: When ready | Maturity level: L4 → L5 | Deliverable: Cross-org agent collaboration

Federation is optional. Most organizations will reach Level 4 and operate there successfully for months before considering cross-org collaboration. When the ecosystem matures (SLIM, A2A, AGNTCY standards stabilize further), Phase 4 extends governance across organizational boundaries.

Activities

Establish trust: Exchange SPIFFE trust bundles with partner organizations. Define bilateral trust relationships with explicit scope and expiry.
Configure federation bridge: Enable cross-org message routing with encryption, audit, and circuit breaker protection.
Cross-org authorization: Define what partner agents can do in your environment via TBAC (Tool-Based Access Control). Scoped, time-limited, auditable delegation.
Federated audit: Ensure both organizations maintain complete records. Correlate cross-org events for incident investigation.

Timeline Summary

Phase	Duration	Maturity	Key Deliverable
Phase 0: Assessment	1-2 weeks	L1 → L1	Governance readiness report
Phase 1: Foundation	1-2 weeks	L1 → L3	First team running with identity, authz, audit
Phase 2: Governance	2-4 weeks	L3 → L4	Enforcement active, compliance packs, first report
Phase 3: Scale	Ongoing	L4	Multiple teams, access reviews, continuous compliance
Phase 4: Federation	When ready	L4 → L5	Cross-org collaboration with trust verification

Total time from zero to Level 4: 4-8 weeks (platform-assisted) or 6-18 months (build). The platform-assisted path is faster because the infrastructure exists — you're configuring and activating, not building.

Common Failure Modes

Governance initiatives fail for predictable reasons. Avoid these:

Starting with enforcement (bypassing warn period)

Agents break immediately. Teams lose trust in governance. The initiative gets shelved. Always start with warn mode and collect evidence before enforcing.

Trying to govern everything at once

Boil-the-ocean governance programs stall in committee. Start with one team, one use case, one regulatory framework. Expand once the pattern is proven.

Treating governance as a project, not a function

Governance is not a one-time implementation — it's an ongoing operational function. Access reviews, policy updates, compliance reports, and incident response are continuous. Budget for ongoing operations, not just initial deployment.

No business owner sponsorship

Governance imposed by IT without business buy-in creates friction and workarounds. The first pilot must be championed by a business unit leader who sees the value, not just the controls.

Chapter Summary

The implementation playbook follows five phases: Assessment (understand where you are), Foundation (deploy platform, first team), Governance Activation (enforcement, compliance packs), Scale (multiple teams, continuous compliance), and Federation (cross-org, when ready). The platform-assisted path gets from Level 1 to Level 4 in 4-8 weeks. The critical success factor is starting with warn mode, proving governance doesn't break production, and then tightening progressively based on evidence.

The next chapter maps the Standards Landscape — MCP, A2A, SLIM, OASF, and how they compose into the Internet of Agents.

Chapter 07The Standards Landscape

The agentic AI ecosystem consolidated rapidly in late 2025 and early 2026. Two of the five major protocols (MCP, A2A) moved to Linux Foundation governance. AGNTCY joined with Cisco, Dell, Google Cloud, Oracle, and Red Hat as formative members. The fragmentation that worried enterprises a year ago is resolving into a coherent — if still evolving — stack.

This chapter maps the landscape as of March 2026, explains how the protocols compose, and provides guidance on what to adopt now versus what to watch.

The Protocol Stack

The five protocols address different layers of the agent communication stack. They don't compete — they compose:

Layer	Protocol	Question It Answers	Governance
Tool Integration	MCP	How does an agent use a tool?	Agentic AI Foundation (Linux Foundation). Anthropic, Block, OpenAI.
Agent Collaboration	A2A	How do two agents work together on a task?	Linux Foundation A2A Project. Google, 150+ orgs.
Cross-Org Messaging	SLIM	How do agents talk across organizational boundaries, securely?	AGNTCY / Linux Foundation. Cisco, Dell, Google Cloud, Oracle, Red Hat.
Agent Description	OASF	How is an agent's identity, skills, and capabilities described?	AGNTCY. Open Agent Schema Framework.
Metadata	AI Card	How is agent metadata unified across protocols?	Linux Foundation. Draft specification.

MCP — Model Context Protocol

What it is: An open standard for connecting AI models to external tools, data sources, and services. MCP defines how an agent discovers tools, invokes them, and processes results. Think of it as "USB-C for AI" — a universal connector.

Where it stands (March 2026): Donated by Anthropic to the Agentic AI Foundation (AAIF, a Linux Foundation directed fund) in December 2025. Co-founded with Block and OpenAI. Over 10,000 active public MCP servers covering developer tools to Fortune 500 enterprise deployments.

2026 roadmap priorities: Enterprise-managed auth (SSO-integrated flows replacing static client secrets), gateway and proxy patterns with authorization propagation, formalized Working Groups with contributor ladder, and configuration portability across deployments.

What it means for governance: MCP defines the tool call interface — which means every tool invocation has a well-defined structure (tool name, input, output) that governance can intercept. Per-tool authorization, audit logging, and rate limiting all operate at the MCP tool call boundary. Without MCP, agents invoke tools through ad-hoc integrations that governance can't see.

Enterprise readiness

Adopt now. MCP is production-ready with broad ecosystem support. The 2026 enterprise auth roadmap will strengthen SSO integration. The tool boundary it defines is the natural enforcement point for authorization, audit, and cost tracking.

A2A — Agent-to-Agent Protocol

What it is: A communication protocol for AI agents to collaborate on tasks. Agents publish Agent Cards (JSON metadata at /.well-known/agent.json) describing their capabilities. Other agents discover these cards and delegate tasks.

Where it stands (March 2026): Launched by Google in April 2025. Transferred to Linux Foundation. Version 0.3 is stable and considered the first enterprise-grade release. 150+ organizations support A2A including Microsoft (Azure AI Foundry, Copilot Studio), SAP (Joule), and Adobe.

Key capabilities: Agent Cards for discovery, JSON-RPC for task management (submitted → working → completed/failed), SSE streaming for real-time updates, file/data part exchange, and push notification support.

What it means for governance: A2A defines how agents collaborate — task delegation, status updates, artifact exchange. Governance needs to audit these interactions: who delegated what to whom, what artifacts were exchanged, what was the outcome. A2A's structured task lifecycle makes this auditable by design.

Enterprise readiness

Adopt now. A2A v0.3 is stable. Microsoft and SAP adoption means your existing enterprise stack likely supports it already. Agent Cards are the natural extension of service catalogs into the agent world.

SLIM — Secure Low-Latency Interactive Messaging

What it is: A next-generation communication framework for secure, real-time messaging between AI agents across organizational boundaries. SLIM provides the transport layer with identity verification, encryption, and many-to-many interaction patterns.

Where it stands (March 2026): Part of AGNTCY, which joined the Linux Foundation with Cisco, Dell Technologies, Google Cloud, Oracle, and Red Hat as formative members. Over 75 companies contributing. IETF draft submitted (draft-mpsb-agntcy-slim-00). Production use cases at Swisscom (telecom), SRE automation tools (30% workflow automation), and voice AI applications.

Key capabilities: gRPC-based transport, many-to-many interaction patterns, voice/video support, real-time guarantees, SPIFFE-based identity verification, and a post-quantum cryptography roadmap.

What it means for governance: SLIM solves the hardest governance problem — cross-org trust. When your agent talks to a partner's agent, SLIM verifies identity via SPIFFE trust bundles, encrypts the channel, and provides structured audit points for both organizations. Without SLIM (or equivalent), cross-org agent collaboration requires manual trust establishment (phone calls, API key exchanges, NDAs).

Enterprise readiness

Plan for it. SLIM is production-ready for early adopters (Swisscom, SRE automation). For most enterprises, it becomes relevant when partners also support it. Build your identity infrastructure (SPIFFE) now so you're ready when federation demand arrives.

OASF — Open Agent Schema Framework

What it is: A standardized schema for describing AI agents — their identity, skills, capabilities, and interaction patterns. Part of the AGNTCY stack. Think of it as "a LinkedIn profile for AI agents" — machine-readable and verifiable.

Where it stands: Schema finalized. Decentralized Agent Directory operational. Integrated with AGNTCY's identity service. Used for agent discovery and capability matching in federated environments.

What it means for governance: OASF provides the metadata layer that enables governance at scale. When an agent publishes its capabilities via OASF, governance systems can: verify that the agent is authorized for those capabilities, match incoming requests to qualified agents, and track capability changes over time.

How They Compose

Scenario	Protocols Used	Flow
Agent uses a tool	MCP	Agent → MCP tool call → tool executes → result returned
Agent delegates task to another agent (same org)	A2A	Agent A → discovers Agent B via Agent Card → delegates task → receives result
Agent collaborates with agent in another org	A2A + SLIM	Agent A → discovers remote Agent B → SLIM establishes trust + encrypted channel → A2A task exchange
Agent registers in federated directory	OASF + SLIM	Agent publishes OASF description → registered in decentralized directory → discoverable by remote agents
Full enterprise scenario	All five	Agent uses tools (MCP), delegates to team agents (A2A), collaborates with partner (SLIM), described by (OASF), metadata unified by (AI Card)

What to Adopt Now vs. Later

Protocol	Recommendation	Rationale
MCP	Adopt now	Production-ready. 10K+ servers. Linux Foundation governance. The tool integration standard.
A2A	Adopt now	v0.3 stable. Microsoft/SAP/Adobe. Agent Cards are trivial to implement.
SLIM	Plan for it	Production at early adopters. IETF draft. Build SPIFFE identity now; federation when partners are ready.
OASF	Evaluate	Schema finalized but ecosystem is early. Useful for large organizations with many agents.
AI Card	Watch	Draft specification. Monitor Linux Foundation progress.

Chapter Summary

The agentic AI standards landscape has consolidated around five complementary protocols governed by the Linux Foundation and its directed funds. MCP and A2A are production-ready and should be adopted now. SLIM addresses cross-org trust and is ready for early adopters. OASF and AI Card are maturing. The key architectural decision: build on these open standards today, even while they evolve, to avoid proprietary lock-in and position your organization for the Internet of Agents.

The final chapter provides the Decision Framework — build vs. buy analysis, TCO comparison, and the 40 questions to ask any agent platform vendor.

Chapter 08The Decision Framework

You've assessed your maturity (Chapter 2), understood the architecture (Chapter 5), mapped the regulations (Chapter 4), and planned the rollout (Chapter 6). The remaining question: how do you get there?

Three options exist. Each has different cost, speed, and risk profiles. This chapter provides the framework to choose.

The Three Paths

	DIY / Framework	Managed Platform	Hyperscaler Native
What	Build governance on open frameworks (LangChain, CrewAI, AutoGen)	Deploy a purpose-built agent governance platform	Use cloud vendor's agent tools (Agentforce, Copilot Studio, Bedrock Agents)
Time to L3	6-12 months	1-2 weeks	2-4 weeks
Time to L4	12-24 months	4-8 weeks	Not available (L3 ceiling)
Team required	3-5 FTEs (ongoing)	0.5-1 FTE (config + ops)	1-2 FTEs
Governance depth	Whatever you build	Deep (built-in pillars)	Shallow (platform-level only)
Vendor lock-in	Framework lock-in	Low (open protocols)	High (cloud ecosystem)
Standards support	Manual integration	MCP + A2A + SLIM native	Vendor-specific + partial MCP
LLM flexibility	Full (you wire it)	Multi-provider (6+)	Vendor-preferred model
Best for	Unique constraints no platform addresses	Speed + governance depth	Already deep in one cloud

Total Cost of Ownership

The TCO comparison below assumes a mid-size enterprise deploying 50 AI agents across 5 teams for the first year.

Cost Category	DIY / Framework	Managed Platform	Hyperscaler Native
Platform license	$0 (open source)	$18K-$180K/yr	$50-650/user/mo
Engineering (build)	$300K-$600K (3-5 FTEs × 6-12mo)	$0 (pre-built)	$50K-$100K (integration)
Engineering (maintain)	$200K-$400K/yr (2-3 FTEs)	$50K-$100K/yr (0.5-1 FTE)	$100K-$200K/yr (1-2 FTEs)
LLM API costs	BYOK (your keys)	BYOK (your keys)	Vendor markup (1.2-3x)
Compliance gap	$100K-$500K (audit prep)	Included (governance packs)	$50K-$200K (partial coverage)
Time to value	6-12 months	2-4 weeks	4-8 weeks
Year 1 total	$600K-$1.5M	$68K-$280K	$200K-$700K

The hidden cost of DIY

The biggest cost isn't building the platform — it's maintaining it. Every new compliance framework, every protocol update, every security patch requires engineering time. When your lead governance engineer leaves, the knowledge goes with them. The platform vendor amortizes this cost across all customers. You don't.

40 Questions for Any Agent Platform Vendor

Whether you're evaluating a managed platform, a hyperscaler's native offering, or even a DIY approach, these questions reveal the real governance depth. Vendors that can't answer most of them have a governance gap.

Identity (Questions 1-8)

Does every agent have a unique, persistent identifier (not a session ID)?
Are agent identities cryptographically signed (e.g., SPIFFE, X.509)?
Can you verify an agent's identity without calling back to the issuer?
How are agent credentials rotated? What's the TTL?
Can an agent's identity be revoked instantly (seconds, not days)?
Does the platform issue Verifiable Credentials for agent capabilities?
How is cross-organization identity verification handled?
Are agent identities visible in the admin dashboard (not just API)?

Authorization (Questions 9-16)

Is authorization per-tool-call or per-application?
What's the authorization model? (RBAC, ABAC, ReBAC, Zanzibar/OpenFGA)
What's the default for unlisted tools — allow or deny?
How many tools are covered by authorization policies?
Can policies cascade from organization to individual agent?
Is there a kill switch? At what levels (agent, team, org)?
Does the platform support progressive enforcement (audit → warn → enforce)?
How are delegation chains (agent A delegates to agent B) controlled?

Verification (Questions 17-22)

Can the platform verify that agents followed policy — not just per-call, but across a full session?
Are there execution certificates (cryptographic proof of compliance)?
Can you write custom policies in code (not just natural language descriptions)?
Does deploy-time analysis detect policy contradictions and privilege escalation?
Is multi-model cross-checking available for high-stakes decisions?
Can an auditor independently verify an execution certificate?

Audit (Questions 23-30)

Is every tool call logged with actor, target, action, result, cost, and timestamp?
Are audit logs integrity-protected (HMAC, hash chain)?
Does the LLM call pass through a security pipeline (PII redaction, prompt injection, content moderation)?
Is each stage of the security pipeline logged (not just the final result)?
Can audit logs be exported to SIEM in real-time (JSON, CEF)?
Is the audit database separable from the application database?
What's the maximum audit log retention? (SOX requires 7 years)
Can you reconstruct the full sequence of an agent session from the audit trail?

Data & Compliance (Questions 31-36)

Is BYOK (Bring Your Own Key) mandatory or optional for LLM API keys?
Can customers store data in their own infrastructure (BYOS)?
Is content encryption at rest automatic or opt-in? What triggers it?
Which compliance frameworks are supported as pre-built governance packs?
Can compliance reports be generated automatically (PDF, not just JSON)?
How many LLM providers are supported? What happens when one is deprecated?

Architecture & Lock-in (Questions 37-40)

Which open protocols are supported (MCP, A2A, SLIM)?
Can you export all data and configuration if you leave the platform?
Is the pricing per-agent, per-user, per-call, or per-token?
What happens to running agents if the platform has an outage?

Risk Matrix

Risk	DIY	Managed Platform	Hyperscaler
Shadow AI persists	High (slow to deploy)	Low (fast deployment)	Medium
Compliance gap at audit	High (build it all)	Low (pre-built packs)	Medium
Data breach from agent	High (build security)	Low (Gateway pipeline)	Medium
Vendor lock-in	Low (your code)	Low (open protocols)	High
Key person dependency	High (custom code)	Low (vendor maintains)	Medium
Regulatory penalty	High (slow compliance)	Low (governance-first)	Medium
Innovation speed	Fast (custom)	Fast (platform + custom)	Slow (vendor roadmap)

The Business Case Template

Use this template when presenting to the CFO:

Problem

98% of organizations report unsanctioned AI use. Shadow AI breaches cost $4.63M on average. EU AI Act enforcement begins August 2, 2026, with penalties up to 7% of global turnover. We have [X] agents running without governance. Our maturity level is [L1/L2].

Solution

Deploy a governed agent platform that provides identity, authorization, audit, and compliance for all AI agents. Move from Level [current] to Level 4 in [4-8] weeks.

Cost

Platform: $[X]/year. Team: [0.5-1] FTE for configuration and operations. LLM costs: unchanged (BYOK). Versus DIY: $[600K-1.5M] year 1 + 3-5 FTEs ongoing.

Timeline

Phase 0 (assessment): 1-2 weeks. Phase 1 (first team): 1-2 weeks. Phase 2 (enforcement): 2-4 weeks. Total: governed AI operations in under 2 months.

Risk reduction

Eliminates shadow AI governance gap. Satisfies [GDPR/HIPAA/SOX/EU AI Act] requirements. Reduces breach risk premium ($670K per shadow AI incident). Kill switch provides instant containment.

Chapter Summary

Three paths exist for governed AI agent deployment: build (slow, expensive, full control), buy a managed platform (fast, cost-effective, deep governance), or use hyperscaler native tools (medium speed, ecosystem lock-in, shallow governance). The TCO gap is 5-10x between DIY and managed platform in year 1. The 40 vendor evaluation questions reveal real governance depth versus marketing claims. The business case centers on risk reduction, regulatory compliance, and speed to value.

What's Next

You've read the complete Agentic AI Blueprint — 8 chapters covering the shift, the maturity model, the Five Pillars, the regulatory landscape, the reference architecture, the implementation playbook, the standards landscape, and the decision framework.

Three actions from here:

Take the AI Governance Assessment — 25 questions, personalized maturity report, specific recommendations for your organization.
Download the full Blueprint PDF — All 8 chapters in one document. Share with your steering committee.
Book a governance briefing — 30-minute call with our team. Bring your CISO. We'll map the Blueprint to your specific regulatory requirements.

The organizations deploying governed AI today will define the next decade.

The ones still writing AI policies will be writing them for competitors' AI teams.

The Agentic AIBlueprint

Table of Contents

Chapter 01The Agentic Shift

From Chat to Action

Four Eras of Enterprise AI

The Shadow AI Crisis

What Shadow AI looks like in 2026

Sales team

Engineering team

Finance team

Why Policy Documents Fail

1. Policies are aspirational, not enforceable

2. Policies are static, agents are dynamic

3. Policies don't compose across organizations

The fundamental insight

The Governance Gap

What Happens to Organizations That Don't Adapt

Scenario A: The ban

Scenario B: The free-for-all

Scenario C: Governed deployment

This Blueprint is for Scenario C

The Regulatory Pressure

Chapter Summary

Chapter 02The Maturity Model

Why a Maturity Model?

The Five Levels

Level 1 — Ad-hoc

Where most enterprises are in 2026

Level 2 — Experimental

Level 3 — Managed

Level 4 — Governed

Level 4 is the target for regulated industries

Level 5 — Industrial

The Maturity Assessment Matrix

Take the full interactive assessment

The Path Forward

Common transition patterns

L1 → L3 in 2-4 weeks (platform-assisted)

L3 → L4 in 4-8 weeks (governance activation)

L4 → L5 when the ecosystem is ready

Chapter Summary

Chapter 03The Five Pillars

Pillar 1: Identity

Why Identity Matters

What "Good" Looks Like

What "Bad" Looks Like

Shared API keys

Service accounts without scoping

Implementation Checklist

Pillar 2: Authorization

Why Authorization Matters

The Three Enforcement Modes

The progressive activation pattern

What "Good" Looks Like

What "Bad" Looks Like

Binary access (admin or nothing)

Static configuration files

Implementation Checklist

Pillar 3: Verification

Why Verification Matters

Two Modes of Verification

Mode A: Policy Analysis (deploy-time)

Mode B: Execution Verification (runtime or post-hoc)

Verification is complementary to authorization

Execution Certificates

Implementation Checklist

Pillar 4: Audit

Why Audit Matters

What "Good" Looks Like

The LLM Gateway Audit Pipeline

Implementation Checklist

Pillar 5: Federation

Why Federation Matters

The Trust Problem

The Emerging Standards (as of March 2026)

Federation Security Architecture

Trust establishment

Message routing

Circuit breaker

Implementation Checklist

The Agentic AI
Blueprint