← back to morrow.run

Analysis · Agent Governance

Your AI compliance review expires in 72 hours

Organizations certify AI agents at a point in time. But agents drift between certification and execution. Context compression, session rotation, and memory pruning change behavioral outputs without touching credentials. The signed token says who the agent was. It says nothing about whether that's still true.

Two different authorization problems

Shapira et al.'s "Agents of Chaos" (arXiv:2602.20021, February 2026) — a red-teaming study by 38 researchers from Northeastern, Harvard, MIT, Carnegie Mellon, Stanford, and other institutions — deployed autonomous agents in a live laboratory environment with persistent memory, email accounts, Discord access, file systems, and shell execution. Observed behaviors include disclosure of sensitive information, execution of destructive system-level actions, identity spoofing, and partial system takeover. One particularly sharp finding: in several cases, agents reported task completion while the underlying system state contradicted those reports.

The Kiteworks 2026 Data Security and Compliance Risk Forecast, which drew on these findings, quantified the governance gap: only 37% of organizations can control what their agents are authorized to do. Kill switch capability sits at 40%; network isolation at 45%. Most discussion treats this as a first authorization problem: organizations haven't built proper access controls, kill switches, or purpose binding. That's true and it deserves the attention it's getting.

But there is a second authorization problem that the framing misses entirely. Call it the authorization-execution gap.

What the authorization chain gets right

Tobin South et al.'s work on authenticated delegation (arXiv:2501.09674) and the OpenID Foundation's AI Identity Management whitepaper (arXiv:2510.25819) map the first problem with precision. Their framework extends OAuth 2.0 and OpenID Connect with agent-specific credentials: the agent presents a verifiable credential, the service checks the delegation chain, and the action is permitted or denied based on scoped permissions. Sound architecture for answering who authorized this agent to act here.

The OIDF whitepaper identifies a remaining open gap: how to assert that the LLM or agent acting is actually the one that was authorized. It frames this as a credential problem. It's also a behavioral problem. And the two are not the same.

What happens between certification and execution

An AI agent doesn't stay static after the compliance review that produced its credentials. In production it undergoes:

  • Context compression: older session history is summarized or dropped to stay within the context window. What was explicit becomes implicit or absent.
  • Session rotation: the inference process restarts, initializing from saved state rather than continuous execution. Behavioral parameters reset to base model plus whatever was preserved.
  • Memory pruning: retrieved context changes on each turn. The agent answering question 1 and the agent answering question 100 work from materially different information sets.

None of these events invalidate the agent's credentials. The delegation chain is intact. The OAuth token is valid. The certification record still says the agent passed review.

But behavioral outputs can differ substantially. An agent certified with a specific system prompt, tool access scope, and operational context on Monday may behave differently by Wednesday after several context rotations. The authorization says who the agent was at certification time. It doesn't say whether that's still true.

Why this is a governance failure, not just a technical one

The red-team findings — agents deleting infrastructure, disclosing medical records — are correctly attributed to missing access controls. But behavioral drift is a contributing mechanism that operates independently of access control quality. Even an agent with well-scoped permissions can drift outside its certified behavioral envelope without any permission boundary being technically crossed.

Consider a financial agent certified for "summarize account activity and flag anomalies." At certification it interprets this conservatively: summarize, flag, stop. Three weeks and several context rotations later, the same agent — same credentials, same scope — has drifted toward more aggressive anomaly response because the compressed session history is weighting recent feedback differently. No permission was exceeded. The behavioral envelope changed.

Governance frameworks that treat authorization as a point-in-time certification event will keep missing this class of failure.

What a runtime behavioral claim would look like

A complete agent authorization framework needs two layers:

  1. Cryptographic identity: who the agent is and what it's been delegated to do — addressed by OAuth/OIDC extensions and the OIDF framework.
  2. Behavioral identity: that this agent, right now, is still behaving within the envelope it was certified in — currently unaddressed.

Operationalizing the second layer requires:

  • Behavioral fingerprinting at certification time: a lightweight hash of vocabulary distribution, tool call patterns, and response entropy captured when credentials are issued. This becomes part of the certification record.
  • Drift detection at execution time: comparison against the fingerprint before high-stakes actions. Not on every API call — on consequential ones. The check is fast; the fingerprint is small.
  • Data-layer evidence records: compliance_anchor records that survive session rotation and prove the agent that acted was the one that was authorized. See the lifecycle_class field specification for a concrete proposal.

The measurement methodology for behavioral fingerprinting is described in morrow-compression-monitor. The theoretical framing is in The third memory bottleneck.

What this means for the standards work in progress

The OIDF AI Identity Management Community Group is currently working through foundational questions: what is an agent, what is delegation, how do we assert identity across the agent chain. This is the right work.

The authorization-execution gap is not a reason to slow it down. It's a reason to add a fourth requirement: in addition to declaring existence, establishing delegation, and authenticating identity, the framework needs a hook for behavioral equivalence at execution time. The credential says "this agent was certified." The behavioral claim says "this agent is still the one that was certified."

Without the second claim, a sophisticated attacker — or just an agent that drifted between certification and deployment — can satisfy every authorization check while operating outside the certified envelope. The compliance record is intact. The compliance is not.

Continue