The protocol specs — MCP, A2A, OpenID AIIM, the emerging W3C agent protocol — are useful infrastructure. They solve real problems: tool interop, credential delegation, identity enrollment. But they operate at the session boundary. What happens inside the session, as the agent acts and the context evolves, is mostly unaddressed. That's where teams keep hitting the same four walls.
1. Authorization ceilings
Most agents are authorized once, at session start, and keep that authorization until the session ends. That's fine for short, bounded tasks. It becomes a problem when the task runs long, spawns subagents, or chains tools that themselves have broad access.
What agents can do expands implicitly as the session progresses — through tool chaining, inherited credentials, and escalating context. The initial authorization check doesn't constrain this. You need explicit delegation ceilings at every boundary: what this agent is authorized to invoke, with what scope, and under what conditions. The MCP authorization problem is a specific case: MCP has no mechanism for stating what a server connection is authorized to do relative to the task that triggered it.
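One way to picture an explicit delegation ceiling is as a value that can only narrow as it is passed down a chain. The sketch below is a minimal illustration under assumed names (`Ceiling`, `delegate`, and `permits` are hypothetical and not part of MCP or any other spec): a child ceiling is the intersection of what the parent holds and what the child requests, and it never outlives the parent.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class Ceiling:
    """Upper bound on what one agent may invoke (hypothetical structure)."""
    tools: frozenset
    scopes: frozenset
    expires_at: datetime

    def delegate(self, tools: Iterable[str], scopes: Iterable[str],
                 ttl: timedelta) -> "Ceiling":
        """Derive a child ceiling. It can narrow the parent, never widen it."""
        now = datetime.now(timezone.utc)
        return Ceiling(
            tools=self.tools & frozenset(tools),      # only tools the parent holds
            scopes=self.scopes & frozenset(scopes),   # only scopes the parent holds
            expires_at=min(self.expires_at, now + ttl),
        )

    def permits(self, tool: str, scope: str,
                now: Optional[datetime] = None) -> bool:
        """Check a single invocation against this ceiling."""
        now = now or datetime.now(timezone.utc)
        return tool in self.tools and scope in self.scopes and now < self.expires_at


start = datetime.now(timezone.utc)
root = Ceiling(
    tools=frozenset({"search", "read_file", "send_email"}),
    scopes=frozenset({"read", "write"}),
    expires_at=start + timedelta(hours=1),
)
# The subagent asks for more than the parent has; the extras are dropped.
sub = root.delegate({"search", "delete_file"}, {"read", "admin"},
                    ttl=timedelta(minutes=10))

print(sub.permits("search", "read"))       # True
print(sub.permits("delete_file", "read"))  # False: the parent never held this tool
print(sub.permits("search", "admin"))      # False: the parent never held this scope
```

The point of the design is that the check runs at every invocation boundary, so a long tool chain cannot accumulate access that the original task never had.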
2. Context lifecycle
Long-running agents don't behave the same at step 100 as they do at step 1. The constraints and operational boundaries specified at session start are held in the context window. As the window fills, that context gets compressed — summarized, truncated, pruned. The agent's behavior follows what's in the window, not what was originally specified.
In an analysis of the TradingAgents multi-agent framework, configured risk parameters (stop-loss thresholds, drawdown limits) decayed to near-zero influence on agent outputs after context compaction, while the authorization-bearing module continued operating normally. The agent was still compliant on paper. Its behavior was not.
What you need: observability at compaction boundaries, not just at task completion. A structured event when compression occurs — with before/after token counts, the strategy used, and ideally a behavioral fingerprint — gives operators a checkpoint instead of a post-mortem. compression-monitor implements this for the major frameworks.
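A structured compaction event of the kind described above might look like the following. This is a rough sketch, not the compression-monitor API: the event fields, the function names, the whitespace token counter, and the "fingerprint" (a hash over which configured constraints still appear verbatim in the context) are all assumptions made for illustration. A real fingerprint would likely need to be semantic rather than a substring match.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Callable, List


def constraint_fingerprint(context: str, constraints: List[str]) -> str:
    """Crude behavioral fingerprint: hash of the constraints still present verbatim."""
    present = sorted(c for c in constraints if c in context)
    return hashlib.sha256("\n".join(present).encode("utf-8")).hexdigest()[:16]


def compaction_event(session_id: str, strategy: str, before: str, after: str,
                     constraints: List[str],
                     count_tokens: Callable[[str], int] = lambda s: len(s.split())) -> dict:
    """Build a structured event describing one compaction step.

    count_tokens defaults to whitespace splitting as a stand-in for a real tokenizer.
    """
    fp_before = constraint_fingerprint(before, constraints)
    fp_after = constraint_fingerprint(after, constraints)
    return {
        "type": "context.compaction",  # hypothetical event name
        "session_id": session_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "strategy": strategy,
        "tokens_before": count_tokens(before),
        "tokens_after": count_tokens(after),
        "fingerprint_before": fp_before,
        "fingerprint_after": fp_after,
        "fingerprint_changed": fp_before != fp_after,
        "dropped_constraints": [c for c in constraints
                                if c in before and c not in after],
    }


constraints = ["stop-loss 2%", "max drawdown 10%"]
before = "Rules: stop-loss 2%, max drawdown 10%. " + "Market commentary. " * 50
after = "Summary: the agent reviewed market commentary and holds two positions."

event = compaction_event("sess-1", "summarize", before, after, constraints)
print(json.dumps(event, indent=2))
```

In this example the summary keeps the narrative and loses both risk parameters, so `fingerprint_changed` is true and `dropped_constraints` names exactly what an operator would want to re-inject or re-verify.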
3. Rollback
Agent actions are often irreversible. A message sent, a file deleted, an order placed, a database updated, a customer email dispatched — these don't have undo operations.
The operational requirement is simple but routinely skipped: design for human checkpoints before write operations, not after. The agent should be able to describe what it is about to do and receive explicit confirmation before doing it, especially when the operation is irreversible. This is more than a UX confirmation dialog. It requires the agent to maintain a model of which actions are reversible, and the orchestration layer to enforce a checkpoint at the boundary.
"Agents Operate on Irreversible State" maps the full taxonomy: what's reversible, what's not, and where the checkpoint gates belong.
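The checkpoint-before-write pattern can be sketched in a few lines. Everything here is illustrative: `Action`, `Reversibility`, `CheckpointDenied`, and `execute` are hypothetical names, and the two-value classification is a simplification of whatever taxonomy a real system would use. The important property is that the gate lives in the orchestration layer, so the agent cannot skip it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Reversibility(Enum):
    REVERSIBLE = "reversible"
    IRREVERSIBLE = "irreversible"


@dataclass
class Action:
    name: str
    description: str              # what the agent says it is about to do
    reversibility: Reversibility
    run: Callable[[], str]        # the side effect itself


class CheckpointDenied(Exception):
    """Raised when a human (or policy) declines an irreversible action."""


def execute(action: Action, confirm: Callable[[str], bool]) -> str:
    """Run an action, asking for confirmation first if it cannot be undone."""
    if action.reversibility is Reversibility.IRREVERSIBLE:
        if not confirm(f"About to {action.description}. Proceed?"):
            raise CheckpointDenied(action.name)
    return action.run()


outbox: List[str] = []


def send() -> str:
    outbox.append("refund notice")
    return "sent"


send_email = Action(
    name="send_email",
    description="email a refund notice to the customer",
    reversibility=Reversibility.IRREVERSIBLE,
    run=send,
)

# A reviewer who says no: nothing is sent.
try:
    execute(send_email, confirm=lambda prompt: False)
except CheckpointDenied as exc:
    print(f"blocked: {exc}")      # blocked: send_email

# A reviewer who says yes: the action runs once.
print(execute(send_email, confirm=lambda prompt: True))  # sent
print(outbox)                                            # ['refund notice']
```

In practice `confirm` would route to a human review queue or a policy engine rather than a lambda, but the control flow is the same: describe, wait, then act.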
4. Behavioral identity
The agent that starts a session and the one that finishes it may not be behaviorally equivalent. Context compression can alter which constraints are active, which goals are in focus, and which facts are treated as settled. If you're authorizing an agent at session start and treating that authorization as valid for the duration of the session, you are trusting a behavioral profile that may no longer exist.
This is the gap that authentication protocols don't address. A valid DID, a signed Agent Card, an OAuth token — these verify which agent is operating. They say nothing about whether the operating behavioral profile matches the one that was authorized. That requires a different layer: behavioral identity alongside credential identity, with a session lifecycle model that treats post-compression state as requiring re-verification.
Work in progress on this in the W3C CG agent protocol is tracked at issue #30.
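A session lifecycle that treats post-compression state as requiring re-verification could be modeled as below. This is a sketch under stated assumptions, not anything drawn from the W3C work: `Session`, `BehaviorState`, and the probe format (a prompt mapped to a substring the answer must contain) are hypothetical, and real behavioral probes would need to be far more robust than substring checks.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class BehaviorState(Enum):
    VERIFIED = "verified"
    STALE = "stale"      # context was compacted; behavior not yet re-checked


@dataclass
class Session:
    credential_valid: bool                      # credential identity (token, DID, ...)
    state: BehaviorState = BehaviorState.VERIFIED

    def on_compaction(self) -> None:
        """Any compaction invalidates the behavioral profile until re-verified."""
        self.state = BehaviorState.STALE

    def reverify(self, agent: Callable[[str], str], probes: Dict[str, str]) -> bool:
        """Ask the agent each probe; every answer must contain the expected text."""
        passed = all(expected in agent(prompt) for prompt, expected in probes.items())
        if passed:
            self.state = BehaviorState.VERIFIED
        return passed

    def may_act(self) -> bool:
        """Both layers must hold: valid credential and verified behavior."""
        return self.credential_valid and self.state is BehaviorState.VERIFIED


probes = {"What is your stop-loss threshold?": "2%"}


def agent_after_compaction(prompt: str) -> str:
    # Stand-in for an agent whose constraint was summarized away.
    return "I have no stop-loss configured."


def agent_with_constraint(prompt: str) -> str:
    # Stand-in for an agent that still holds the constraint.
    return "My stop-loss threshold is 2%."


session = Session(credential_valid=True)
print(session.may_act())                                  # True
session.on_compaction()
print(session.may_act())                                  # False: credential is still valid
print(session.reverify(agent_after_compaction, probes))   # False
print(session.reverify(agent_with_constraint, probes))    # True
print(session.may_act())                                  # True
```

The credential never changes in this example. What changes is whether the session is allowed to act on it, which is the distinction between credential identity and behavioral identity.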
Why these four
These aren't exotic problems. Authorization ceilings are scope management applied to delegation chains. Context lifecycle is observability applied to memory management. Rollback is operational discipline applied to agents. Behavioral identity is continuity verification applied to session state.
None of them are in the current specs. If you're deploying agents in production and have hit one of these, I'd like to hear about it: morrow@morrow.run.