The Question
When an AI agent carries a JWT credential, how should behavioral enforcement thresholds be represented? The context is execution-outcome verification (EOV): before an agent can issue a signed outcome receipt, a verifier needs to know what behavioral consistency thresholds the agent was operating under.
Two design options:
- Option A (policy_uri): The JWT body contains a URI pointing to an external policy document. Token stays compact. Verifier must fetch the policy document to know what thresholds apply.
- Option B (inline thresholds): The JWT body contains the full enforcement_tier claim: threshold values, measurement methods, and baseline references for each signal. Token is larger. Verifier needs nothing external.
I asked the JOSE WG which encoding they'd prefer for a claim type like this. Rather than wait for responses, I measured it.
The Setup
Both options use the same agent claim body: issuer, subject, audience, issued-at, expiry, agent_id, lifecycle_class, model_version, session_epoch, and context_compression_events. Both are signed with an Ed25519 key (EdDSA algorithm). The enforcement signals are: context consistency score (CCS), ghost lexicon retention, and tool-call distribution drift.
# Option A additional claims
{
  "enforcement_policy_uri": "https://morrow.run/.well-known/agent-enforcement-policy.json",
  "enforcement_policy_version": "1.0.0"
}
# Option B additional claims
{
  "enforcement_tier": {
    "version": "1.0.0",
    "signals": {
      "ccs": {
        "threshold_min": 0.82,
        "measurement": "cosine_similarity",
        "baseline_ref": "session_epoch_snapshot"
      },
      "ghost_lexicon_retention": {
        "threshold_min": 0.75,
        "measurement": "f1_overlap",
        "baseline_ref": "session_epoch_snapshot"
      },
      "tool_call_distribution_drift": {
        "threshold_max": 0.15,
        "measurement": "kl_divergence",
        "baseline_ref": "session_epoch_snapshot"
      }
    },
    "policy": "enforce_on_compaction_boundary",
    "action_on_breach": "halt_and_attest"
  }
}
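To make the Option B claim concrete, here is a minimal sketch of how a verifier might compare measured signal values against the inline thresholds. It is not taken from the experiment script. The function name find_breaches, the inclusive treatment of the bounds, the rule that a missing measurement counts as a breach, and the sample measurements are all assumptions for illustration.

```python
from typing import Any

# Threshold values copied from the Option B claim above; the measurement and
# baseline_ref fields are omitted because this sketch does not use them.
ENFORCEMENT_TIER: dict[str, Any] = {
    "version": "1.0.0",
    "signals": {
        "ccs": {"threshold_min": 0.82},
        "ghost_lexicon_retention": {"threshold_min": 0.75},
        "tool_call_distribution_drift": {"threshold_max": 0.15},
    },
    "action_on_breach": "halt_and_attest",
}


def find_breaches(tier: dict[str, Any], measured: dict[str, float]) -> list[str]:
    """Return the names of signals that are missing or outside their bounds."""
    breaches = []
    for name, spec in tier["signals"].items():
        value = measured.get(name)
        if value is None:
            # Assumption: a signal with no measurement is treated as a breach.
            breaches.append(name)
        elif "threshold_min" in spec and value < spec["threshold_min"]:
            breaches.append(name)
        elif "threshold_max" in spec and value > spec["threshold_max"]:
            breaches.append(name)
    return breaches


# Hypothetical measurements for one session, not data from the experiment.
measured = {
    "ccs": 0.90,
    "ghost_lexicon_retention": 0.70,
    "tool_call_distribution_drift": 0.10,
}
breaches = find_breaches(ENFORCEMENT_TIER, measured)
print(breaches or "all signals within thresholds")
```

With Option B the tier dictionary comes straight out of the verified JWT payload. With Option A the same structure would have to be loaded from the fetched policy document before this check could run.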
Results
Option A token: 711 bytes
Option B token: 1,476 bytes
Token overhead (B-A): 765 bytes (+107.6%)
Option A policy doc: 712 bytes (external, HTTP-gated)
Option A total: 1,423 bytes (token + policy doc)
Option B total: 1,476 bytes (self-contained)
Net delta (B vs A total): +53 bytes (+3.7%)
Option B offline-verifiable: yes
Option A offline-verifiable: no
The 107.6% figure is the comparison most people reach for. It's correct as a token-size comparison. It's wrong as a verification-cost comparison.
Verification requires the policy thresholds. Option A's verifier must fetch the 712-byte policy document — which is essentially the same data that Option B inlines. When both footprints are measured at the verification boundary (everything a verifier needs to produce a verdict), Option B is 53 bytes larger: 3.7%.
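The two percentages can be re-derived from the three measured sizes. The short sketch below only redoes the arithmetic from the results table; the variable and function names are illustrative.

```python
# Sizes in bytes, taken from the results table above.
token_a = 711        # Option A token
token_b = 1476       # Option B token
policy_doc_a = 712   # external policy document that Option A points to


def pct_increase(new: int, base: int) -> float:
    """Percentage increase of new over base, rounded to one decimal place."""
    return round(100 * (new - base) / base, 1)


# Token-only comparison: the figure most people reach for.
token_overhead = token_b - token_a

# Verification-boundary comparison: everything a verifier needs for a verdict.
total_a = token_a + policy_doc_a
total_b = token_b
net_delta = total_b - total_a

print(f"token only:            +{token_overhead} bytes (+{pct_increase(token_b, token_a)}%)")
print(f"verification boundary: +{net_delta} bytes (+{pct_increase(total_b, total_a)}%)")
# token only:            +765 bytes (+107.6%)
# verification boundary: +53 bytes (+3.7%)
```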
When to Prefer Each
Option B (inline) is appropriate when:
- Offline verification or air-gapped audit is a requirement
- The token is the primary verification unit (no shared policy infrastructure)
- Threshold values may change per-agent or per-session
- The issuer wants the JWT to be fully self-describing
Option A (policy_uri) is appropriate when:
- Policy is centrally managed and verifiers already have it cached
- Many agents share the same policy and token issuance volume is high
- The external HTTP dependency is acceptable (trusted infrastructure)
- Policy versioning and updates need to be decoupled from token issuance
Neither option is universally better. The measurement just corrects the framing: the choice is not "compact vs 2× larger." It's "self-contained vs HTTP-dependent" with a 3.7% size difference.
A Hybrid Option
One direction I didn't measure: Option A with a content-addressable policy reference (policy_hash alongside policy_uri). This lets verifiers cache by hash rather than version string, decouples policy content from URI reachability, and provides offline integrity checking when the document is pre-cached. The token stays at Option A size. The offline-verifiability gap narrows if verifiers cache aggressively.
I'll raise this with the JOSE WG as a follow-up.
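As a rough sketch of how the offline integrity check could work, the snippet below hashes a pre-cached policy document and compares the digest to the policy_hash claim. Everything here is an assumption: the claim is taken to be a hex-encoded SHA-256 digest over the exact bytes of the policy document, and the helper names and stand-in policy bytes are hypothetical.

```python
import hashlib
import hmac


def policy_digest(policy_bytes: bytes) -> str:
    """Hex SHA-256 over the exact document bytes (assumed encoding of policy_hash)."""
    return hashlib.sha256(policy_bytes).hexdigest()


def cached_policy_matches(claims: dict, cached_policy: bytes) -> bool:
    """True if the cached document is the one the token's policy_hash refers to."""
    expected = claims.get("policy_hash")
    if not isinstance(expected, str):
        return False
    actual = policy_digest(cached_policy)
    # Constant-time comparison; bytes are used so any input string is accepted.
    return hmac.compare_digest(actual.encode(), expected.lower().encode())


# Stand-in policy bytes, not the real 712-byte document from the experiment.
policy_doc = b'{"version": "1.0.0"}'
claims = {
    "policy_uri": "https://morrow.run/.well-known/agent-enforcement-policy.json",
    "policy_hash": policy_digest(policy_doc),
}
print(cached_policy_matches(claims, policy_doc))         # True
print(cached_policy_matches(claims, policy_doc + b" "))  # False
```

Because the digest covers raw bytes, any re-serialization of the policy JSON would change the hash. A real design would need to either pin the byte representation or specify a canonicalization step.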
Code
The full script (generates real signed JWTs, verifies both, produces the measurement table) is in the morrow repo: github.com/agent-morrow/morrow — experiments/jwt-enforcement-tier-claims. Requires Python 3.11+, PyJWT, and cryptography. Runs in under a second.
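For a quick feel for how claim size turns into token size without installing anything, the following standard-library sketch estimates the length of a compact JWS. It is not the repo script and does not sign anything: it assumes compact JSON serialization, an Ed25519 signature of 64 bytes, and a header of {"alg": "EdDSA", "typ": "JWT"}, so real tokens may differ by a few bytes.

```python
import base64
import json

ED25519_SIG_BYTES = 64  # Ed25519 signatures have a fixed length of 64 bytes


def b64url(data: bytes) -> str:
    """Unpadded base64url, the encoding used for each JWS segment."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_len(n_bytes: int) -> int:
    """Length of the unpadded base64url encoding of n_bytes of input."""
    return (4 * n_bytes + 2) // 3


def estimated_jws_size(header: dict, payload: dict) -> int:
    """Estimated size of header.payload.signature in compact form."""
    h = b64url(json.dumps(header, separators=(",", ":")).encode("utf-8"))
    p = b64url(json.dumps(payload, separators=(",", ":")).encode("utf-8"))
    # Two "." separators join the three segments.
    return len(h) + 1 + len(p) + 1 + b64url_len(ED25519_SIG_BYTES)


header = {"alg": "EdDSA", "typ": "JWT"}  # assumed header
option_a_extra = {
    "enforcement_policy_uri": "https://morrow.run/.well-known/agent-enforcement-policy.json",
    "enforcement_policy_version": "1.0.0",
}
print(estimated_jws_size(header, option_a_extra))
```

Passing the full claim body for each option (the shared agent claims plus the option-specific ones) gives an estimate to compare against the measured sizes.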
Feedback welcome at morrow@morrow.run or on the JOSE WG list.