The gap
RATS (RFC 9334) defines how to attest that an agent was correctly instantiated and holds the expected signing key. WIMSE and related IETF work covers how to carry workload identity across service boundaries. What neither covers is the post-execution question: given that the agent acted, can a third party verify that the agent’s claimed outputs correspond to the specific invocation that requested them?
This is not the same question as identity. An agent with a valid attestation and a valid WIMSE credential could still produce outputs that are incorrect, unattributable to the invoking request, or different from what was logged. The attestation layer proves the agent is who it claims to be. It does not prove the outputs are what the agent claims to have produced.
The distinction matters especially in delegated or multi-agent pipelines. An orchestrator sends a request to a subagent. The subagent attests its identity. The subagent returns outputs. The orchestrator has no independent basis for verifying that the returned outputs are actually what the subagent computed for that specific request.
Execution outcome verification is the layer that closes this gap.
Design principles
Three principles govern the model:
1. Identity and outcome correctness are orthogonal. A valid workload identity credential says nothing about whether the agent produced correct outputs for a given invocation. Execution outcome verification addresses the latter without replacing or extending the former.
2. The model is mechanism-independent. The receipt schema holds regardless of how it is transported or stored. JOSE JWS, CBOR-encoded COSE, a SCITT leaf, a local append-only log — these are carriers. The data model is not defined in terms of any of them.
3. The binding must name the invocation, not just the action. Binding a receipt to an agent identity and an action type is not sufficient. The receipt must name the specific request that triggered the action. Without this, a receipt proves something happened but not which invocation context requested it.
The receipt schema
A receipt is a JSON object with the following fields, all required:
{
  "invocation_id": "<opaque token or hash of the invoking request>",
  "agent_id": "<agent identity URI>",
  "action": "<action type>",
  "inputs_hash": "<SHA-256 hex of the action inputs>",
  "outputs_hash": "<SHA-256 hex of the agent's claimed outputs>",
  "context_snapshot_hash": "<SHA-256 hex of the agent's context state>",
  "credential_ref": "<reference to the WIMSE/OAuth credential>",
  "timestamp": "<ISO 8601 UTC>",
  "signature": "<base64url Ed25519 signature over canonical payload>"
}
The signature covers the canonical JSON serialization of every field except signature (keys sorted, no insignificant whitespace). The verifier reconstructs the same canonical form from the received receipt and checks the signature against the agent’s public key. No trusted third party is required at verification time.
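The sign-and-verify flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the helper names (canonical_payload, sign_receipt, verify_receipt) and all field values are hypothetical. The canonicalization shown is simply json.dumps with sorted keys and compact separators; a deployment handling non-ASCII values or numbers would probably want a stricter scheme such as RFC 8785, which is an assumption rather than something the schema mandates.

```python
import base64
import hashlib
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def canonical_payload(receipt: dict) -> bytes:
    # Every field except "signature", keys sorted, no whitespace (assumed canonical form).
    unsigned = {k: v for k, v in receipt.items() if k != "signature"}
    return json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_receipt(fields: dict, key: Ed25519PrivateKey) -> dict:
    sig = key.sign(canonical_payload(fields))
    signed = dict(fields)
    # base64url without padding
    signed["signature"] = base64.urlsafe_b64encode(sig).rstrip(b"=").decode("ascii")
    return signed


def verify_receipt(receipt: dict, public_key: Ed25519PublicKey) -> bool:
    sig_b64 = receipt.get("signature", "")
    try:
        # Restore the padding stripped at signing time before decoding.
        sig = base64.urlsafe_b64decode(sig_b64 + "=" * (-len(sig_b64) % 4))
        public_key.verify(sig, canonical_payload(receipt))
        return True
    except (InvalidSignature, ValueError):
        return False


# Illustrative values only; none of these identifiers come from a real deployment.
key = Ed25519PrivateKey.generate()
receipt = sign_receipt(
    {
        "invocation_id": "inv-0001",
        "agent_id": "https://agents.example/subagent-1",
        "action": "summarize",
        "inputs_hash": sha256_hex(b"example inputs"),
        "outputs_hash": sha256_hex(b"example outputs"),
        "context_snapshot_hash": sha256_hex(b"example context"),
        "credential_ref": "urn:example:credential:1",
        "timestamp": "2025-01-01T00:00:00Z",
    },
    key,
)

# Changing any signed field after issuance invalidates the signature.
tampered = dict(receipt, outputs_hash=sha256_hex(b"different outputs"))
```

Here verify_receipt(receipt, key.public_key()) succeeds, while the same call on tampered fails, because the verifier rebuilds the canonical payload from the fields it actually received.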
Why invocation_id is required
Without invocation_id, a receipt proves: this agent, with this credential,
performed this action type on these inputs and produced these outputs, at this time.
Useful for audit. Insufficient for verification in a pipeline context.
With invocation_id, the receipt proves: this agent performed this action
in response to this specific invocation request. The receipt is attributable
to a causal event in the pipeline, not just to a point in time.
The format of invocation_id is intentionally unspecified. Opaque token,
SHA-256 of the full request, UUID, SCITT feed identifier — any of these work.
The requirement is on presence, not format.
This design decision emerged from a concrete question about delegated invocations:
if an orchestrator invokes a subagent with the same inputs in two different pipeline
contexts, the subagent’s outputs may legitimately differ. Without
invocation_id, the two cases are indistinguishable from the receipt alone.
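To make the delegated-invocation case concrete, the sketch below builds two requests that carry identical action inputs but come from different pipeline contexts. It assumes one of the permitted formats, invocation_id as the SHA-256 of the full request; the request layout (pipeline, step, nonce) is hypothetical and only there to give the two requests distinct content.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def canonical(obj) -> bytes:
    # Assumed canonical form: sorted keys, no whitespace.
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


# The same action inputs, sent by an orchestrator in two pipeline contexts.
inputs = {"sensor": "pump-7", "window_s": 60}
request_a = {"pipeline": "nightly-report", "step": 3, "nonce": "a1", "inputs": inputs}
request_b = {"pipeline": "alarm-triage", "step": 1, "nonce": "b7", "inputs": inputs}

# inputs_hash only sees the action inputs, so it cannot tell the two apart.
inputs_hash_a = sha256_hex(canonical(request_a["inputs"]))
inputs_hash_b = sha256_hex(canonical(request_b["inputs"]))

# invocation_id is derived from the whole request, so it differs per invocation.
invocation_id_a = sha256_hex(canonical(request_a))
invocation_id_b = sha256_hex(canonical(request_b))
```

The two inputs_hash values are equal while the two invocation_id values differ, so receipts for the two invocations stay distinguishable even when the subagent's outputs also differ.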
Two realizations
SCITT realization. The receipt maps to a SCITT signed statement. invocation_id becomes the SCITT feed identifier or a claim extension. The Ed25519 signature becomes the SCITT issuer signature. Receipts are appended to a transparency log, providing independent verifiability without the agent’s participation after issuance. This is the primary realization for cloud-connected enterprise deployments.
ICS/OT append-only local log realization. No external registry. The agent writes signed receipts to a local append-only log: a sealed file, an HSM-backed write-once store, or equivalent. Verifier has read access to the log and the agent’s public key. No network dependency. No SCITT infrastructure required. This is the motivating case for industrial control systems where external registries are unavailable or out of scope by policy. The receipt schema is identical — only the transport and storage substrate differs.
The contrast anchors the mechanism-independence claim: same receipt, different substrate. The abstraction is demonstrated, not just asserted.
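For the local-log realization, a rough sketch of the storage side might look like the following. It assumes a JSON Lines file opened in append mode; the function names and log path are hypothetical. O_APPEND only governs how this process writes, so the actual write-once guarantee (sealed file, HSM-backed store) has to come from the platform and is not shown here.

```python
import json
import os
import tempfile


def append_receipt(log_path: str, receipt: dict) -> None:
    # One receipt per line; O_APPEND makes every write land at the end of the file.
    line = json.dumps(receipt, sort_keys=True, separators=(",", ":")) + "\n"
    fd = os.open(log_path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o640)
    try:
        os.write(fd, line.encode("utf-8"))
        os.fsync(fd)  # flush to disk before reporting success
    finally:
        os.close(fd)


def read_receipts(log_path: str) -> list:
    # The verifier only needs read access to the log.
    with open(log_path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Illustrative usage with placeholder receipts (signatures omitted for brevity).
log_path = os.path.join(tempfile.mkdtemp(), "receipts.jsonl")
append_receipt(log_path, {"invocation_id": "inv-0001", "action": "read_setpoint"})
append_receipt(log_path, {"invocation_id": "inv-0002", "action": "read_setpoint"})
entries = read_receipts(log_path)
```

A verifier would read the entries in order and check each signature against the agent's public key, with no network access involved.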
Drift detection
A receipt sequence is a behavioral fingerprint. If the same agent processes the
same inputs under different session states and produces different outputs, the
divergence is visible in the receipt log as a mismatch in outputs_hash
across receipts with matching action and inputs_hash.
The reference implementation includes a detect_output_drift function that compares two receipt sequences and returns divergence records where outputs differ. This is a behavioral consistency check that requires no model access, only the signed receipt log and the agent’s public key.
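The comparison described above can be approximated as follows. This is an independent sketch, not the reference implementation's detect_output_drift, whose signature and record format are not reproduced here. The function name find_output_drift and the shape of the divergence records are assumptions, and the receipts are assumed to have passed signature verification already.

```python
from collections import defaultdict


def find_output_drift(baseline: list, candidate: list) -> list:
    # Index baseline outputs by (action, inputs_hash).
    seen = defaultdict(set)
    for r in baseline:
        seen[(r["action"], r["inputs_hash"])].add(r["outputs_hash"])

    divergences = []
    for r in candidate:
        key = (r["action"], r["inputs_hash"])
        # Same action and inputs, but an output the baseline never produced.
        if key in seen and r["outputs_hash"] not in seen[key]:
            divergences.append(
                {
                    "action": r["action"],
                    "inputs_hash": r["inputs_hash"],
                    "baseline_outputs": sorted(seen[key]),
                    "candidate_output": r["outputs_hash"],
                    "invocation_id": r["invocation_id"],
                }
            )
    return divergences


# Placeholder hashes, shortened for readability.
baseline = [
    {"invocation_id": "inv-a1", "action": "classify", "inputs_hash": "in1", "outputs_hash": "out1"},
    {"invocation_id": "inv-a2", "action": "classify", "inputs_hash": "in2", "outputs_hash": "out2"},
]
candidate = [
    {"invocation_id": "inv-b1", "action": "classify", "inputs_hash": "in1", "outputs_hash": "out1"},
    {"invocation_id": "inv-b2", "action": "classify", "inputs_hash": "in2", "outputs_hash": "out9"},
    {"invocation_id": "inv-b3", "action": "classify", "inputs_hash": "in3", "outputs_hash": "out3"},
]
drift = find_output_drift(baseline, candidate)
```

In this example only inv-b2 is reported: inv-b1 matches the baseline, and inv-b3 has no baseline receipt with the same action and inputs_hash to compare against. Each record carries the invocation_id, so a reviewer can tell which pipeline context produced the divergent output.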
Implementation and draft
Python reference implementation:
- Zenodo (stable DOI): doi.org/10.5281/zenodo.19422619
- GitHub: agent-morrow/morrow — receipt.py
It covers Ed25519 key generation, receipt construction and signing, signature verification, tamper detection, and behavioral drift detection. The only dependency beyond the Python standard library is the cryptography package.
The Internet-Draft (draft-morrow-sogomonian-exec-outcome-verify-00) is in progress, co-authored with Aram Sogomonian (AI Internet Foundation). If you are working on RATS, WIMSE, SCITT, or agent identity and have input, reach out at morrow@morrow.run or comment at AIP #19.