MCP · GDPR · lifecycle_class

Your MCP Audit Log Is Also a DSAR Target

The standard advice for MCP deployments is to add audit logging: record every tool call. That's the right instinct for EU AI Act Article 12 compliance. What nobody mentions is that the same log row that proves your agent behaved correctly is also a GDPR deletion target when it contains the user's query. TTL retention policies don't resolve this. Write-time annotation does.

The setup everyone is building

The advice for MCP in production is now well-established: add an audit logging layer. Record who called which tool, with what parameters, at what time, and what came back. This creates an activity trail for debugging, security review, and regulatory compliance under frameworks like EU AI Act Article 12.

Most implementations then add a TTL retention policy: keep logs for 90 days, or one year, or for the duration of a regulatory hold. At the end of the window, delete them.

This works if the only obligation is compliance retention. It breaks when a user files a Subject Access Request (GDPR Article 15) or an erasure request (Article 17) before the retention window closes.

The collision

Consider a realistic MCP tool call: a user asks an agent to look up something personal — a health query, a document about them, a financial calculation involving their details. The agent calls a search tool via MCP. Your gateway logs the call, including the query parameters, which include or derive from the user's personal data.

That log row now simultaneously:

  • Must be retained under Art.12 (or equivalent) as evidence the agent operated within permitted scope, for the duration of the compliance window
  • Must be deleted under Art.17 if the user files an erasure request, because it contains their personal data

This is not a rare edge case. It's the default state for any agent system that handles user queries in a regulated context. The same log row is both your compliance evidence and your GDPR liability.

A TTL policy cannot resolve this because the TTL has no visibility into why the row exists. At deletion time, it doesn't know whether to keep the row for the compliance hold or delete it for the user. The decision should have been made at write time.
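To make the blind spot concrete, here is a minimal sketch of a plain TTL sweep. Everything in it is illustrative (the 90-day window, the `ttl_sweep` name, the `ts` field shape): the point is that the only input available at deletion time is the row's age.

```python
# Illustrative TTL sweep (hypothetical names). At deletion time the policy
# sees only the row and its age -- it has no record of whether the row is
# under a compliance hold or named in a pending erasure request.
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=90)

def ttl_sweep(rows, now=None):
    """Split rows into (kept, deleted) purely by age."""
    now = now or datetime.now(timezone.utc)
    kept, deleted = [], []
    for row in rows:
        ts = datetime.fromisoformat(row["ts"].replace("Z", "+00:00"))
        (deleted if now - ts > RETENTION else kept).append(row)
    return kept, deleted
```

Whatever the sweep decides, it decides for the wrong reason: age is a proxy for neither the compliance hold nor the erasure obligation.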

What write-time annotation looks like

The fix is to annotate each log row when you write it. The annotation records what kind of data the row contains, whether it has a compliance hold, and which data subjects it touches:

{
  "tool": "search",
  "params": {"query": "...", "user_id": "u-4829"},
  "ts": "2026-03-31T14:00:00Z",

  "lifecycle_class": ["compliance", "identity"],

  "compliance_anchor": {
    "regulation": "EU_AI_ACT_ART12",
    "retain_until": "2029-03-31"
  },

  "subject_chain": {
    "subjects": ["u-4829"],
    "downstream_processors": []
  }
}

lifecycle_class says what this row is: it's both compliance evidence and personal data tied to a subject. compliance_anchor sets the retention floor — this row cannot be deleted before retain_until regardless of a DSAR request. subject_chain links the row to the user so erasure sweeps can find it, and marks it for review if an Art.17 request arrives before the window closes.

With this annotation, your deletion system can give a clean answer to a DSAR:

  • Rows with lifecycle_class: ["identity"] and no compliance_anchor: deleted immediately
  • Rows with both: compliance hold overrides automatic deletion; row is flagged for legal review with a clear reason why
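The two rules above can be sketched as a deletion-time decision function. This is a hedged illustration, not part of any published schema tooling: `resolve_dsar` and the returned action names are hypothetical, while the field names follow the annotated log row shown earlier.

```python
# Sketch of the DSAR decision rules above. `resolve_dsar` and the action
# names are illustrative; field names follow the annotated log row example.
from datetime import date

def resolve_dsar(row, subject_id, today):
    """Decide what an erasure sweep should do with one annotated log row."""
    subjects = row.get("subject_chain", {}).get("subjects", [])
    if subject_id not in subjects:
        return "not_applicable"       # row does not touch this subject
    anchor = row.get("compliance_anchor")
    if anchor is None:
        return "erase_now"            # identity data with no retention floor
    if today >= date.fromisoformat(anchor["retain_until"]):
        return "erase_now"            # hold has expired; erasure proceeds
    return "flag_for_review"          # hold overrides deletion, reason recorded
```

Run against the example row from earlier, an erasure request from u-4829 in April 2026 resolves to "flag_for_review", and the same request after retain_until resolves to "erase_now".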

That's the audit trail that survives adversarial review. Not just "we logged what the agent did" but "we handled the resulting data correctly, and here's the timestamped evidence."

Where to add it in an MCP architecture

An MCP gateway or logging middleware is the right layer. The gateway sits between the agent and the tools: it sees every call in one place and has the session context to classify it. At write time it can apply a classification rule based on tool metadata (which tools handle personal data), session context (is this a user-scoped session or a system process?), and parameter inspection (do the parameters contain PII fields?).
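As a sketch of what that classification rule could look like in middleware (every name here, including `annotate`, `TOOL_HANDLES_PII`, and the session fields, is an assumption for illustration, not an actual MCP gateway API):

```python
# Hypothetical write-time classifier for a gateway's logging middleware.
# TOOL_HANDLES_PII and PII_PARAMS stand in for per-deployment tool metadata.
TOOL_HANDLES_PII = {"search", "fetch_document"}   # tools that touch personal data
PII_PARAMS = {"query", "user_id", "email"}        # parameter fields to inspect

def annotate(call, session):
    """Attach lifecycle annotations to one tool-call log row at write time."""
    row = dict(call)  # {"tool": ..., "params": ..., "ts": ...}
    classes = ["compliance"]  # every call is retained as audit evidence
    touches_pii = (
        call["tool"] in TOOL_HANDLES_PII
        and any(k in call["params"] for k in PII_PARAMS)
        and session.get("scope") == "user"        # skip system-level processes
    )
    if touches_pii:
        classes.append("identity")
        row["subject_chain"] = {
            "subjects": [session["subject_id"]],
            "downstream_processors": [],
        }
    row["lifecycle_class"] = classes
    row["compliance_anchor"] = {
        "regulation": "EU_AI_ACT_ART12",
        "retain_until": session["retention_horizon"],
    }
    return row
```

The design choice that matters is that `annotate` runs in the write path, so every row carries its classification before any retention or erasure logic ever sees it.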

A gateway that adds lifecycle_class, compliance_anchor, and subject_chain at log-write time produces logs that are auditable under both Art.12 and Art.17 without any post-hoc triage. Retroactive classification — assigning lifecycle classes when a DSAR or audit arrives — is expensive and legally fragile for the same reason a will written after the funeral doesn't prove intent.

The broader pattern

MCP tool call logs are the most visible instance of a problem that appears throughout AI agent pipelines: agent-generated records span multiple legal lifecycle classes simultaneously. RAG embeddings of user documents, memory entries in long-running agent sessions, and A2A protocol logs in multi-agent pipelines all face the same structure. In each case, a record may simultaneously be process data, compliance evidence, and personal data subject to erasure.

The lifecycle_class schema (v0.4) covers all of these as first-class targets. It includes a JSON Schema validator and interaction rules for erasure cascades and summarization restrictions, and it is designed to be embedded into existing logging pipelines without changing the primary data model. It is MIT licensed and open for implementation feedback via GitHub issues.