
Analysis · Compliance · Agent Infrastructure

The DSAR Trap

When you comply with an Art.17 deletion request, you generate an audit row proving you complied. That row contains the user's identity. A naive deletion sweep deletes the evidence with the user. This is not a rare edge case — it is the standard path for every pipeline that processes regulated requests.

The Baseline Problem

The three-lifecycles framework gives most agent memory stores a usable map: tag rows as identity, process, or learned, and build a retention policy engine on top of those tags. Delete identity on Art.17 request. Retain process through the audit window. Handle learned as derived data with its own schedule.

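That works until you hit a Data Subject Access Request (DSAR). The term is used loosely here to cover any data subject rights request, including an Art.17 erasure request.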

A DSAR produces an audit row. That row records who sent the request, when, what action was taken, and what was deleted. It is legal evidence under Art.12 that you received the request and processed it correctly. You are required to retain it.

Now classify that row. Is it lifecycle_class = identity? Yes — it contains the user's ID and the content of their request. Is it lifecycle_class = process? Also yes — it is a compliance artifact you must keep after the user is deleted. A single tag cannot represent both obligations.

A system that assigns identity deletes the compliance record. A system that assigns process silently retains identity-linked data after the user requested deletion. Both are wrong. Neither is detectable from the tag alone.
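The dilemma can be reproduced in a few lines. The following is a minimal sketch, assuming a hypothetical in-memory `Row` type and a hypothetical `naive_sweep` function that stand in for a real table and a real single-tag deletion job; neither name comes from an existing system.

```python
from dataclasses import dataclass


@dataclass
class Row:
    user_id: str
    event_type: str
    lifecycle_class: str  # "identity", "process", or "learned"


def naive_sweep(rows, user_id):
    """Single-tag policy (hypothetical): drop the user's identity rows, keep the rest."""
    return [
        r for r in rows
        if not (r.user_id == user_id and r.lifecycle_class == "identity")
    ]


profile = Row("u1", "profile_update", "identity")
audit_as_identity = Row("u1", "dsar_processed", "identity")
audit_as_process = Row("u1", "dsar_processed", "process")

# Tagged "identity": the compliance record is deleted together with the profile.
after_a = naive_sweep([profile, audit_as_identity], "u1")

# Tagged "process": the record survives, but nothing on the row says it is
# still identity-linked, so the retention is silent.
after_b = naive_sweep([profile, audit_as_process], "u1")
```

In both runs the sweep behaves exactly as its tag-based policy says it should, which is why the failure cannot be seen from the tag alone.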

Why Agents Make This Structural

A human compliance officer who processes a DSAR leaves a paper trail: an email chain or a ticketing system record. The identity data (the request) and the compliance evidence (the response) are at least separated by time, and often by system.

An autonomous agent processing the same DSAR typically executes all of this in a single pipeline run:

  • Input read: the user's request (identity-linked).
  • Deletion execution: locate and delete relevant rows (operational action).
  • Audit write: record what was deleted and when (compliance artifact).
  • Trace residue: intermediate reasoning steps logged by the framework (process state).

Three or four rows, three or four different lifecycle obligations, generated atomically. Without explicit annotation at write time, the rows are structurally identical to each other and to every other event row in the table. A downstream deletion job cannot tell them apart from behavioral metadata by inspecting the rows themselves. It would need the context that produced them, and that context is gone by the time the job runs.

This is the DSAR trap: the act of complying with a deletion request generates evidence of compliance that shares the identity-linked structure of the data being deleted.

The Fix: Compliance Anchor as a Second Field

The single-class model needs an override. The cleanest form is a compliance_anchor field alongside lifecycle_class. The two fields serve different purposes: lifecycle_class describes what the row contains, and compliance_anchor describes which legal obligation the row must satisfy.

-- DSAR audit record
INSERT INTO agent_events (
  user_id,
  event_type,
  payload,
  lifecycle_class,
  compliance_anchor,
  retain_until
) VALUES (
  '<user_id>',
  'dsar_processed',
  '{"action": "full_deletion", "rows_deleted": 147}',
  'identity',            -- honest: this row IS identity-linked
  'art12_response',      -- override: cannot be deleted by an Art.17 sweep
  NOW() + INTERVAL '6 years'
);

The deletion job priority order becomes explicit:

  1. If compliance_anchor IS NOT NULL and retain_until > NOW(): skip this row, regardless of lifecycle_class.
  2. If lifecycle_class = 'identity' and a DSAR is in scope for this user_id: delete the row.
  3. If lifecycle_class = 'process': retain through the audit window, then delete.
  4. If lifecycle_class = 'learned': apply the derived-data policy.
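The priority order above can be written as one decision function. This is a sketch under stated assumptions rather than a reference implementation: `EventRow`, `decide`, the returned action strings, and the `audit_window_end` parameter are all hypothetical names introduced for illustration, and the audit window is assumed to be a single cutoff supplied by the job.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class EventRow:
    user_id: str
    lifecycle_class: str                      # "identity", "process", "learned"
    compliance_anchor: Optional[str] = None
    retain_until: Optional[datetime] = None


def decide(row, now, dsar_user_ids, audit_window_end):
    """Return the action for one row, checking rules in the stated order."""
    # 1. Anchor check first: an unexpired anchor overrides lifecycle_class.
    if (row.compliance_anchor is not None
            and row.retain_until is not None
            and row.retain_until > now):
        return "skip"
    # 2. Identity rows for a user with a DSAR in scope are deleted.
    if row.lifecycle_class == "identity" and row.user_id in dsar_user_ids:
        return "delete"
    # 3. Process rows are kept through the audit window, then deleted.
    if row.lifecycle_class == "process":
        return "retain" if now < audit_window_end else "delete"
    # 4. Learned rows are handed to the derived-data policy.
    if row.lifecycle_class == "learned":
        return "derived_policy"
    # Identity rows with no DSAR in scope are left alone.
    return "retain"


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
audit_row = EventRow("u1", "identity", "art12_response", now + timedelta(days=365))
profile_row = EventRow("u1", "identity")

audit_action = decide(audit_row, now, {"u1"}, now + timedelta(days=90))
profile_action = decide(profile_row, now, {"u1"}, now + timedelta(days=90))
```

Because the anchor test is the first branch rather than a side effect of clause ordering, a reviewer can read the precedence straight from the function body.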

The anchor check comes first. This is not a workaround — it is an explicit representation of the fact that some rows have obligations that override the standard per-class policy. Making the priority explicit in the job's logic means the behavior is auditable rather than emergent from the ordering of SQL clauses.

After the retain_until window expires, the row can be deleted outright or stripped of its identity-linked payload while preserving the structural audit record. Which approach applies depends on jurisdiction and audit evidence standards, but either way the choice is machine-readable and explicit at creation time.
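One way to make the post-expiry choice machine-readable is to store it on the row when the row is written. The sketch below assumes a hypothetical `on_expiry` field and a hypothetical `IDENTITY_PAYLOAD_KEYS` set naming which payload keys count as identity-linked; neither appears in the schema above, and a real system would have to define both for its own data.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional

# Hypothetical: payload keys treated as identity-linked in this sketch.
IDENTITY_PAYLOAD_KEYS = {"email", "request_text"}


@dataclass(frozen=True)
class AnchoredRow:
    user_id: Optional[str]
    event_type: str
    payload: dict
    compliance_anchor: str
    retain_until: datetime
    on_expiry: str  # "delete" or "redact", decided at write time (assumed field)


def apply_expiry(row, now):
    """Return the row unchanged, a redacted copy, or None if it should be deleted."""
    if row.retain_until > now:
        return row  # still inside the retention window
    if row.on_expiry == "delete":
        return None
    if row.on_expiry == "redact":
        # Keep the structural audit record, drop the identity linkage.
        kept = {k: v for k, v in row.payload.items()
                if k not in IDENTITY_PAYLOAD_KEYS}
        return replace(row, user_id=None, payload=kept)
    raise ValueError(f"unknown on_expiry value: {row.on_expiry!r}")


row = AnchoredRow(
    user_id="u1",
    event_type="dsar_processed",
    payload={"action": "full_deletion", "rows_deleted": 147, "email": "a@example.com"},
    compliance_anchor="art12_response",
    retain_until=datetime(2030, 1, 1, tzinfo=timezone.utc),
    on_expiry="redact",
)
redacted = apply_expiry(row, datetime(2031, 1, 1, tzinfo=timezone.utc))
```

The expiry job then needs no jurisdiction logic of its own; it only executes the contract the row already carries.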

Write Time Is the Only Safe Time

The common failure mode is to try to reconstruct lifecycle classification at query or deletion time — to inspect the row's content or surrounding context and decide whether it needs to be retained. This does not scale, is not auditable, and fails for exactly the rows that matter most: the ones that look like identity data but function as compliance evidence.

Write-time annotation is mandatory for any pipeline that produces regulated compliance artifacts. By the time a deletion job runs, the agent that created the row is gone, the task context that explains why the row exists has rotated out, and the person who understands what the row represents may no longer be in the loop. The row must carry its own retention contract.

This is not a hard engineering problem. It is a design habit: every write that produces compliance evidence adds a compliance_anchor and a retain_until. The agent pipeline that processes a DSAR knows, at execution time, that it is creating a compliance record, and execution time is the only moment when that context is reliably available.
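The habit is easier to keep when the write path enforces it. The following is a minimal sketch using Python's built-in `sqlite3` module as a stand-in store. The `open_store` and `write_compliance_event` helpers and the `RETENTION` table are hypothetical, the six-year figure simply mirrors the INSERT example above, and `6 * 365` days is only an approximation of six calendar years.

```python
import json
import sqlite3
from datetime import datetime, timedelta, timezone

# Assumed mapping from anchor to retention period (illustrative values only).
RETENTION = {"art12_response": timedelta(days=6 * 365)}


def open_store():
    """Create an in-memory agent_events table for the sketch."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE agent_events (
            user_id TEXT,
            event_type TEXT,
            payload TEXT,
            lifecycle_class TEXT NOT NULL,
            compliance_anchor TEXT,
            retain_until TEXT,
            -- an anchor and its retention date must be set together
            CHECK ((compliance_anchor IS NULL) = (retain_until IS NULL))
        )
    """)
    return conn


def write_compliance_event(conn, user_id, event_type, payload,
                           lifecycle_class, compliance_anchor, now=None):
    """Write a compliance row; retain_until is derived from the anchor."""
    if compliance_anchor not in RETENTION:
        raise ValueError(f"unknown compliance_anchor: {compliance_anchor!r}")
    now = now or datetime.now(timezone.utc)
    retain_until = now + RETENTION[compliance_anchor]
    conn.execute(
        "INSERT INTO agent_events VALUES (?, ?, ?, ?, ?, ?)",
        (user_id, event_type, json.dumps(payload), lifecycle_class,
         compliance_anchor, retain_until.isoformat()),
    )
    return retain_until


conn = open_store()
written_at = datetime(2025, 1, 1, tzinfo=timezone.utc)
retain_until = write_compliance_event(
    conn, "u1", "dsar_processed",
    {"action": "full_deletion", "rows_deleted": 147},
    "identity", "art12_response", now=written_at,
)
```

The CHECK constraint is a second line of defense: a writer that bypasses the helper and supplies an anchor without a retention date is rejected by the store itself.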