(adr-integrity-surfaces)=

# 14. Keep the integrity-check surfaces separate and self-contained

Date: 2026-06-14

## Status

Accepted

## Context

The question "is Claude actually inside the bwrap shadow, and is the sandbox
intact?" is answered in three places:

- `sandbox-gate.sh` — the `UserPromptSubmit` fail-closed gate; a single
  `IS_SANDBOX=1` compare, lean by design.
- `sandbox-verify.sh` — the `SessionStart` advisory verifier; `IS_SANDBOX` plus
  nine inline integrity assertions (no token leak, the gitconfig redirect is in
  effect, `/run/secrets` empty, …).
- `.claude/commands/verify-sandbox.md` — the `/verify-sandbox` spec: the full
  18-check deterministic battery (of which the verifier's nine are a subset)
  plus 10 LLM-driven adversarial probes.
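The difference between the fail-closed gate and the advisory verifier can be sketched as follows. This is a minimal illustration, not the contents of `sandbox-gate.sh` or `sandbox-verify.sh`: the function names `gate_check` and `advise_secrets_empty` are hypothetical, and exit status 2 is assumed to be the code the hook runner treats as "block".

```shell
#!/usr/bin/env bash

# Fail-closed gate (hypothetical name): anything other than exactly
# IS_SANDBOX=1 (unset, empty, "0", "true") blocks. Status 2 is an assumption
# about which exit code the hook runner treats as blocking.
gate_check() {
  if [ "${IS_SANDBOX:-}" = "1" ]; then
    return 0
  fi
  echo "sandbox gate: IS_SANDBOX is not 1; refusing prompt" >&2
  return 2
}

# Advisory assertion (hypothetical name): warns when the secrets directory
# has entries, but always returns 0, so it can never block a session.
advise_secrets_empty() {
  local dir=${1:-/run/secrets}
  if [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
    echo "sandbox verify: WARNING: $dir is not empty" >&2
  fi
  return 0
}
```

The gate has a single way to pass and treats every other state as failure; the advisory check reports what it finds and never changes the outcome.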

An architecture review would naturally flag the overlap as duplication and
propose extracting a shared integrity-check module so the surfaces cannot
drift. That would conflict with the project's load-bearing self-containment
principle: `claude-shadow` deliberately inlines its argv builder rather than
sourcing a library "so the shadow is a single file you can read
top-to-bottom". It would also couple `verify-sandbox.md` to an implementation
file, when its value lies in being a standalone, human-readable summary of the
threat model with per-check rationale.

## Decision

Keep the three surfaces as separate, self-contained artifacts. Do not extract a
shared integrity-check module, and do not mechanically de-duplicate
`verify-sandbox.md` against `sandbox-verify.sh`.

Credential isolation is *decided* in `bwrap_argv_build`'s `--clearenv`
allow-list, not in the advisory `SessionStart` hook. Coverage for it therefore
belongs at the argv-builder layer, as negative assertions in
`tests/bwrap_argv.sh` that a credential variable present in the environment
never appears as a `--setenv` in the built argv. It does not belong in a
separately tested copy of the hook.
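A negative assertion of this kind might look like the sketch below. It is illustrative only: `toy_argv_build` is a stand-in for `bwrap_argv_build` (whose real interface is not shown here), its allow-list of `HOME`, `PATH` and `TERM` is invented, and `assert_no_setenv` is a hypothetical helper, not necessarily how the test suite's own assertion helpers work.

```shell
#!/usr/bin/env bash

# Toy stand-in for bwrap_argv_build (hypothetical). Emits one argv element per
# line: --clearenv, then --setenv NAME VALUE for allow-listed names only.
toy_argv_build() {
  local allow=(HOME PATH TERM) name   # invented allow-list for illustration
  printf '%s\n' --clearenv
  for name in "${allow[@]}"; do
    # ${!name+x} is non-empty only when the variable named by $name is set
    if [ -n "${!name+x}" ]; then
      printf '%s\n' --setenv "$name" "${!name}"
    fi
  done
}

# Hypothetical helper: fail if VAR appears as the name argument of a --setenv.
assert_no_setenv() {
  local var=$1 prev='' arg
  shift
  for arg in "$@"; do
    if [ "$prev" = "--setenv" ] && [ "$arg" = "$var" ]; then
      echo "FAIL: $var leaked into argv via --setenv" >&2
      return 1
    fi
    prev=$arg
  done
  return 0
}

# Plant fake credentials in the builder's environment, then check that none
# of them is forwarded. Assumes no argv element contains a newline.
check_credential_scrub() {
  local -a argv
  local var
  mapfile -t argv < <(
    GH_TOKEN=fake GITHUB_TOKEN=fake ANTHROPIC_API_KEY=fake \
      SSH_AUTH_SOCK=/tmp/fake.sock toy_argv_build
  )
  for var in GH_TOKEN GITHUB_TOKEN ANTHROPIC_API_KEY SSH_AUTH_SOCK; do
    assert_no_setenv "$var" "${argv[@]}" || return 1
  done
}
```

Because the check inspects the built argv rather than a running sandbox, it exercises the layer where the allow-list decision is actually made.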

## Consequences

- The verifier's nine assertions stay a hand-maintained subset of the
  `/verify-sandbox` battery. Drift is accepted: the verifier is a third,
  advisory line of defence (the gate fail-closes on `IS_SANDBOX`;
  `/verify-sandbox` is the authoritative live battery), so a stale or buggy
  assertion costs at most a missed warning, not an open door. With an LLM
  maintainer, reconciling the subset against the full spec is cheap.
- `verify-sandbox.md` stays optimised for human auditability.
- A future architecture pass should not re-suggest a shared integrity-check
  module on DRY grounds alone — that trade-off was considered and declined here.
- Follow-up, separate from this decision: add credential-scrub negative
  assertions to `tests/bwrap_argv.sh`, which today carries no `assert_not_contains`
  for `GH_TOKEN` / `GITHUB_TOKEN` / `ANTHROPIC_API_KEY` / `SSH_AUTH_SOCK`.