14. Keep the integrity-check surfaces separate and self-contained
Date: 2026-06-14
Status
Accepted
Context
The question “is Claude actually inside the bwrap shadow, and is the sandbox intact?” is answered in three places:
- sandbox-gate.sh: the UserPromptSubmit fail-closed gate; a single IS_SANDBOX=1 compare, lean by design.
- sandbox-verify.sh: the SessionStart advisory verifier; IS_SANDBOX plus nine inline integrity assertions (no token leak, the gitconfig redirect is in effect, /run/secrets empty, …).
- .claude/commands/verify-sandbox.md: the /verify-sandbox spec, comprising the full 18-check deterministic battery (of which the verifier’s nine are a subset) plus 10 LLM-driven adversarial probes.
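To make "a single IS_SANDBOX=1 compare" concrete, here is a minimal sketch of what a fail-closed gate of that shape could look like. It is not the project's sandbox-gate.sh: the function names gate_check and gate_main are hypothetical, and the use of exit status 2 to signal "block" is an assumption about the hook runner.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a fail-closed gate; not the project's sandbox-gate.sh.

# gate_check: succeed only when IS_SANDBOX is exactly "1".
# An unset, empty, or different value counts as "not sandboxed".
gate_check() {
  [ "${IS_SANDBOX:-}" = "1" ]
}

# gate_main: return 0 when the check passes; otherwise print a reason on
# stderr and return 2.
# Assumption: the hook runner treats a non-zero status of 2 as "block".
gate_main() {
  if gate_check; then
    return 0
  fi
  echo "sandbox-gate: IS_SANDBOX is not 1; refusing to continue" >&2
  return 2
}
```

The point of the shape is that every path other than the exact match fails, so a missing or misspelled variable closes the gate rather than opening it.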
A natural architecture review flags the overlap as duplication and proposes
extracting a shared integrity-check module so the surfaces cannot drift. That
would conflict with the project’s load-bearing self-containment principle —
claude-shadow deliberately inlines its argv builder rather than sourcing a
library “so the shadow is a single file you can read top-to-bottom” — and it
would couple verify-sandbox.md, whose value is being a standalone,
human-readable summary of the threat model with per-check rationale, to an
implementation file.
Decision
Keep the three surfaces as separate, self-contained artifacts. Do not extract a
shared integrity-check module, and do not mechanically de-duplicate
verify-sandbox.md against sandbox-verify.sh.
Credential isolation is decided in bwrap_argv_build’s --clearenv
allow-list, not in the advisory SessionStart hook. Coverage for it therefore
belongs at the argv-builder layer — negative assertions in
tests/bwrap_argv.sh that a credential variable present in the environment
never appears as a --setenv in the built argv — not in a tested copy of the
hook.
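The negative assertion described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of tests/bwrap_argv.sh: build_argv_sketch is a hypothetical stand-in for bwrap_argv_build, the ALLOWED_VARS list is invented for the example, and the body of assert_not_contains is assumed (only its name appears in this record). The bwrap options --clearenv and --setenv are real bubblewrap flags.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; build_argv_sketch stands in for bwrap_argv_build.

# Assumed allow-list: only these variables may be forwarded into the sandbox.
ALLOWED_VARS=(HOME PATH TERM LANG)

# Fill the global ARGV array: start from --clearenv, then re-add only
# allow-listed variables that are set and non-empty in the caller's environment.
build_argv_sketch() {
  ARGV=(bwrap --clearenv)
  local name
  for name in "${ALLOWED_VARS[@]}"; do
    if [ -n "${!name:-}" ]; then
      ARGV+=(--setenv "$name" "${!name}")
    fi
  done
}

# assert_not_contains NEEDLE ARGS...: fail if any ARG is exactly NEEDLE.
assert_not_contains() {
  local needle=$1 arg
  shift
  for arg in "$@"; do
    if [ "$arg" = "$needle" ]; then
      echo "FAIL: argv contains $needle" >&2
      return 1
    fi
  done
}

# Negative test: plant credentials in the environment, build the argv, and
# check that no credential variable name appears as an argv element.
test_credentials_scrubbed() {
  local GH_TOKEN=x GITHUB_TOKEN=x ANTHROPIC_API_KEY=x SSH_AUTH_SOCK=/tmp/agent.sock
  export GH_TOKEN GITHUB_TOKEN ANTHROPIC_API_KEY SSH_AUTH_SOCK
  build_argv_sketch
  local var
  for var in GH_TOKEN GITHUB_TOKEN ANTHROPIC_API_KEY SSH_AUTH_SOCK; do
    assert_not_contains "$var" "${ARGV[@]}" || return 1
  done
}
```

Testing at this layer exercises the code that actually decides what crosses the sandbox boundary, whereas a test of the hook would only confirm that an advisory check notices a leak after the fact.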
Consequences
- The verifier’s nine assertions stay a hand-maintained subset of the /verify-sandbox battery. Drift is accepted: the verifier is a third, advisory line of defence (the gate fail-closes on IS_SANDBOX; /verify-sandbox is the authoritative live battery), so a stale or buggy assertion costs at most a missed warning, not an open door. With an LLM maintainer, reconciling the subset against the full spec is cheap.
- verify-sandbox.md stays optimised for human auditability.
- A future architecture pass should not re-suggest a shared integrity-check module on DRY grounds alone; that trade-off was considered and declined here.
Follow-up, separate from this decision: add credential-scrub negative assertions to
tests/bwrap_argv.sh, which today carries no assert_not_contains for GH_TOKEN / GITHUB_TOKEN / ANTHROPIC_API_KEY / SSH_AUTH_SOCK.