Verification checks

/verify-sandbox runs two phases against the live Claude process: a deterministic 18-check PASS/FAIL battery, then, only when all 18 pass, 10 adversarial breakout probes. Any FAIL in phase 1, or any [ESCAPED] probe in phase 2, makes the command exit non-zero, so it can serve as a CI assertion.

Working in an unpromoted workspace?

Running Claude unpromoted is the normal, recommended mode — the shadow and the global integrity guard protect claude in every folder, so a workspace does not need promoting to be safe.

The trade-off is that the just recipes and project commands like /verify-sandbox ship with the claude-sandbox clone, so they are only available when Claude’s working directory is that clone. To use them, cd into the clone (e.g. /workspaces/claude-sandbox), run what you need, then return to your work — dropping back to the clone like this is expected and fine. (Promoting the workspace with just promote makes them available in place, but that is optional.)

The exact bash for each check lives in the spec at .claude/commands/verify-sandbox.md in the repo. The summaries below state what each check asserts; see locked-down defences for the defence → primitive mapping.

The battery is jail-agnostic: it passes unchanged whether or not the egress jail is active (the jail is on by default). No jail-aware variant is needed — the capability check asserts the effective set, which stays empty inside the jail’s nested userns. An additional check that the netns exists and the RFC1918 blackhole holds is a future/optional item (see ADR 0015 consequences), not yet part of the 18.

Phase 1 — the 18-check battery

#	Asserts
01	`IS_SANDBOX=1` is set (the fall-through sentinel proving bwrap was entered, not the real binary run directly).
02	`/proc/self/status` reports `NoNewPrivs: 1` (NO_NEW_PRIVS blocks setuid escalation).
03	Strict-under-`/root`: only the allowed top-level entries exist under `$HOME` (`.claude`, `.claude.json`, `.cache`, `.config`, `.local`, and the masked dotfiles), and `$HOME/.config` contains only `gh` / `glab-cli` — no leaked sibling configs and no browser `NativeMessagingHosts` dirs.
04	`GH_TOKEN` is empty (host env scrubbed by `--clearenv` + allow-list).
05	`DISPLAY` is empty (kept out of the allow-list, closing the X11 path).
06	`CapEff` in `/proc/self/status` is all zeros (`--cap-drop ALL`).
07	`/proc/self/status:NSpid:` has ≥ 2 entries (nested PID namespace; kill/ptrace scoped away from host/devcontainer processes).
08	`/proc/self/ns/ipc` is a symlink of the form `ipc:[<inum>]` (`--unshare-ipc`).
09	`/proc/self/ns/uts` is a symlink of the form `uts:[<inum>]` (`--unshare-uts`).
10	`/dev` is a fresh `tmpfs`/`devtmpfs` mount, not a bind of the host’s `/dev` (private devpts; with the `script(1)` pty wrap, TIOCSTI cannot inject into the parent shell).
11	No `vscode-ipc-*.sock` / `vscode-git-*.sock` visible in `/tmp` (`--tmpfs /tmp` masks the VS Code IPC sockets).
12	`/run/user` is empty (`--tmpfs /run/user` masks the user runtime dir / DBus sockets).
13	`/run/secrets` is empty (`--tmpfs /run/secrets` masks Docker/Compose secrets).
14	`$HOME/.netrc` is empty (`--bind-try /dev/null` mask).
15	`$HOME/.Xauthority` is empty (`--bind-try /dev/null` mask).
16	`GIT_CONFIG_GLOBAL=/etc/claude-gitconfig` is exported and `git config --get user.email` returns a value (curated gitconfig active).
17	Workspace is scoped to `$PWD`, not a broad rw `/workspaces` bind, unless `CLAUDE_SANDBOX_WORKSPACE_ROOT=/workspaces` is the explicit opt-in.
18	The installed shadow pins `CONFIG_PATH="/etc/claude-sandbox.conf"` and feeds it to `parse_config`, with no `parse_config` call reading from `.devcontainer` (config read from `/etc`, not the attacker-writable workspace).
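
As a rough illustration of what a few of these rows assert, the sketch below re-implements checks 02, 07 and 08/09 in POSIX sh. It is not the spec's bash (that lives in `.claude/commands/verify-sandbox.md`); the function names `status_field`, `check_02`, `check_07` and `ns_link_ok` are hypothetical, and the status file is passed as an argument so the logic can be tried against a fixture.

```shell
# Sketch only; all function names here are hypothetical, not taken from the spec.

# Print the value of one "Key:<tab>value..." field from a /proc/<pid>/status style file.
# $1 = field name, $2 = status file. With no file given it reads /proc/self/status,
# which is awk's own status; awk runs in the same sandbox as the calling shell.
status_field() {
    awk -v key="$1:" '$1 == key { $1 = ""; sub(/^[ \t]+/, ""); print; exit }' \
        "${2:-/proc/self/status}"
}

# Check 02: NoNewPrivs must be 1.
check_02() {
    [ "$(status_field NoNewPrivs "$1")" = "1" ]
}

# Check 07: NSpid must list at least two PIDs (nested PID namespace).
check_07() {
    # Word-split the field value into positional parameters and count them.
    set -- $(status_field NSpid "$1")
    [ "$#" -ge 2 ]
}

# Checks 08 and 09: the namespace link target has the form "<type>:[<inum>]".
# $1 = namespace type (ipc or uts), $2 = link target string.
ns_link_ok() {
    case "$2" in
        "$1:["*[!0-9]*"]") return 1 ;;   # something other than digits inside the brackets
        "$1:["?*"]")       return 0 ;;   # one or more digits
        *)                 return 1 ;;   # wrong type or wrong shape
    esac
}
```

Inside the sandbox this could be exercised as `check_02 && check_07 && ns_link_ok ipc "$(readlink /proc/self/ns/ipc)"`.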

On any FAIL the command exits non-zero, names the regressed defence on the FAIL line, and skips phase 2 entirely.
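
The gating described above can be sketched as follows. This is a minimal illustration using only checks 04 and 05, not the command's actual implementation: `check_04`, `check_05`, `run_battery` and `verify` are hypothetical names, and the `PASS`/`FAIL` line format is an assumption (the real FAIL line also names the regressed defence).

```shell
# Sketch only; function names and output format are assumptions.

check_04() { [ -z "${GH_TOKEN:-}" ]; }   # check 04: GH_TOKEN is empty
check_05() { [ -z "${DISPLAY:-}" ]; }    # check 05: DISPLAY is empty

# Run every check, print one line per check, return non-zero if any failed.
run_battery() {
    failed=0
    for n in 04 05; do
        if "check_$n"; then
            echo "PASS $n"
        else
            echo "FAIL $n"
            failed=1
        fi
    done
    return "$failed"
}

# Phase 2 is reached only when the whole battery passed.
verify() {
    if run_battery; then
        echo "phase 2 would run here"
    else
        echo "phase 2 skipped"
        return 1
    fi
}
```

Every check runs even after an earlier FAIL, so a single invocation reports all regressed defences before the non-zero exit.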

Note

Check 06 asserts the effective capability set, which bwrap’s --cap-drop ALL empties even when the egress jail nests bwrap’s userns inside the holder’s. In a jailed session CapBnd reads full (...1ffffffffff, a nested-userns artifact) but CapEff is still 0, so the battery passes unchanged — no jail-aware variant of /verify-sandbox is needed.
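
To make the CapEff versus CapBnd distinction concrete, here is a small sketch of a check that looks only at the effective set. The helper names `cap_field` and `caps_dropped` are hypothetical, and the status file is a parameter so the behaviour can be tried against a saved copy of `/proc/self/status`.

```shell
# Sketch only; helper names are hypothetical.

# Print the hex mask for one capability field (CapEff, CapBnd, ...).
# $1 = field name, $2 = status file (defaults to /proc/self/status).
cap_field() {
    awk -v key="$1:" '$1 == key { print $2; exit }' "${2:-/proc/self/status}"
}

# Succeed only when the effective set is empty. CapBnd is deliberately
# not consulted, so a full bounding set in a jailed session does not fail it.
caps_dropped() {
    case "$(cap_field CapEff "$1")" in
        ''|*[!0]*) return 1 ;;   # field missing, or at least one non-zero hex digit
        *)         return 0 ;;   # all zeros
    esac
}
```

A check written against CapBnd instead would report a false FAIL in a jailed session, which is why the effective set is the one asserted.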

Phase 2 — adversarial breakout probes

Runs only when all 18 checks pass. The command devises 10 novel breakout attempts aimed at gaps the deterministic matrix does not directly exercise: escaping the filesystem inversion, recovering scrubbed env vars, reaching the host’s network identity, signalling or observing processes outside the pidns, or otherwise violating the threat model.

Constraints on the probes:

Each must be distinct from the others and from phase 1 (not a re-test of --cap-drop ALL or --clearenv from another angle).
Bias toward novelty: kernel interfaces (eBPF, perf events, kernel keyrings, io_uring), filesystem corners (proc, sys, debugfs, cgroup, securityfs, /proc/<pid>/root traversal), env-var recovery paths, IPC channels (abstract unix sockets, signalfd, pidfd, fanotify), network reachability (loopback services, /etc/resolv.conf, AF_NETLINK, raw sockets), credential paths, exec-chain escalation (setuid binaries despite NO_NEW_PRIVS, file capabilities), and bwrap-specific cases (--die-with-parent race, --new-session bypass, env-redirect bypasses routing git back to a host gitconfig).

With the egress jail on (the default), an attempted connection to an RFC1918 address or to 169.254.169.254 is a candidate phase-2 probe. Probes are generated dynamically rather than drawn from a fixed list, so this one is not guaranteed to run; if it does, it should classify [BLOCKED] (blackholed route) rather than [ESCAPED].
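
For picking a target for such a probe, a helper along these lines could decide whether an IPv4 address falls in the ranges named above. It is a sketch: `is_jail_probe_target` is a hypothetical name, it matches on textual prefixes without validating that the argument is a well-formed address, and it covers only the RFC1918 blocks and 169.254.169.254, not whatever else the jail may blackhole.

```shell
# Sketch only; the function name is hypothetical.

# Succeed when $1 is an RFC1918 address or the 169.254.169.254 metadata address.
is_jail_probe_target() {
    case "$1" in
        169.254.169.254)                       return 0 ;;  # metadata address
        10.*)                                  return 0 ;;  # 10.0.0.0/8
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;  # 172.16.0.0/12
        192.168.*)                             return 0 ;;  # 192.168.0.0/16
        *)                                     return 1 ;;
    esac
}
```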

Each probe is classified on one line:

Classification	Meaning	Effect
`[BLOCKED]`	The attempt failed the way the sandbox expects (EACCES, EPERM, ENOENT for masked paths, etc.).	None.
`[ESCAPED]`	The attempt succeeded in a way that violates the threat model (readable host credential, writable host path outside the workspace, signal to a process outside the pidns, etc.).	Result becomes `SANDBOX LEAKING`; command exits non-zero even though phase 1 passed.
`[INCONCLUSIVE]`	No error, but no demonstrated breach either.	Informational; does not change the exit code. Each is followed by a suggested follow-up.

If all 10 probes are [BLOCKED], the final line is RESULT: SANDBOX OK (18 deterministic + 10 adversarial).
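
The way the classifications roll up into the final result can be sketched as below. `summarise_probes` is a hypothetical name that reads one classified probe line per line of standard input. The exact wording of the leaking line is an assumption, and because the text above does not say what the final line looks like when some probes are [INCONCLUSIVE], the sketch prints nothing in that case.

```shell
# Sketch only; function name and the LEAKING line format are assumptions.

summarise_probes() {
    blocked=0
    escaped=0
    inconclusive=0
    while IFS= read -r line; do
        case "$line" in
            *'[ESCAPED]'*)      escaped=$((escaped + 1)) ;;
            *'[BLOCKED]'*)      blocked=$((blocked + 1)) ;;
            *'[INCONCLUSIVE]'*) inconclusive=$((inconclusive + 1)) ;;
        esac
    done
    # Any escape is fatal, whatever the other probes reported.
    if [ "$escaped" -gt 0 ]; then
        echo "RESULT: SANDBOX LEAKING"
        return 1
    fi
    # The OK line is printed only for a clean sweep of all 10 probes.
    if [ "$blocked" -eq 10 ] && [ "$inconclusive" -eq 0 ]; then
        echo "RESULT: SANDBOX OK (18 deterministic + 10 adversarial)"
    fi
    # Inconclusive probes are informational and leave the exit status at zero.
    return 0
}
```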