# Threat model

`claude-sandbox` exists to answer one question: *what can go wrong when a developer runs Claude Code inside a devcontainer, and which of those failures is this tool responsible for preventing?* This page explains the reasoning behind the boundary. For the hard, look-it-up tables — the exact defences and the exact exposures — see [locked-down defences](../reference/locked-down-defences.md) and [deliberately exposed](../reference/deliberately-exposed.md). For how the bwrap primitives fit together to enforce all this, see [architecture](architecture.md).

## Who and what we defend against

The adversary is not the developer. It is an **LLM-driven attack** riding on input the developer did not write: a hostile prompt, a hostile file Claude is asked to read, or a hostile tool result. Any of these can attempt to steer Claude's tools toward four goals:

- **Credential exfiltration** — reading host secrets (env vars, dotfiles, token stores, IPC sockets) and shipping them out.
- **Driving the host IDE** — reaching across the IPC bridges and runtime sockets that connect the devcontainer to the editor and the host desktop.
- **Privilege escalation inside the devcontainer** — setuid tricks, capability abuse, ptrace/kill against neighbouring processes, terminal injection.
- **Lateral network movement** — using the host's network reach to pivot to internal RFC1918 hosts, the `169.254.169.254` metadata endpoint, or lab devices reachable by bare IP with default credentials (EPICS IOCs, PMAC motion controllers — where a hostile session reaching a PMAC is a *safety* incident, not just an information one). This is the threat the default [egress jail](#the-egress-jail-and-the-native-sandbox) closes.

Each defence in the [locked-down table](../reference/locked-down-defences.md) maps one-to-one onto an *observed* path for one of those goals and the bwrap primitive that closes it.
The design does not chase hypothetical attacks; it closes the concrete exfiltration routes — environment variables, dotfiles, IPC and runtime sockets, X11, `TIOCSTI` terminal injection, setuid escalation — that an attacker-controlled Claude can actually reach.

A consequence of taking the *developer* off the threat list: enforcement targets *accidental* exposure, not a determined human deliberately dismantling their own sandbox. The integrity guard is built to survive Claude Code's own self-updates and stray `~/.claude/settings.json` edits, not to win a fight against the box's owner with root.

## Why each exposure is in or out of scope

The split between in-scope and out-of-scope is not arbitrary; it follows from what this tool *is*: a **credential-isolation tool**, not a general sandbox against arbitrary native code.

A defence is **in scope** when it closes a credential or host-control path that the sandbox can plug without breaking Claude's job. Those are exactly the rows of the lockdown table — env scrubbing, the strict-under-`/root` inversion, dropped capabilities, the PID/IPC/UTS namespaces, the masked IPC and runtime dirs. They cost nothing Claude needs.

A defence is **out of scope** when plugging it would either break Claude or exceed what a credential-isolation tool can honestly promise. The [out-of-scope table](../reference/deliberately-exposed.md) records each, but the rationale matters:

- **Workspace contents** are out of scope because Claude *has to* read your workspace to be useful. This is the one irreducible exposure — see the [caveat](#the-irreducible-workspace-visibility-caveat) below.
- **The container host kernel** is out of scope because a bwrap-aware kernel exploit is a different class of problem. This tool isolates credentials; it does not claim to contain arbitrary native code. The devcontainer host is the trust boundary, and keeping the kernel patched is the operator's job.
- **Lateral network movement to internal hosts** is now *in scope* and addressed by default: the egress jail ({ref}`adr-network-egress-jail`, on by default) runs Claude in a per-process network namespace that blackholes RFC1918, so a compromised session cannot pivot to internal LANs or lab devices. What remains *out of scope* is **internet-domain / exfil filtering** — restricting *which* outbound domains Claude reaches, or stopping a session POSTing data to a permitted destination. A hostname allowlist is Claude Code's native sandbox's job (`allowedDomains`); a DLP boundary belongs at the devcontainer edge. See [the egress jail and the native sandbox](#the-egress-jail-and-the-native-sandbox) below.
- **Non-standard credential paths** are out of scope as a guarantee because the installer can warn about odd mounts it sees at install time but cannot enumerate every custom bind. Auditing your devcontainer's `mounts` block is yours.

The throughline: the sandbox promises **credential isolation** and **lateral network isolation** against an LLM-driven attacker. Where a promise would still be dishonest — kernel exploits, internet-domain / exfil control, mounts the installer never saw — it is named as out of scope rather than implied.

## What is deliberately exposed, and why

A handful of paths are reachable from inside Claude *on purpose*, because locking them down would defeat the tool. The full list with modes lives in [deliberately exposed](../reference/deliberately-exposed.md); the rationale is consistent across them:

- The **workspace** is read-write because editing your project is the entire point.
- The **token stores** (`gh`, `glab-cli`) are bound read-write so Claude can push code — the single largest deliberate exposure, addressed under [PAT hygiene](#pat-hygiene-the-soft-underbelly).
- **Claude's own state** (`~/.claude`, `~/.claude.json`, caches) is bound so settings, skills, and the OAuth token survive across launches instead of being swallowed by the strict-under-`/root` tmpfs.
- The **curated gitconfig** and the host **system gitconfig** are exposed read-only, the latter neutralised for `git` itself via `GIT_CONFIG_SYSTEM=/dev/null`.
- **Internet egress** (`api.anthropic.com`, the forges, package registries) is reachable because Claude needs it — but only the internet, DNS, and explicitly `allow-ip`-listed devices: the egress jail ([below](#the-egress-jail-and-the-native-sandbox)) blackholes the internal RFC1918 network by default, so this exposure does not extend to lateral movement onto internal hosts.

The governing principle is *minimum necessary exposure with a forward-compatible default*: `~/.config/` keeps a strict two-entry allowlist because credentials live there by XDG contract, so a new credentialed tool is masked for free; `~/.local/share/` and `~/.cache/` are bulk-bound because they hold plugin trees and caches, not secrets. That trade-off and its failure mode (a tool that mis-files credentials under `~/.local/share/`) are detailed in [deliberately exposed](../reference/deliberately-exposed.md).

## The irreducible workspace-visibility caveat

This is the limitation worth stating plainly, because no amount of bwrap fixes it: **workspace contents are visible to Claude, and that is by design.**

The sandbox protects you against host-credential leaks through env vars, dotfiles, and IPC sockets. It does *not*, and cannot, hide what you have checked out into the workspace — Claude has to read it to do its job, so anything in the workspace is reachable from Claude's tools. The practical consequence is a rule, not a feature:

> Keep secrets outside the workspace.

Mount them via your devcontainer's `mounts` (for example into `~/.config/`, which sits behind the strict allowlist) rather than dropping a `.env` file full of production credentials at the workspace root and expecting it to be invisible. It will not be. The sandbox draws the credential boundary at the workspace edge; what you place inside that edge, you are choosing to expose.

## PAT hygiene: the soft underbelly

The deliberate read-write bind of the `gh` and `glab` token stores is the sandbox's softest point, and it is worth being explicit about why.

Everything else the lockdown closes — env vars, dotfiles, sockets — is *taken away* from Claude. The forge tokens are *handed to* Claude, because pushing code requires them. A compromised session can therefore use those tokens to push to any repository the PAT covers, modify CI workflows, or reach other repos in the same organisation. No bwrap primitive can distinguish a legitimate `git push` from a malicious one; they use the same token.

Because the sandbox cannot shrink that blast radius, the *token* must. The mitigation is scope discipline at the source, and the reasoning is blast-radius arithmetic:

- A **fine-grained, single-repo token** means a compromise reaches exactly the repo you are working on — not the org.
- A **short expiry** (7–30 days) means a leaked token dies on its own; re-auth costs seconds.
- **Omitting `workflow` scope** (unless Claude must edit GitHub Actions) keeps a compromise away from your CI; **no `admin:*` or org-wide write** keeps it off everything else.
- GitLab gets the equivalent: project-scoped tokens, `api` only if you need push, otherwise `read_repository` + `write_repository`.

The `just gh-auth` / `just glab-auth` helpers keep the token out of shell history, but they do **not** enforce scope — that part is irreducibly yours.
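Because scope discipline is left to the token's owner, a small pre-flight check can make the rules in the list above mechanical. The following is a minimal sketch, not part of `claude-sandbox`: the helper names `risky_scopes` and `expiry_ok` are hypothetical, the scope strings are assumed to be GitHub classic-token scope names, and the 30-day ceiling is simply the upper end of the range suggested above.

```python
# Hypothetical pre-flight check; nothing here is provided by claude-sandbox.
# Scopes the guidance above advises against unless there is a specific need.
RISKY_EXACT = {"workflow"}
RISKY_PREFIXES = ("admin:",)


def risky_scopes(scopes):
    """Return the granted scopes that widen the blast radius, sorted."""
    return sorted(
        s for s in scopes
        if s in RISKY_EXACT or s.startswith(RISKY_PREFIXES)
    )


def expiry_ok(days_valid, max_days=30):
    """True when the token lifetime is positive and within max_days."""
    return 0 < days_valid <= max_days
```

How the scope list and lifetime are obtained is deliberately left out; a token that passes this check can still cover more repositories than intended, so the single-repo rule needs checking at the forge itself.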
When a session genuinely does not need to push, the right move is to remove the exposure entirely rather than rely on a tight token: `CLAUDE_SANDBOX_NO_FORGE=1` in `remoteEnv` skips the `gh`/`glab` binds and strips the credential helpers from the generated gitconfig, so `git push` fails by design. The exact mechanism and how to set it are in [deliberately exposed](../reference/deliberately-exposed.md).

## The egress jail and the native sandbox

Credential isolation answers *what can a compromised session read?* The egress jail answers a second question — *what can it reach?* — and as of 2026-06-18 it is on by default.

The threat is **lateral movement, not exfiltration**: bwrap already hides the credentials, so the asset worth protecting is *network reach*. Without the jail, a prompt-injected session that shares the host network namespace can probe RFC1918, hit `169.254.169.254`, and — the incident that motivates this — reach lab devices with default credentials (EPICS IOCs, PMAC). Reaching a PMAC is a *safety* incident, not merely an information one.

The jail ({ref}`adr-network-egress-jail`) runs *only* Claude in a per-process network namespace beneath bwrap. A routing allowlist blackholes `10/8`, `172.16/12`, `192.168/16`, the connected subnet, and link-local `169.254/16`, leaving the internet, DNS, and the device IPs you list as `allow-ip` reachable — so Claude still works while a compromised session has nowhere internal to pivot.

It is **fail-closed**: if `/dev/net/tun`, pasta, or `unshare` is unavailable, `claude` refuses to launch rather than silently fall back to open egress. The escape hatch `CLAUDE_SANDBOX_EGRESS_JAIL=0` (env, or `egress-jail = 0` in `/etc/claude-sandbox.conf`) restores the older shared-host-netns world of {ref}`adr-network-egress-open`. Normal, non-Claude shells keep host networking untouched.
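The routing policy described above can be modelled in a few lines. This is a sketch of the *decision*, not of how the jail enforces it (which is routing inside a network namespace, not Python): the function `is_reachable` and its parameters are hypothetical, allow-list entries are assumed to be IPv4 addresses or CIDR blocks, the connected subnet is passed in because it depends on the host, and DNS handling is left out entirely.

```python
import ipaddress

# Ranges the text says are blackholed: RFC1918 plus link-local.
BLACKHOLED = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("169.254.0.0/16"),
]


def is_reachable(dest, allow_ips=(), connected_subnet=None):
    """Model whether one destination IP would be reachable (hypothetical)."""
    addr = ipaddress.ip_address(dest)
    # Explicitly allow-listed devices take precedence over the blackhole
    # ranges; a bare IP parses as a single-address network.
    for entry in allow_ips:
        if addr in ipaddress.ip_network(entry, strict=False):
            return True
    blocked = list(BLACKHOLED)
    if connected_subnet is not None:
        blocked.append(ipaddress.ip_network(connected_subnet, strict=False))
    # Anything not blackholed is treated as ordinary internet egress.
    return not any(addr in net for net in blocked)
```

Under this model a public address such as `8.8.8.8` is reachable, `169.254.169.254` is not, and an internal device such as `192.168.1.50` becomes reachable only when it appears in `allow_ips`, while its neighbours on the same subnet stay blocked.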
The operational recipe — adding the required `--device=/dev/net/tun`, allow-listing a device, or turning the jail off — is in [Configure the network egress jail](../how-to/network-egress-jail.md); the config keys are in [configuration](../reference/configuration.md).

### Meshing with Claude Code's native sandbox

This tool is not an alternative to Claude Code's own sandbox — they are composable layers covering different surfaces, and the strongest posture runs both.

- **`claude-sandbox` (this repo)** provides two things: **credential protection** (the bwrap bind model — env scrubbing, the strict-under-`$HOME` inversion, masked IPC/runtime sockets) and **sideways / lateral network isolation** (the egress jail blackholes RFC1918 so a compromised session cannot pivot to internal hosts or lab devices). It does *not* restrict *which* internet domains Claude reaches.
- **Claude Code's native sandbox** provides the complementary surface — **internet domain isolation** (an `allowedDomains` allowlist enforced by an SNI proxy), restricting outbound HTTPS to named hosts. It does *not* express bare-IP / UDP / dynamic-port lab-device traffic, and it does not provide this repo's credential-bind model.

The two mesh: credential isolation + lateral isolation (this tool) + internet domain allowlisting (native) is defence in depth across complementary surfaces — three layers, not three alternatives.

The cohort framing in {ref}`adr-network-egress-jail` makes the division concrete. **Cohort A** needs HTTPS to *named* hosts; Claude Code's native `allowedDomains` fits it (the dual-sandbox work is issue #33, still open). **Cohort B — this repo's users** — reach lab devices by **bare RFC1918 IP, over UDP, on dynamic ports** (EPICS Channel Access / pvAccess, PMAC); a hostname allowlist *cannot* express that, which is exactly why this tool's IP/CIDR egress jail exists.