Why Anthropic and Vercel chose different sandboxes

/Metadata

Anthropic uses bubblewrap for Claude Code, gVisor for Claude web. Vercel uses Firecracker. Vercel also built just-bash — simulated bash in TypeScript, no OS at all.

Four different answers from teams that thought hard about the problem. All four are right.

The difference isn’t engineering skill. It’s constraints.

Four approaches

OS-level primitives. Linux has bubblewrap. macOS has seatbelt. These are lightweight — no containers, no VMs. You’re restricting what a process can access using kernel-level enforcement. Fast startup, minimal overhead, works anywhere.

Userspace kernels. gVisor intercepts syscalls and handles them in a Go program pretending to be a Linux kernel. Your container thinks it’s talking to an OS, but it’s talking to gVisor. Stronger isolation than containers, weaker than VMs. Works anywhere Docker runs.

MicroVMs. Firecracker boots a real VM in ~125ms with ~5MB memory overhead. True hardware-level isolation. The catch: needs KVM access, which means bare metal or nested virtualization. Operationally heavier.

Simulated. No real OS at all. just-bash is a TypeScript implementation of bash with an in-memory filesystem. Your agent thinks it’s running shell commands, but it’s all JavaScript. Zero syscall overhead, instant startup, works in the browser.

Who chose what

Anthropic (Claude Code CLI) uses OS-level primitives. They open-sourced it as sandbox-runtime — bubblewrap on Linux, seatbelt on macOS. No containers. Network traffic routes through a proxy that enforces domain allowlists. This makes sense for a CLI tool running on your laptop. You don’t want to install Docker just to use Claude Code.

Anthropic (Claude web) uses gVisor. I reverse-engineered this a few months ago — the runsc hostname, the custom init process, the JWT-authenticated egress proxy. When you’re running thousands of concurrent sandboxes in the cloud, gVisor’s balance of isolation and operational simplicity wins.

Vercel uses Firecracker. Their Sandbox product runs each execution in a microVM. They already operate Firecracker for their build infrastructure, so the operational complexity is amortized. For a managed platform selling isolation as a feature, the stronger guarantee matters.

Vercel (lightweight option) also built just-bash — a simulated bash environment in TypeScript with an in-memory filesystem. No real OS at all. For agents that just need to manipulate files and run simple commands, this avoids the overhead entirely. Worth exploring for lightweight use cases.

The trade-offs

Approach	Startup	Isolation	Ops complexity	When to use
OS-level (bubblewrap/seatbelt)	<10ms	Process-level	Low	CLI tools, local dev
gVisor	~500ms	Container+	Medium	Cloud workloads, multi-tenant
Firecracker	~125ms	VM-level	High	Managed platforms, paranoid workloads
Simulated (just-bash)	<1ms	Application-level	None	Simple file/text manipulation

How to pick

You’re building a CLI tool. Use OS-level primitives. Users won’t tolerate installing Docker. Anthropic’s sandbox-runtime is Apache-licensed and battle-tested.

You’re running agents in the cloud. Use gVisor. It works in standard Kubernetes, no special node configuration. The ~500ms cold start hides behind LLM inference latency anyway.

You’re a platform selling sandboxing. Consider Firecracker. The operational cost is worth it when isolation is your product. But only if you control the infrastructure.

Your agent just processes text and files. Consider a simulated environment like just-bash. No syscall overhead, no container startup, instant execution. Pair it with real sandboxing for anything that needs actual binaries.

The pattern

Everyone converged on the same insight: network isolation matters as much as filesystem isolation.

Anthropic’s sandbox-runtime routes traffic through a proxy. Their web sandbox uses JWT-authenticated egress. Vercel’s just-bash requires explicit URL allowlists for curl.

Disabling network entirely is too restrictive — agents need pip install, npm install, git clone. But allowing arbitrary network access is too dangerous — agents could exfiltrate data. The answer is a proxy with an allowlist.

This pattern appears in every serious sandboxing implementation I’ve seen. If you’re building your own, start here.

The sandbox landscape matured fast. A year ago, you had to figure this out yourself. Now there’s open-source code from Anthropic, managed infrastructure from Vercel, and clear patterns to follow.

Pick the approach that fits your constraints, don’t over-engineer, and ship.