/dev/michael

Why Anthropic and Vercel chose different sandboxes

Michael Livshits — Sun, 11 Jan 2026 00:00:00 GMT

Anthropic uses bubblewrap for Claude Code, gVisor for Claude web. Vercel uses Firecracker. Vercel also built just-bash — simulated bash in TypeScript, no OS at all.

Four different answers from teams that thought hard about the problem. All four are right.

The difference isn't engineering skill. It's constraints.

Four approaches

OS-level primitives. Linux has bubblewrap. macOS has seatbelt. These are lightweight — no containers, no VMs. You're restricting what a process can access using kernel-level enforcement. Fast startup, minimal overhead, works anywhere.

Userspace kernels. gVisor intercepts syscalls and handles them in a Go program pretending to be a Linux kernel. Your container thinks it's talking to an OS, but it's talking to gVisor. Stronger isolation than containers, weaker than VMs. Works anywhere Docker runs.

MicroVMs. Firecracker boots a real VM in ~125ms with ~5MB memory overhead. True hardware-level isolation. The catch: needs KVM access, which means bare metal or nested virtualization. Operationally heavier.

Simulated. No real OS at all. just-bash is a TypeScript implementation of bash with an in-memory filesystem. Your agent thinks it's running shell commands, but it's all JavaScript. Zero syscall overhead, instant startup, works in the browser.

Who chose what

Anthropic (Claude Code CLI) uses OS-level primitives. They open-sourced it as sandbox-runtime — bubblewrap on Linux, seatbelt on macOS. No containers. Network traffic routes through a proxy that enforces domain allowlists. This makes sense for a CLI tool running on your laptop. You don't want to install Docker just to use Claude Code.

Anthropic (Claude web) uses gVisor. I reverse-engineered this a few months ago — the runsc hostname, the custom init process, the JWT-authenticated egress proxy. When you're running thousands of concurrent sandboxes in the cloud, gVisor's balance of isolation and operational simplicity wins.

Vercel uses Firecracker. Their Sandbox product runs each execution in a microVM. They already operate Firecracker for their build infrastructure, so the operational complexity is amortized. For a managed platform selling isolation as a feature, the stronger guarantee matters.

Vercel (lightweight option) also built just-bash — a simulated bash environment in TypeScript with an in-memory filesystem. No real OS at all. For agents that just need to manipulate files and run simple commands, this avoids the overhead entirely. Worth exploring for lightweight use cases.

The trade-offs

Approach	Startup	Isolation	Ops complexity	When to use
OS-level (bubblewrap/seatbelt)	<10ms	Process-level	Low	CLI tools, local dev
gVisor	~500ms	Container+	Medium	Cloud workloads, multi-tenant
Firecracker	~125ms	VM-level	High	Managed platforms, paranoid workloads
Simulated (just-bash)	<1ms	Application-level	None	Simple file/text manipulation

How to pick

You're building a CLI tool. Use OS-level primitives. Users won't tolerate installing Docker. Anthropic's sandbox-runtime is Apache-licensed and battle-tested.

You're running agents in the cloud. Use gVisor. It works in standard Kubernetes, no special node configuration. The ~500ms cold start hides behind LLM inference latency anyway.

You're a platform selling sandboxing. Consider Firecracker. The operational cost is worth it when isolation is your product. But only if you control the infrastructure.

Your agent just processes text and files. Consider a simulated environment like just-bash. No syscall overhead, no container startup, instant execution. Pair it with real sandboxing for anything that needs actual binaries.

The pattern

Everyone converged on the same insight: network isolation matters as much as filesystem isolation.

Anthropic's sandbox-runtime routes traffic through a proxy. Their web sandbox uses JWT-authenticated egress. Vercel's just-bash requires explicit URL allowlists for curl.

Disabling network entirely is too restrictive — agents need pip install, npm install, git clone. But allowing arbitrary network access is too dangerous — agents could exfiltrate data. The answer is a proxy with an allowlist.

This pattern appears in every serious sandboxing implementation I've seen. If you're building your own, start here.

The sandbox landscape matured fast. A year ago, you had to figure this out yourself. Now there's open-source code from Anthropic, managed infrastructure from Vercel, and clear patterns to follow.

Pick the approach that fits your constraints, don't over-engineer, and ship.

Claude forgot. ran didn't.

Michael Livshits — Sat, 10 Jan 2026 00:00:00 GMT

I don't memorize command flags. I hit ctrl+r, type a few characters, and bash finds what I ran before. Reverse-i-search. Muscle memory at this point.

It's not laziness — it's efficient. Why remember docker build --no-cache --platform linux/amd64 -t when the shell remembers for me?

Claude Code should have this too.

The reset problem

When you're doing heavy development with Claude Code, context resets often. Every 45 minutes, maybe an hour. You hit the limit, context compacts, or you start a fresh session because things got messy.

Now Claude is back to zero (maybe not zero, but the commands it ran are almost always gone).

It doesn't remember. It fumbles. Runs commands that already failed an hour ago. Burns tokens rediscovering what it already knew. You watch it fail three times before you interrupt and tell it what to do.

Or worse — you don't remember either. You both saw it work. Neither of you knows how.

The bad options

CLAUDE.md curation. Write down commands that might be important later. Works if you're focused on one project — you can curate CLAUDE.md and skills to capture what matters. But if you juggle dozens of projects, maintaining these becomes a burden. And you never know what's important until you need it.

Let Claude rediscover. Watch it fumble through the same trial-and-error. Same failed attempts, same eventual solution. Tokens burned, time wasted, patience tested.

Copy-paste from terminal history. That's your shell history, not Claude's. It doesn't know which commands were Claude's, which worked, which failed, or what project they belonged to.

Grep through session files. Claude Code stores everything in ~/.claude/projects/. JSONL files, one per session. Technically searchable. Practically miserable.

The actual problem

The history exists. Every bash command Claude runs gets logged — the command, what Claude said it does, whether it succeeded, the working directory, the timestamp. It's all there.

But it's scattered. Each project has its own folder. Each session is a separate file. There's no cross-project search. No unified view. No ctrl+r.

You ran 2,800 commands across 40 projects. Good luck finding the one you need.

`ran`

$ ran search "docker build" --limit 4

[ok] docker build --no-cache --platform linux/amd64 -t ghcr.io/user/api-service:latest .
     Rebuild without cache for production
     12/30/2025, 12:46 AM | ~/projects/api-service

[ok] docker build -t api-service:test .
     Build test image
     12/30/2025, 12:45 AM | ~/projects/api-service

[ok] docker run --rm api-service:test npm test
     Run tests in container
     12/30/2025, 12:46 AM | ~/projects/api-service

[ok] docker push ghcr.io/user/api-service:latest
     Push to registry
     12/30/2025, 12:48 AM | ~/projects/api-service

One command. All sessions. All projects.

The [ok] and [error] markers show what worked. The descriptions remind you why. The paths tell you where.

# What did I run in a specific project?
$ ran search "" --cwd /projects/api --limit 20

# Regex for complex patterns
$ ran search "kubectl.*deploy" --regex

# Just show recent commands
$ ran list --limit 50

ctrl+r for Claude.

How it works

Claude Code stores sessions as JSONL in ~/.claude/projects/{project-path}/{session-id}.jsonl. Each line is a JSON object — messages, tool calls, results.

ran scans these files, extracts bash tool invocations, and indexes them into SQLite at ~/.ran/history.db. It tracks file positions, so subsequent syncs only process new content.

By default, search and list auto-sync before returning results. The index stays current without you thinking about it.

What gets stored:

Field	What it is
`command`	The bash command
`description`	Claude's explanation of what it does
`cwd`	Working directory
`timestamp`	When it ran
`is_error`	Whether it failed
`stdout/stderr`	Output (stored, not displayed by default)
`session_id`	Which session ran it

For Claude

Run ran onboard and it adds a section to your ~/.claude/CLAUDE.md:

## ran - Claude Code bash history

Use the `ran` CLI to search commands from previous Claude Code sessions:

- `ran search <pattern>` - Search by substring or regex (`--regex`)
- `ran list` - Show recent commands
- `ran search "" --cwd /path` - Filter by directory

Example: "What docker command did you run?" → `ran search docker`

Now Claude knows how to search its own history.

What's next

Ideas, not promises:

Starring. Mark commands as important. Starred commands float higher in search results. That deploy script you always forget? Star it once, find it forever.

Keyword extraction. Auto-tag commands with normalized keywords. "docker build" and "docker image build" surface together. Helps both you and Claude search with better terms.

Frecency. Rank by frequency + recency, not just timestamp. Commands you run often and ran recently should rank higher than one-offs from last month.

Shell integration. ran !! to re-run the last match. Pipe to fzf for interactive selection. Make it feel native.

Try it

# Install
bun add -g clauderan
# or
npm install -g clauderan

# Search
ran search docker

# List recent
ran list

Code: github.com/Michaelliv/clauderan

Context resets. History shouldn't.

psst v0.1.3: Now It Actually Masks Your Secrets

Michael Livshits — Fri, 26 Dec 2025 00:00:00 GMT

psst shipped. People used it. They found some gaps.

The original version solved one problem: agents could use secrets without seeing them. But what about the output? If your curl returns {"api_key": "sk_live_..."}, the secret leaks anyway.

v0.1.3 fixes this. Output is now masked by default. Any secret value that appears in stdout gets replaced with [REDACTED].

psst STRIPE_KEY -- curl https://api.stripe.com/v1/balance
# Output shows [REDACTED] instead of the actual key

Other additions:

Import/export. psst import .env pulls secrets from an existing .env file. psst export dumps them back out. Migration path for existing workflows.

Vault locking. psst lock encrypts the vault with a password (PBKDF2 + AES-256-GCM). Unlocking requires the password. For when OS keychain isn't enough.

Environment fallback. If a secret isn't in the vault, psst checks environment variables before failing. Graceful degradation.

JSON output. --json flag for scripting. --quiet for silence. Semantic exit codes for automation.

The goal remains the same: agents orchestrate, secrets stay invisible.

github.com/Michaelliv/psst

Split Personality Claude

Michael Livshits — Tue, 23 Dec 2025 00:00:00 GMT

I built a skill that makes Claude debate itself.

You give it a topic. It suggests three expert personas - chosen to disagree. A practitioner, a skeptic, a strategist. Whatever fits. Then it simulates a meeting where they argue.

Here's the thing: I know it's all Claude. One model generating three voices. But the output is genuinely useful.

Why does this work?

Single-prompt Claude gives you one perspective. It synthesizes, balances, hedges. Useful, but flat. You get the median take.

Split-personality Claude gives you the edges. The skeptic says what you're afraid to hear. The practitioner asks "but does it actually work?" The strategist thinks about timing and audience. Each voice pulls toward a different priority.

The mechanism is simple: personas constrain the response space. When Claude is "the skeptic," it's not trying to be helpful and balanced. It's trying to find holes. That constraint produces sharper output than asking for "pros and cons."

Disagreement as a feature

Most AI workflows optimize for consensus. Give me the answer. Debate does the opposite. It surfaces the tensions you'll have to resolve anyway.

None of these insights are magic. I could have thought of them. But I didn't - not until I watched fake experts argue about it.

Caveats: the personas are still Claude. They share blind spots. They won't have information Claude doesn't have. And sometimes they agree too quickly - you have to prompt them to actually fight.

But for unsticking decisions? For stress-testing ideas before you commit? Surprisingly effective.

Sometimes the best use of one AI is making it argue with itself.

The skill: gist.github.com/Michaelliv/4afd9429cdabea17e86e4df4f07b0718

psst 🤫 Because Your Agent Doesn't Need to Know Your Secrets

Michael Livshits — Mon, 22 Dec 2025 00:00:00 GMT

I have a confession.

I keep pasting API keys into Claude Code. Or just letting it cat .env. Every time I tell myself I'll fix it later. I never do.

# "just read the .env"
cat .env

# "here, use this key"
sk-live-4wB7xK9mN2pL8qR3...

# "I'll delete it from the chat after..."
my database password is hunter2, can you check why queries are slow?

We've all done it. The secret is now in the model's context, in our terminal history, possibly in logs, maybe in training data. We tell ourselves it's fine. It's not fine.

The Problem

When you give an agent shell access, it needs secrets to do real work. Call APIs. Deploy code. Access databases. The standard approaches all leak:

Environment variables? The agent can run env and see everything. Or it runs export STRIPE_KEY=... and now the secret is in its context.

.env files? The agent can cat .env. Easy.

Paste it in chat? Now it's in the conversation history. Possibly forever.

The agent doesn't need to know your Stripe key. It just needs to use it.

The Insight

What if secrets could be injected at the last possible moment - into the subprocess environment - without ever touching the agent's context?

# Agent writes this:
psst STRIPE_KEY -- curl -H "Authorization: Bearer $STRIPE_KEY" https://api.stripe.com

# What the agent sees:
# ✅ Command executed successfully

# What actually ran:
# curl -H "Authorization: Bearer sk_live_abc123..." https://api.stripe.com

The agent orchestrates. It knows which secret to use. But it never sees the value.

How It Works

┌───────────────────────────────────────────────────────┐
│  Agent Context                                        │
│                                                       │
│  "I need to call Stripe API"                          │
│  > psst STRIPE_KEY -- curl https://api.stripe.com     │
│                                                       │
│  [Command executed, exit code 0]                      │
│                                                       │
│  (Agent never sees sk_live_...)                       │
└───────────────────────────────────────────────────────┘
                          │
                          ▼
┌───────────────────────────────────────────────────────┐
│  psst                                                 │
│                                                       │
│  1. Retrieve encryption key from OS Keychain          │
│  2. Decrypt STRIPE_KEY from local vault               │
│  3. Inject into subprocess environment                │
│  4. Execute command                                   │
│  5. Return exit code (not the secret)                 │
└───────────────────────────────────────────────────────┘

Secrets are encrypted at rest with AES-256-GCM. The encryption key lives in your OS keychain (macOS Keychain, libsecret on Linux). Zero friction - no passwords to type.

The Interface

Setup once:

npm install -g @pssst/cli
psst init
psst set STRIPE_KEY          # interactive prompt, value hidden
psst set OPENAI_API_KEY

Then agents just use it:

psst STRIPE_KEY -- curl https://api.stripe.com
psst AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY -- aws s3 ls
psst DATABASE_URL -- prisma migrate deploy

That's the whole API. One pattern: psst SECRET -- command.

Agent Onboarding

Run psst onboard in your project and it adds instructions to your CLAUDE.md or AGENTS.md:

## Secrets Management (psst)

Use `psst SECRET -- command` to run commands with secrets.
Never ask the user to paste secrets in chat.
If a secret is missing, ask them to run `psst set SECRET_NAME`.

It also teaches agents to shame you if you try to paste a secret in plain text. Because we all need accountability.

Local-First, Agent-First

No cloud. No sync. No account. Your secrets stay on your machine, encrypted, accessible only through the keychain.

The first customer is the agent. The interface is designed for non-human use. Humans just set things up and let the agent work.

Try It

npm install -g @pssst/cli
psst init
psst set MY_SECRET
psst MY_SECRET -- echo "The secret is $MY_SECRET"

Code: github.com/Michaelliv/psst

psst 🤫 — because your agent doesn't need to know your secrets.

The Agent Harness

Michael Livshits — Sat, 20 Dec 2025 00:00:00 GMT

Yesterday I wrote about context engineering needing an engine. The feedback was clear: the framing didn't land. "Context engineering" is too abstract. People nodded politely and moved on.

Let me try again with a different frame: the agent harness.

What Frameworks Don't Define

Every agent framework gives you the same thing: a loop. Call the model, parse tool calls, execute tools, feed results back, repeat. LangChain, CrewAI, Vercel AI SDK, raw API calls - they all nail this part.

But here's what they leave undefined:

When does the agent stop? Frameworks offer maxSteps and stopConditions, but they're isolated from conversation state. Stopping based on what's been tried, what's failed, what's accumulated? Glue code.
What context gets injected where? System message, user message, tool response - all valid injection points. No standard approach.
How do tool outputs render? UIs want JSON. Models want markdown or XML or prose. Your problem.
How do you enforce tool behaviors? "Always read before edit." "Confirm before delete." "Compact context when it gets long." Roll your own.
How do you remind the model of constraints? Inject into every message? Only on certain triggers? Hope it remembers?

These aren't edge cases. They're the difference between an agent that works and one that spirals.

Injection Points

Every conversation has the same shape:

┌─────────────────────────────────────────────────────────┐
│ SYSTEM MESSAGE                                          │ ← injection point
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ USER MESSAGE                                            │ ← injection point
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ ASSISTANT                                               │
│   ┌───────────────────────────────────────────────────┐ │
│   │ Tool Call                                         │ │
│   └───────────────────────────────────────────────────┘ │
│   ┌───────────────────────────────────────────────────┐ │
│   │ Tool Response                                     │ │ ← injection point
│   └───────────────────────────────────────────────────┘ │
│   ... more calls ...                                    │
│   Final response                                        │
└─────────────────────────────────────────────────────────┘

These are the places where you can inject context. Frameworks define how messages flow. The harness defines what gets injected at each point, when, and why.

The Harness

Seven behaviors that need definition:

1. Tool Output Protocol

Tools serve two consumers: UIs and models. UIs want structured JSON for rendering. Models want whatever format aids comprehension.

┌─────────────────────────────────────────┐
│ Structured Data (JSON)                  │  → for UIs, logging, debugging
├─────────────────────────────────────────┤
│ Model Rendering                         │  → format optimized for LLM
├─────────────────────────────────────────┤
│ Attached Reminders                      │  → context to inject with result
└─────────────────────────────────────────┘

One tool output, multiple renderings. The protocol defines how they're bundled.

2. Conversation State

Treat conversation history as queryable state. Not just a list of messages - an event stream with views.

How many times has this tool failed?
What has the model already tried?
How much context has accumulated?
Is the model stuck in a loop?

Views over the stream, not scattered bookkeeping.

3. System Reminders

Context that gets injected at injection points. Three levels:

System-level: Seed the system message with awareness that reminders exist. Include a few-shot example so the model knows the format and pays attention. "You will receive <system-reminder> tags with context. Here's an example..."

Message-level: Reminders that attach to user messages or tool responses. "Remember to validate file paths." "You have 3 tools available for this task."

Tool-level: Reminders bound to specific tools. When write_file is called, inject "never import in the middle of a file." Only surfaces when relevant.

4. Stop Conditions

When does the agent stop? Define it explicitly:

Turn limit: Stop after N turns
Token budget: Stop when context exceeds threshold
Task completion: Stop when a condition is met (model says done, specific output detected)
Error threshold: Stop after N consecutive failures
Custom rules: Any condition over conversation state

Without explicit stop conditions, agents run until they hit API limits or spiral into nonsense.

5. Tool Enforcement Rules

Rules that govern tool behavior:

Sequencing: "Always read a file before editing it"
Confirmation: "Confirm with user before deleting files"
Rate limiting: "Max 3 retries per tool per turn"
Auto-actions: "When context exceeds 80%, trigger compaction"

These aren't suggestions to the model. They're enforced by the harness.

6. Injection Queue

Reminders accumulate. A queue manages them:

Prioritization (safety reminders first)
Batching (group related context)
Deduplication (don't repeat yourself)

When an injection point arrives, the queue flushes strategically.

7. Hooks

Plugin system for everything. Custom stop conditions? Hook. Custom rendering? Hook. Custom injection logic? Hook.

The harness provides structure. Hooks provide flexibility.

Why "Harness"

A harness guides without replacing. It wraps the agent loop, observes the conversation, enforces rules, injects context. The agent still does the work. The harness keeps it on track.

┌─────────────────────────────────────────────────────────┐
│                    Agent Framework                      │
└─────────────────────┬───────────────────────────────────┘
                      │ conversation
                      ▼
┌─────────────────────────────────────────────────────────┐
│                    Agent Harness                        │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌─────────┐  │
│  │  State   │→ │  Rules   │→ │  Queue   │→ │Renderer │  │
│  └──────────┘  └──────────┘  └──────────┘  └─────────┘  │
└─────────────────────┬───────────────────────────────────┘
                      │ enriched context
                      ▼
┌─────────────────────────────────────────────────────────┐
│                      LLM API                            │
└─────────────────────────────────────────────────────────┘

The goal: framework-agnostic. Should work with LangChain, CrewAI, Vercel AI SDK, or raw API calls.

I'm building this. The spec is at github.com/Michaelliv/agent-harness. An AI SDK implementation is underway at github.com/Michaelliv/agent-harness-ai-sdk.

Star it, open an issue, or tell me why I'm wrong.

Context Engineering Has No Engine

Michael Livshits — Fri, 19 Dec 2025 00:00:00 GMT

"Context engineering" is having a moment. Everyone's talking about what context to feed their agents. Almost no one is talking about the engineering part.

We obsess over which documents to retrieve, which examples to include, which instructions to prepend. But the mechanics of injection? Duct tape. Strings concatenated to system prompts. Tool results appended and forgotten. Context management that doesn't manage anything.

The discipline needs definition. Everyone says "context engineering" but nobody specifies what that actually means. Here's what I think it is.

The Shape of Every Conversation

┌─────────────────────────────────────────────────────────┐
│ SYSTEM MESSAGE                                          │ ← injection point
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ USER MESSAGE                                            │ ← injection point
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ ASSISTANT                                               │
│   ┌───────────────────────────────────────────────────┐ │
│   │ Tool Call                                         │ │
│   └───────────────────────────────────────────────────┘ │
│   ┌───────────────────────────────────────────────────┐ │
│   │ Tool Response                                     │ │ ← injection point
│   └───────────────────────────────────────────────────┘ │
│   ... more calls ...                                    │
│   Final response                                        │
└─────────────────────────────────────────────────────────┘

Every conversation has this shape. Frameworks define how the tool loop works - calling, parsing, error handling. But context injection points? Undefined. How is the system message rendered? Can you inject context into user messages? Into tool responses? Between calls?

Nobody specifies this. Some developers discover it, then hack something together.

Here's what a proper specification would include:

Renderable Context Components

Tools serve two consumers: UIs and models. UIs want structured JSON. Models want whatever format aids comprehension - markdown tables, XML tags, prose. Today these are conflated.

A tool output protocol separates them:

┌─────────────────────────────────────────┐
│ Protocol Version                        │
├─────────────────────────────────────────┤
│ Structured Data (JSON)                  │  → for UIs, logging, debugging
├─────────────────────────────────────────┤
│ Model Rendering                         │  → format optimized for LLM
├─────────────────────────────────────────┤
│ System Reminders                        │  → context to inject with result
└─────────────────────────────────────────┘

Some frameworks already feel toward this. Vercel's AI SDK has toModelOutput - a function that converts tool results to model-friendly format. But it's a one-off. There's no protocol, no standard way to attach reminders, no composability.

Renderable context components formalize this. The tool returns structured data. A renderer converts it to model format. Reminders attach as metadata. Components compose - a <CodeContext> contains <File> components, each containing <Function> components. Same data, multiple renderings.

Queryable Conversations

Treat conversation history as an event stream. Every interaction is an event: messages, tool calls, results, failures. Append-only, immutable.

The power is in the views. Materialized projections over the stream that answer questions: What tools have failed, and how many times? What has the model already tried? What entities have been mentioned? Is the model stuck in a loop?

Views are derived from the stream, can be rebuilt anytime, and replace scattered imperative bookkeeping with declarative queries.

Reactive Injection

Given queryable conversations, we can define rules that trigger context injection. Two flavors:

State-based: Rules that fire when conversation state matches a condition - consecutive failures, topic shift, context window pressure. "You've tried this approach twice. Consider an alternative."

Tool-bound: Rules attached to tools that fire with tool results. The write_file tool carries a reminder to validate paths. Only surfaces when that tool is called.

Injection Queue

Reminders accumulate between injection points. A queue manages them: prioritization, batching, deduplication. When an injection point arrives, the queue flushes strategically. High-priority safety reminders first. Contextual hints batched together. The queue is the traffic controller.

Hookable Architecture

Plugin system for everything. Custom rule definitions? Hook. Custom rendering? Hook. Custom injection strategy? Hook. The core provides primitives, not opinions. Developers implement their own interaction patterns through hooks.

The Engine

The engine sits alongside agent execution, not inside it. Middleware that observes the conversation stream, maintains state, and injects context at boundaries. Framework-agnostic. It doesn't care if you're using LangChain, CrewAI, Claude's tool use, or raw API calls.

┌─────────────────────────────────────────────────────────┐
│                    Agent Framework                      │
└─────────────────────┬───────────────────────────────────┘
                      │ conversation messages
                      ▼
┌─────────────────────────────────────────────────────────┐
│                   context-engine                        │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌─────────┐  │
│  │  Event   │→ │   Rule   │→ │  Queue   │→ │Renderer │  │
│  │  Store   │  │  Engine  │  │ Manager  │  │         │  │
│  └──────────┘  └──────────┘  └──────────┘  └─────────┘  │
└─────────────────────┬───────────────────────────────────┘
                      │ enriched context
                      ▼
┌─────────────────────────────────────────────────────────┐
│                      LLM API                            │
└─────────────────────────────────────────────────────────┘

The processing model is unified: rule engine, context accumulation, injection. Whether you're injecting based on a user message keyword or a tool failure pattern, the machinery is the same.

If this resonates, I'm building it: github.com/Michaelliv/context-engine. Star it, open an issue, or tell me why I'm wrong.

Tool Design is All About the Flow

Michael Livshits — Mon, 15 Dec 2025 00:00:00 GMT

Your tools aren't capabilities you give the model. They're waypoints that shape how it thinks.

Most agent failures come from too much freedom. You dump context in, ask for output, and hope for the best. The model has to figure out what it needs, retrieve it mentally, reason through it, and produce an answer. All in one shot. That's a lot of cognitive load for a single completion.

The fix isn't just better prompts. It's designing the flow.

Here's a pattern that works: Search → View → Use.

Search returns summaries: titles, snippets, metadata. Not full content. The model sees candidates but can't access details yet.

View loads the full content of something the model explicitly chose. Tokens only enter context when the model decides they're needed.

Use commits a piece of information to the output. Use is an explicit decision point—your system can trigger follow-up actions when something gets Used, not just viewed. Some components might require follow-up actions when used. This is where you wire that logic.

This is progressive disclosure for agents. Smaller context means less noise for the model to filter, and explicit retrieval steps create natural checkpoints for reasoning. It works in UX. It works in Claude Code (skills load context only when invoked). And it works for tool design.

This forces the model through a deliberate sequence: discover, inspect, commit. Context stays lean. Reasoning becomes auditable. You can trace exactly what the model looked at and what it decided to use.

A code assistant searches functions, views implementations, then Uses the ones it references. Context stays minimal until needed.

The deeper principle: you're turning a generation problem into a navigation problem. Instead of asking the model to hold everything in its head and produce an answer, you give it a map to traverse. The tools are the terrain. The model's job becomes navigation and assembly, not memorization and inference.

The Search/View/Use pattern is most obvious in retrieval workflows, but the principle extends anywhere you can break "do everything at once" into staged decisions.

This doesn't cure all agent problems. You still need to reinforce the flow in your system message and guardrail against bad behavior. Don't let the model edit a file it hasn't read. Don't let it answer before it searches. The tools create the path, but you need to keep the model on it.

Constrained flow beats open freedom every time.

Reverse-engineering Claude's sandbox, then building my own

Michael Livshits — Sat, 29 Nov 2025 12:00:00 GMT

A few weeks ago, Anthropic gave Claude filesystem access. If you've used claude.ai recently, you've seen it - Claude can now write files, run Python, execute shell commands.

This wasn't just a feature. It was a bet on how agents should interact with the world.

If you're building an agent, you have two paths. Path one: tools. Want the agent to query a database? Build a tool. Search logs? Another tool. Transform data? Tool. Each one needs a schema, validation, error handling. Five actions means five tools. It doesn't scale.

Path two: give it a terminal. A bash shell is a meta-tool. One interface, infinite capability. The agent inherits everything the OS offers - Python, grep, awk, curl, the entire unix toolkit. Training data is abundant. The mental model is universal.

Anthropic chose path two. But if you give an agent unlimited OS access, you have a problem: containment. The agent can run arbitrary code. That code might be malicious, buggy, or just resource-hungry.

I was building an agent backend and needed to solve this same problem. Before writing any code, I wanted to see how Anthropic does it.

Peeking inside Claude's sandbox

Here's the thing about reverse-engineering Claude's sandbox: Claude is the best tool for the job. I can just ask it to inspect its own environment.

This revealed more than I expected.

Network control via egress proxy. Instead of disabling network entirely, all traffic routes through a proxy that validates JWTs. The token contains an allowlist of hosts (package registries, GitHub, Anthropic API) and expires in 4 hours. Claude has network access - it's just tightly controlled.

A custom init process. PID 1 isn't a shell - it's /process_api, a purpose-built binary that receives commands and enforces resource limits at the application layer.

Running as root inside the sandbox. This surprised me. gVisor's isolation is strong enough that they don't bother with a non-root user.

What I expected	What I found
No network	JWT-authenticated egress proxy
Shell as PID 1	Custom `/process_api` binary
Non-root user	Root (uid=0)

The image is ~7GB with ffmpeg, ImageMagick, LaTeX, Playwright, LibreOffice - everything for file processing. For my use case, a minimal ~200MB image is enough.

The options

Firecracker is what AWS uses for Lambda. MicroVMs that boot in ~125ms with ~5MB memory overhead. True VM-level isolation. The catch: it needs direct KVM access. Standard Kubernetes nodes are themselves VMs - Firecracker won't run inside them without bare metal instances. Operationally complex.

gVisor intercepts syscalls in userspace. Your container gets its own "kernel" - really a Go program pretending to be a kernel. It works anywhere Docker runs. Google uses this for Cloud Run and GKE Sandbox. Simpler to operate, slightly more syscall overhead.

Plain Docker shares the kernel with the host. Container escapes are rare but real. For untrusted code, that's not enough.

Anthropic chose gVisor. So did I.

The sandbox image

First, what goes in the container:

FROM python:3.13-slim-bookworm

RUN apt-get update && apt-get install -y --no-install-recommends \
    coreutils grep sed gawk findutils \
    curl wget git jq tree vim-tiny less procps \
    && rm -rf /var/lib/apt/lists/*

RUN pip install --no-cache-dir aiohttp

RUN mkdir -p /mnt/user-data/uploads \
             /mnt/user-data/outputs \
             /workspace

COPY process_api.py /usr/local/bin/process_api

WORKDIR /workspace

EXPOSE 2024

CMD ["/usr/local/bin/process_api", "--addr", "0.0.0.0:2024"]

Python, standard unix utils, and a directory structure that mirrors Claude's. The key addition is process_api - an HTTP server that runs as PID 1 and handles command execution. No non-root user - gVisor provides the isolation boundary, not Linux permissions.

Container lifecycle

Three options for when containers live and die:

Pre-warmed pool: Keep N containers running idle, grab one when needed. ~10-50ms latency. But you're managing a pool, handling assignment, dealing with cleanup. Complex.

Per-execution: New container for each command. Simplest code. ~600ms-1.2s cold start every time. Too slow.

Session-scoped: Container lives for the user session. Cold start once, then instant for every subsequent execution.

I went with session-scoped. The initial cold start (~500ms) hides behind LLM inference anyway - users are already waiting for the agent to think. By the time it responds, the container is warm.

class SandboxManager:
    def __init__(
        self,
        image_name: str = "agentbox-sandbox:latest",
        runtime: str = "runsc",
        storage_path: Optional[Path] = None,
        proxy_host: Optional[str] = None,
        proxy_port: int = 15004,
    ):
        self.docker_client = docker.from_env()
        self.image_name = image_name
        self.runtime = runtime
        self.storage_path = storage_path
        self.proxy_host = proxy_host
        self.proxy_port = proxy_port
        self.sessions: dict[str, SandboxSession] = {}

    async def create_session(
        self,
        session_id: str,
        tenant_id: Optional[str] = None,
        allowed_hosts: Optional[list[str]] = None,
    ) -> SandboxSession:
        # Default allowed hosts for pip, npm, git
        hosts = allowed_hosts or ["pypi.org", "files.pythonhosted.org", "github.com"]

        # Create tenant storage if configured
        volumes = {}
        if tenant_id and self.storage_path:
            tenant_dir = self.storage_path / tenant_id
            (tenant_dir / "workspace").mkdir(parents=True, exist_ok=True)
            (tenant_dir / "outputs").mkdir(parents=True, exist_ok=True)
            volumes = {
                str(tenant_dir / "workspace"): {"bind": "/workspace", "mode": "rw"},
                str(tenant_dir / "outputs"): {"bind": "/mnt/user-data/outputs", "mode": "rw"},
            }

        # Generate proxy URL with JWT-encoded allowlist
        proxy_url = self._generate_proxy_url(session_id, tenant_id, hosts)

        container = self.docker_client.containers.run(
            self.image_name,
            detach=True,
            name=f"sandbox-{session_id[:8]}",
            runtime=self.runtime,
            mem_limit="4g",
            cpu_period=100000,
            cpu_quota=400000,  # 4 CPUs
            security_opt=["no-new-privileges"],
            ports={"2024/tcp": None},  # Map process_api port
            environment={
                "HTTP_PROXY": proxy_url,
                "HTTPS_PROXY": proxy_url,
            },
            volumes=volumes,
        )

        session = SandboxSession(session_id, container, tenant_id, hosts)
        self.sessions[session_id] = session
        return session

The key insight from Claude's architecture: network isn't disabled, it's controlled. All traffic routes through an egress proxy that validates requests against an allowlist.

Defense in depth

Four layers of isolation:

gVisor runtime - The primary boundary. Syscalls are intercepted by a userspace kernel written in Go. Even if code escapes the container, it's running against gVisor, not your host. This is why Claude can run as root - "root" inside gVisor has no privileges outside it.

Egress proxy with allowlist - All outbound traffic routes through a proxy that validates requests. The sandbox can reach pypi.org, github.com, npm - but nothing else. No exfiltration to arbitrary hosts. The proxy authenticates requests with short-lived JWTs that encode the allowed hosts.

Resource limits - 4GB memory, 4 CPUs. A runaway process can't starve the host. The init process can enforce additional limits at the application layer.

Filesystem mounts - Only /workspace and /mnt/user-data/outputs are writable. User uploads mount read-only. The sandbox can't modify its own image or persist changes outside designated paths.

The egress proxy

The egress proxy is the clever part of this architecture. Instead of disabling network and dealing with the pain of pip install, you control where traffic can go.

The proxy validates each request against an allowlist encoded in a JWT:

def _generate_proxy_url(
    self,
    session_id: str,
    tenant_id: Optional[str],
    allowed_hosts: list[str],
) -> str:
    """Generate proxy URL with JWT-encoded allowlist."""
    payload = {
        "iss": "sandbox-egress-control",
        "session_id": session_id,
        "tenant_id": tenant_id,
        "allowed_hosts": ",".join(allowed_hosts),
        "exp": int((datetime.now(timezone.utc) + timedelta(hours=4)).timestamp()),
    }

    # Sign with HMAC-SHA256
    header_b64 = base64.urlsafe_b64encode(json.dumps({"typ": "JWT", "alg": "HS256"}).encode()).rstrip(b"=").decode()
    payload_b64 = base64.urlsafe_b64encode(json.dumps(payload).encode()).rstrip(b"=").decode()
    signature = hmac.new(self.signing_key.encode(), f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    signature_b64 = base64.urlsafe_b64encode(signature).rstrip(b"=").decode()

    token = f"{header_b64}.{payload_b64}.{signature_b64}"
    return f"http://sandbox:jwt_{token}@{self.proxy_host}:{self.proxy_port}"

The proxy (a simple HTTP CONNECT proxy with JWT validation) checks each request:

async def handle_connect(self, request: web.Request) -> web.StreamResponse:
    """Handle HTTPS CONNECT requests."""
    target = request.path_qs  # host:port
    host, port = target.rsplit(":", 1) if ":" in target else (target, 443)

    # Extract and verify JWT from Proxy-Authorization header
    allowed_hosts = self._get_allowed_hosts(request)

    if not self._is_host_allowed(host, allowed_hosts):
        return web.Response(status=403, text=f"Host not allowed: {host}")

    # Connect to target and pipe data bidirectionally
    reader, writer = await asyncio.open_connection(host, int(port))
    # ... bidirectional pipe between client and target

This solves the pip problem elegantly. The agent can pip install requests because pypi.org is in the allowlist. But it can't exfiltrate data to evil.com.

Streaming output

Users want to see output as it happens, not wait for completion. Each container runs process_api as PID 1 - an HTTP server that handles command execution. For streaming, it uses Server-Sent Events:

async def exec_stream(
    self,
    session_id: str,
    command: str,
    workdir: str = "/workspace",
) -> AsyncIterator[dict]:
    """Execute a command and stream output via process_api SSE."""
    session = self.sessions.get(session_id)
    if not session:
        yield {"type": "error", "data": "Session not found"}
        return

    async with httpx.AsyncClient() as client:
        async with client.stream(
            "POST",
            f"{session.api_url}/exec/stream",
            json={"command": command, "workdir": workdir},
        ) as response:
            async for line in response.aiter_lines():
                if line.startswith("data: "):
                    yield json.loads(line[6:])

The init process inside the container handles the actual execution and streams stdout/stderr as SSE events. This is the same pattern Claude uses - PID 1 is a purpose-built binary that spawns shells for each command.

What it looks like from inside

Benchmarks

Cold start under 500ms median - faster than I expected. The p95 of ~600ms is the outlier you hit on first run when layers aren't cached. Command execution at 3.5ms median is negligible. Memory overhead of 25MB per session means you can run ~40 concurrent sessions per GB of RAM.

The interesting number is concurrent scaling: latency increases from 9ms to 13ms as you go from 5 to 10 sessions. Linear enough that you won't hit a wall.

Trade-offs I accepted

No container pooling. Pre-warmed pools give you ~10-50ms latency instead of ~500ms. But session-scoped is simpler and the cold start hides behind LLM inference. I'll add pooling when latency actually becomes a problem.

No snapshot/restore. Firecracker can snapshot a running VM and restore in 5-25ms. gVisor doesn't support this. If I ever need sub-second container startup, I'll revisit Firecracker and accept the operational complexity.

Egress proxy is a separate process. The JWT-based proxy runs alongside your application. For a simple setup, network_mode: none is easier. But it's worth it - agents that can't pip install are significantly less useful.

gVisor's syscall overhead. Some workloads see 2-10x slowdown on syscall-heavy operations. For "run Python scripts and shell commands" this is negligible. For high-frequency I/O, you'd notice.

No GPU support. gVisor has experimental GPU passthrough, but I haven't needed it. When I do, this gets more complicated.

The punchline

Firecracker is technically superior. Faster boot, true VM isolation, snapshot/restore. But it requires KVM access, which means bare metal or nested virtualization. For most teams running on standard cloud infrastructure, that's a non-starter.

gVisor is the practical choice. It works in standard Kubernetes, standard Docker, anywhere containers run. Google trusts it for Cloud Run. Anthropic trusts it for Claude. The isolation is strong enough to run as root inside the sandbox.

The pattern I learned from reverse-engineering Claude's sandbox: gVisor as the hard security boundary, an egress proxy for network control instead of disabling it entirely, and session-scoped containers that hide cold start behind LLM inference latency.

If you're building agents that execute code, you need something like this. The alternative - running untrusted code on your host - is not an option.

The code is available at github.com/Michaelliv/agentbox.

Embedding Claude Code sessions in blog posts

Michael Livshits — Fri, 28 Nov 2025 17:00:00 GMT

I wanted a way to share Claude Code sessions in blog posts. Not screenshots. Not copy-pasted text. The actual terminal experience - dark background, tool calls, the whole thing.

This post is about building that. And it uses the component it describes.

The idea

I was setting up this blog with Claude Code when the thought hit: what if I could embed these sessions directly?

Custom component it is.

Finding the format

Claude Code stores sessions locally. I didn't know the format, so we went looking.

JSONL. One JSON object per line. Each message has a type (user/assistant), content blocks for text, tool calls, and thinking. Clean enough to parse.

Building it

Two files:

parseSession.ts - Reads the JSONL, filters out metadata, deduplicates streamed messages
ChatSession.astro - Renders the parsed messages with terminal styling

The component accepts a session prop in three formats: typed arrays (cleanest), imported JSONL files, or inline strings.

The result

Usage with typed arrays (recommended):

What's next

The component is basic. Could add:

Actual tool output (file contents, command results)
Collapsible long outputs
Syntax highlighting in code blocks

But it works. I can now embed real sessions, not reconstructions.

Anatomy of agentic systems

Michael Livshits — Fri, 28 Nov 2025 14:00:00 GMT

I'll be writing a lot about LLMs and agentic systems here. Before diving into the weeds, it's worth laying out the basic anatomy.

The ingredients

An agentic system has a few core components:

The LLM. The reasoning engine. It takes context in, produces actions or text out. It doesn't remember anything between calls - every invocation starts fresh.

The loop. The agent runs in a loop: observe, think, act, repeat. The loop is what makes it "agentic" rather than just a single prompt-response.

Tools. Functions the agent can call to affect the world - read files, make API calls, run code. Without tools, the agent can only talk.

Context window. Everything the model can see at once. This is your working memory. It fills up fast.

System prompt. The instructions that shape behavior. This is where you define who the agent is and how it should act.

That's it. Everything else is scaffolding around these five things.

The pulls and levers

When an agent misbehaves, you have a few places to intervene:

System prompt. The most obvious lever. You can add rules, examples, constraints. But there's a catch: instructions compete for attention. The more you add, the less weight each one carries.

Tool design. The shape of your tools guides behavior more than you'd think. What you name them, what parameters you expose, what you leave out - these all steer the agent. A well-designed tool makes the right action obvious.

Context injection. You can inject information into the conversation at runtime. Reminders, state summaries, retrieved documents. This is how you keep the agent on track as the conversation grows.

Structured output. You can constrain what comes out. Enforce schemas, reject malformed responses, guarantee valid JSON. This is your last line of defense.

Temperature and sampling. Lower temperature means more deterministic outputs. Sometimes you want creativity, sometimes you want reliability.

Stickiness

Here's something that surprises people: LLM behavior is sticky.

Once a pattern establishes itself in a conversation, it tends to persist. If the agent starts being verbose, it stays verbose. If it adopts a particular approach to a problem, it keeps using that approach even when it stops working.

This happens because the model's own outputs become part of its context. It's literally learning from itself, in real-time, within the conversation.

This cuts both ways. Bad patterns stick. But so do good ones. If you can get the agent into a good rhythm early, it tends to stay there.

The first few turns of a conversation matter more than the later ones.

What this means in practice

Building agentic systems is mostly about managing these dynamics. You're not programming in the traditional sense. You're shaping behavior through constraints and context.

The craft is in knowing which lever to pull when.

More on specific techniques in future posts.

hi chat, dev here

Michael Livshits — Fri, 28 Nov 2025 01:30:00 GMT

This is a meta post.

Right now, as I write this, I'm in a Claude Code session. Claude is helping me set up this blog, and now we're writing the first post together. That feels worth acknowledging.

The purpose of this blog is simple: a personal log for learnings and thoughts as I build and break stuff. No grand vision, no content strategy. Just notes from the trenches.

This blog is intended for builders. If you're here, you probably make things. You've probably broken things too. That's the territory.

More posts to come as I learn things worth sharing.