
The architecture behind Claude Code's $1B run-rate

agents · tool-design · context-engineering

Claude Code hit $1B in run-rate revenue. Its core architecture? Four primitives: read, write, edit, and bash.

That sounds too simple. Most agent builders reach for specialized tools - one per object type, one per operation. They end up with dozens. Claude Code’s foundation is four primitives that compose into everything else.

The difference comes down to one asymmetry:

Reading forgives schema ignorance. Writing punishes it.

Once you see it, you can’t unsee it.

Reading is forgiving

Say you’re building an agent that needs to pull information from multiple sources. You model a few tools:

  • search(query) - find things across systems
  • get_details(id) - fetch full context on something
  • query(filters) - structured lookup

Three tools cover a lot of ground. The agent doesn’t need to know it’s hitting Slack’s API versus Jira’s REST endpoints versus your Postgres database. You abstract the differences:

  • Different APIs? Wrap them behind a unified interface.
  • Different response shapes? Normalize to a common structure.
  • Messy data? ETL your way out of it.

The agent can be naive about the underlying complexity. You absorb the mess in your infrastructure layer. Sources multiply, but your tool surface stays relatively flat.

Tractable work. Not trivial, but tractable.
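
A minimal sketch of that read layer in Python. The backends and field names here are illustrative assumptions, not a prescription: each adapter absorbs its API's quirks and returns the same flat shape, so the agent only ever sees one tool.

# read_layer.py - a sketch; the adapters and Result fields are illustrative
from dataclasses import dataclass

@dataclass
class Result:
    source: str     # "slack", "jira", "postgres", ...
    id: str
    title: str
    snippet: str

def search_slack(query: str) -> list[Result]:
    # call Slack's API here and map each hit into a Result
    return []

def search_jira(query: str) -> list[Result]:
    # call Jira's REST API here and map each issue into a Result
    return []

BACKENDS = [search_slack, search_jira]

def search(query: str) -> list[Result]:
    # the one tool the agent sees; the per-API mess stays behind it
    hits = []
    for backend in BACKENDS:
        hits.extend(backend(query))
    return hits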

Writing explodes

Now try the same approach with writes.

Here’s what a single create tool looks like:

{
  "name": "create_task",
  "parameters": {
    "type": "object",
    "required": ["title", "project_id"],
    "properties": {
      "title": {"type": "string"},
      "description": {"type": "string"},
      "project_id": {"type": "string"},
      "assignee_id": {"type": "string"},
      "status": {"enum": ["todo", "in_progress", "done"]},
      "priority": {"enum": ["low", "medium", "high", "urgent"]},
      "due_date": {"type": "string", "format": "date"},
      "labels": {"type": "array", "items": {"type": "string"}},
      "parent_task_id": {"type": "string"},
      "estimated_hours": {"type": "number"}
    }
  }
}

That’s one object. One create tool.

Now imagine your system has 10 object types: projects, tasks, users, comments, labels, attachments, workflows, notifications, permissions, integrations. Each with their own required fields, enums, and nested structures.

How many tools do you need?

  • 10 create tools (one per object type)
  • 10 update tools (schemas differ per object)
  • 1 delete tool (maybe you can share this one)

That’s 21 tools minimum. And you’re already making compromises.

Maybe you try to consolidate. Put all creates in one tool, all updates in another. Now your schema is massive - every field from every object type, most of which are irrelevant for any given call. The agent drowns in options.

Maybe you hide the schemas, let the agent figure it out. Now it guesses wrong constantly. Field names, required versus optional, valid values - all invisible, all error-prone.

And then there’s partial updates.

With reads, partial data is fine. You fetch what you need. With writes, partial updates mean modeling operations: set this field, unset that one, append to this array. You’re not just passing data anymore - you’re building a mini query language on top of your schema.

{
  "operations": [
    {"op": "set", "field": "status", "value": "done"},
    {"op": "unset", "field": "assignee"},
    {"op": "append", "field": "labels", "value": "urgent"}
  ]
}

Now multiply this by 10 object types. Your tool definitions become doctoral theses.

This is exactly what’s happening with MCP servers. Browse the ecosystem and you’ll find servers with 30, 40, 50+ tools - one for every object type, every operation, every edge case. The protocol is fine. The problem is structural: the moment you model writes as specialized tools, you’ve signed up for schema sprawl.

Reading scales with abstraction. Writing scales with domain complexity.

The more objects in your system, the more your write layer sprawls. There’s no ETL escape hatch. The agent isn’t consuming structure - it’s producing it. It needs to know the full shape, the constraints, the relationships.

There’s an escape hatch. But it requires rethinking what “write tools” even means.

The file system escape hatch

Model your writes as files.

Files are a universal interface. The agent already knows how to work with them. Instead of 21 specialized tools, you have:

  • read - view file contents
  • write - create or overwrite a file
  • edit - modify specific parts
  • list - see what exists

Four tools. Done.
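
A sketch of those four as plain functions over a workspace directory. The sandbox root and the literal string-replacement edit semantics are assumptions for illustration, not Claude Code's actual implementation.

# file_tools.py - a sketch, not Claude Code's implementation
from pathlib import Path

ROOT = Path("workspace")            # assumed sandbox directory

def read(path: str) -> str:
    return (ROOT / path).read_text()

def write(path: str, content: str) -> None:
    target = ROOT / path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content)      # create or overwrite

def edit(path: str, old: str, new: str) -> None:
    # modify a specific part via literal string replacement
    target = ROOT / path
    target.write_text(target.read_text().replace(old, new, 1))

def list_files(pattern: str = "**/*") -> list[str]:
    # see what exists under the workspace
    return [str(p.relative_to(ROOT)) for p in ROOT.glob(pattern) if p.is_file()]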

The schema isn’t embedded in your tool definitions - it’s the file format itself. JSON, YAML, markdown, whatever fits your domain. The agent already understands these formats. You’re not teaching it your API; you’re leveraging capabilities it already has.

Partial updates become trivial. That same task update - status, assignee, labels - is just:

# tasks/task-123.yaml
title: Fix authentication bug
status: done          # was: in_progress
# assignee: removed
labels:
  - auth
  - urgent            # appended

The agent edits the file. No operation modeling. No schema in the tool definition. The format is the schema.

And if you have bash, everything else comes free: move, copy, diff, validate, transform.
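
The bash primitive itself is barely a tool at all. A sketch of one possible wrapper, where the timeout and output truncation are assumptions:

# bash_tool.py - a sketch; timeout and output cap are assumptions
import subprocess

def bash(command: str, timeout: int = 60) -> str:
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=timeout
    )
    # return both streams so the agent sees errors, not just output
    return (result.stdout + result.stderr)[:10_000]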

Domain abstractions still make sense for reads. But writes? Files.

Borrow from developers

Files alone aren’t enough. You need guardrails.

Developers have been building guardrails for files for decades. Linters catch structural errors. Formatters normalize output. Static analysis catches semantic errors before they propagate. jq and yq transform and validate JSON and YAML. Schema validators enforce contracts.

The agent writes files. The tooling catches mistakes. You’ve decoupled “agent produces output” from “output is correct.”
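
For instance, the task file from earlier can be checked on every write with off-the-shelf validators. A sketch using PyYAML and jsonschema; the schema is loosely based on the earlier create_task definition and is illustrative only:

# validate_task.py - a sketch; the schema fields are illustrative
import sys
import yaml                                   # PyYAML
from jsonschema import ValidationError, validate

TASK_SCHEMA = {
    "type": "object",
    "required": ["title", "status"],
    "properties": {
        "title": {"type": "string"},
        "status": {"enum": ["todo", "in_progress", "done"]},
        "labels": {"type": "array", "items": {"type": "string"}},
    },
}

def check(path: str) -> None:
    with open(path) as f:
        data = yaml.safe_load(f)
    try:
        validate(instance=data, schema=TASK_SCHEMA)
    except ValidationError as err:
        # feed this back to the agent instead of accepting the write
        sys.exit(f"{path}: {err.message}")

if __name__ == "__main__":
    check(sys.argv[1])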

This isn’t code-specific. Any domain with structured data can adopt this pattern.

CLI tools and progressive disclosure

What about external systems? You still need to talk to Jira, deploy to AWS, update your database.

Use CLI tools. They’re self-documenting via --help.

$ jira issue create --help

Create a new issue

Usage:
  jira issue create [flags]

Flags:
  -p, --project string     Project key (required)
  -t, --type string        Issue type: Bug, Task, Story
  -s, --summary string     Issue summary (required)
  -d, --description string Issue description
  -a, --assignee string    Assignee username
  -l, --labels strings     Comma-separated labels
      --priority string    Priority: Low, Medium, High

The agent doesn’t need your Jira schema embedded in its tools. It runs --help, discovers the interface, and uses it. It’s the same Search → View → Use pattern that makes skills work: the agent finds the command, inspects the options, and executes.

Progressive disclosure. Context stays lean until the moment it’s needed. You’re not stuffing every possible schema into the system prompt - the agent pulls what it needs, when it needs it.
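
A sketch of that loop, assuming a jira binary on PATH and a hypothetical ask_model helper that turns the --help text plus the user's intent into an argument list:

# discover_then_execute.py - a sketch; ask_model is a hypothetical helper
import subprocess

def run(args: list[str]) -> str:
    out = subprocess.run(args, capture_output=True, text=True)
    return out.stdout + out.stderr

def ask_model(help_text: str, intent: str) -> list[str]:
    # call your LLM here: given the --help output and the user's intent,
    # return the flags to pass, e.g. ["-p", "AUTH", "-t", "Bug", "-s", "..."]
    raise NotImplementedError

def create_issue(intent: str) -> str:
    help_text = run(["jira", "issue", "create", "--help"])    # discover
    flags = ask_model(help_text, intent)                       # decide
    return run(["jira", "issue", "create", *flags])            # execute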

This is why well-designed CLI tools are better agent interfaces than REST APIs wrapped in function calls. CLIs are designed for humans operating without full context. The --help flag exists precisely because users don’t memorize every option.

Agents have the same constraint. They work better when interfaces reveal themselves on demand.

The industry is converging on this

Vercel learned this the hard way. Their internal data agent, d0, started with heavy prompt engineering, specialized tools, and carefully managed context. It worked, but was fragile and slow.

They stripped it down. Gave the agent a bash shell and direct file access. Let it use grep, cat, and ls to interrogate data directly.

The results:

  • 3.5x faster execution
  • 100% success rate (up from 80%)
  • 37% fewer tokens
  • 42% fewer steps

“Grep is 50 years old and still does exactly what we need,” wrote Andrew Qu, Vercel’s chief of software. “We were building custom tools for what Unix already solves.”

Anthropic is pushing the same direction. Their experimental “Ralph Wiggum” setup is essentially a bash while loop - give Claude a prompt file, let it iterate on its own work, capture everything in files and git history. In one test, it completed $50,000 worth of contract work for $297 in API costs.
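
The shape of that loop is roughly the following sketch of the idea, not the actual setup; the claude -p invocation and the commit-per-iteration behavior are assumptions.

# ralph_loop.py - a sketch of the idea, not Anthropic's actual setup
import subprocess

PROMPT = open("PROMPT.md").read()        # the standing instructions

while True:
    # let the agent take another pass over its own work
    subprocess.run(["claude", "-p", PROMPT], check=False)
    # capture everything in files and git history
    subprocess.run(["git", "add", "-A"], check=False)
    subprocess.run(["git", "commit", "-m", "iteration"], check=False)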

The pattern keeps emerging: simpler architectures, file-based state, unix primitives.

Why terminal agents work so well

This isn’t theoretical. It’s why terminal-based agents - Claude Code, Codex CLI, OpenCode, and others - are outperforming their GUI and API-wrapped counterparts.

They’re entirely file-based. Read files, write files, edit files. Run bash commands. When they need to interact with external systems - git, npm, docker, cloud CLIs - they use existing command-line tools.

No schema explosion. No tool proliferation. No operation modeling for partial updates.

The entire complexity of software engineering - millions of possible file types, frameworks, languages, configurations - handled by a handful of primitives that compose universally.

Anthropic isn’t just betting on this architecture - they’re acquiring the infrastructure to accelerate it. Their purchase of Bun, the JavaScript runtime, came alongside Claude Code hitting $1B in run-rate revenue. They’re not building custom agent tooling. They’re investing in faster file operations and CLI primitives.

Files and CLIs aren’t a workaround. They’re the architecture.