Skills, forks, and self-surgery: how agent harnesses grow


Every agent harness starts with the same four tools: read, write, edit, bash. How you extend that harness determines everything - safety, agency, complexity.

I’ve been studying three harnesses that take genuinely different approaches to extensibility: Claude Code, NanoClaw, and Pi. Each one makes a bet on where complexity should live - in the harness, in the wrapper, or in the agent itself.

Claude Code: composition over specialization

Claude Code extends through three mechanisms: skills (lazy-loaded instruction files), MCP (server-based tool integration), and hooks (lifecycle event handlers).

The design principle is progressive disclosure. Skills are markdown files that only load when the agent decides they’re relevant. Context stays lean until it’s needed. MCP servers add external tools without bloating the core.
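To make progressive disclosure concrete, here is a minimal sketch of the idea, not Claude Code's actual loader. The `SkillRegistry` class, its method names, and the `release-notes` skill are all hypothetical. The point is only that a short name and description are always visible, while the full instruction body is read the first time the agent asks for it.

```typescript
// Hypothetical sketch of progressive disclosure; names are illustrative only.
interface SkillSummary {
  name: string;
  description: string; // short text that is always in context
}

interface Skill extends SkillSummary {
  loadBody: () => string; // full instructions, fetched only on demand
}

class SkillRegistry {
  private skills = new Map<string, Skill>();
  private loaded = new Map<string, string>();

  register(skill: Skill): void {
    this.skills.set(skill.name, skill);
  }

  // Cheap index shown on every turn: names and descriptions, no bodies.
  index(): SkillSummary[] {
    return Array.from(this.skills.values()).map(({ name, description }) => ({
      name,
      description,
    }));
  }

  // Called when the agent decides a skill is relevant.
  // The body is loaded once and cached for later calls.
  activate(name: string): string {
    const cached = this.loaded.get(name);
    if (cached !== undefined) return cached;
    const skill = this.skills.get(name);
    if (!skill) throw new Error(`Unknown skill: ${name}`);
    const body = skill.loadBody();
    this.loaded.set(name, body);
    return body;
  }

  loadedCount(): number {
    return this.loaded.size;
  }
}

// Track how often the (pretend) skill file is actually read.
const bodyReadLog: string[] = [];
function bodyReads(): number {
  return bodyReadLog.length;
}

const registry = new SkillRegistry();
registry.register({
  name: "release-notes",
  description: "Draft release notes from merged changes",
  loadBody: () => {
    bodyReadLog.push("release-notes");
    return "Release notes skill: 1. Collect merged changes. 2. Group by area.";
  },
});
```

Listing `registry.index()` costs a line or two of context per skill. Only `registry.activate("release-notes")` pays for the full body, and only once.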

Hooks are the most interesting mechanism. They fire at 17 different lifecycle events - from SessionStart to PreToolUse to Stop to WorktreeCreate. A hook can be a shell command, an LLM prompt, or a full agent with tool access that spawns to verify conditions. A PreToolUse hook can block destructive commands before they execute. A Stop hook can spawn a subagent that reads files and runs tests to verify the task is actually done before Claude finishes. They can run async in the background, match on regex patterns, and return structured decisions. This isn’t “before/after” middleware - it’s a full event system for the agentic loop.
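As a rough illustration of the PreToolUse case, the decision logic of a blocking hook might look like the sketch below. The payload shape (`tool_name`, `tool_input.command`), the `Decision` type, and the deny-list patterns are assumptions for illustration, so check them against the hooks documentation before relying on them. The sketch deliberately leaves out how the decision is reported back to the harness.

```typescript
// Assumed payload shape for a pending tool call; field names are illustrative.
interface ToolCall {
  tool_name: string;
  tool_input: { command?: string };
}

interface Decision {
  allow: boolean;
  reason?: string;
}

// Hypothetical deny-list. A real policy would be broader and tested.
const DESTRUCTIVE_PATTERNS: RegExp[] = [
  /\brm\s+-[a-zA-Z]*r[a-zA-Z]*f|\brm\s+-[a-zA-Z]*f[a-zA-Z]*r/, // rm -rf, rm -fr
  /\bgit\s+push\b.*--force\b/, // force pushes
  /\bmkfs(\.\w+)?\b/, // filesystem formatting
];

// Only shell commands are inspected; every other tool call passes through.
function decide(call: ToolCall): Decision {
  if (call.tool_name !== "Bash") return { allow: true };
  const command = call.tool_input.command ?? "";
  const hit = DESTRUCTIVE_PATTERNS.find((pattern) => pattern.test(command));
  return hit
    ? { allow: false, reason: `Blocked by pattern ${hit.source}` }
    : { allow: true };
}
```

Pattern matching like this is a coarse filter, not a sandbox. It catches the obvious cases and is cheap enough to run on every tool call.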

The combination is powerful but guarded. You get permissions, safety checks, and team coordination, while the primitives stay composable.

NanoClaw: extend the wrapper, not the harness

NanoClaw can’t extend Claude Code directly. Claude Code is closed source. That constraint forced an interesting solution: extend the layer around the harness instead. You get no actual control over the harness itself, but since NanoClaw runs Claude Code in a container, it supports everything Claude Code supports - skills, MCP, hooks, all of it.

NanoClaw is roughly 500 lines of TypeScript that manages containers, messaging, IPC, and task scheduling. When you run /add-telegram, it doesn’t load a plugin. It teaches Claude Code how to rewrite src/channels/telegram.ts in the wrapper itself.

The extension model is fork-first. You fork, you diverge, your fork becomes uniquely yours. Contributions aren’t PRs - they’re skills that describe transformations. The wrapper is small enough that Claude Code can reliably modify the entire orchestration layer in one shot.

IPC is filesystem-based. Write JSON to data/ipc/{folder}/messages/, the wrapper polls every second. No gRPC, no message queues. Debuggable with cat.
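The pattern is simple enough to sketch in a few lines. This is a minimal illustration of filesystem IPC, not NanoClaw's code. The `IpcMessage` shape, the function names, and the `main` folder name are hypothetical, and the write-then-rename step is my own addition so that a poller never reads a half-written file.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Hypothetical message shape; the real schema is not described here.
interface IpcMessage {
  id: string;
  text: string;
}

// Producer: write to a temp name, then rename into place,
// so a poller never sees a partially written JSON file.
function sendMessage(dir: string, msg: IpcMessage): void {
  fs.mkdirSync(dir, { recursive: true });
  const tmp = path.join(dir, `.${msg.id}.tmp`);
  fs.writeFileSync(tmp, JSON.stringify(msg));
  fs.renameSync(tmp, path.join(dir, `${msg.id}.json`));
}

// Consumer: one polling pass. Reads every *.json file in name order,
// deletes it, and returns the parsed messages.
function drainMessages(dir: string): IpcMessage[] {
  if (!fs.existsSync(dir)) return [];
  const files = fs
    .readdirSync(dir)
    .filter((name) => name.endsWith(".json"))
    .sort();
  return files.map((name) => {
    const full = path.join(dir, name);
    const msg = JSON.parse(fs.readFileSync(full, "utf8")) as IpcMessage;
    fs.unlinkSync(full);
    return msg;
  });
}

// Demo in a temp directory that mirrors the data/ipc/{folder}/messages/ layout.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "ipc-demo-"));
const inbox = path.join(root, "data", "ipc", "main", "messages");
sendMessage(inbox, { id: "0001", text: "hello" });
sendMessage(inbox, { id: "0002", text: "world" });
const received = drainMessages(inbox);
console.log(received.map((m) => m.text).join(" ")); // prints "hello world"
```

A wrapper would run `drainMessages` on a timer, for example once a second with `setInterval`. Between polls, every pending message is an ordinary file you can inspect with cat.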

This is the “malleable core” bet. The harness is fixed (Claude Code in a container), so you make the wrapper trivial enough to regenerate.

Pi: the agent extends itself

Pi takes the most radical position. It shares the same base tools as Claude Code - read, write, edit, bash - and supports skills (on-demand instruction files, similar to Claude Code’s approach) and hooks (lifecycle event handlers for the bash tool and extensions). But it deliberately excludes MCP. By design.

The rationale: popular MCP servers dump 13-18k tokens of tool descriptions into context on every session. Pi’s extension model is CLI tools and skills instead. It also supports TypeScript extensions that run in-process as native tools, whereas Claude Code’s MCP approach requires external server processes. Need a new capability? Build a CLI tool, write a skill, or write a TypeScript extension. The harness stays minimal - shortest system prompt, least cognitive load on the model.
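To put the token figure in perspective, here is a back-of-the-envelope calculation. The 200,000-token context window is my assumption, not a number from any of these harnesses, so substitute your model's actual limit.

```typescript
// Assumption: a 200,000-token context window; replace with your model's limit.
const CONTEXT_WINDOW_TOKENS = 200000;

// Share of the window consumed by tool descriptions, as a percentage.
function overheadPercent(
  toolDescriptionTokens: number,
  windowTokens: number = CONTEXT_WINDOW_TOKENS
): number {
  return (toolDescriptionTokens / windowTokens) * 100;
}

for (const tokens of [13000, 18000]) {
  console.log(`${tokens} tokens -> ${overheadPercent(tokens).toFixed(1)}% of the window`);
}
// 13000 tokens -> 6.5% of the window
// 18000 tokens -> 9.0% of the window
```

Under that assumption, a single such server spends roughly 6 to 9 percent of the window before the first user message, and the cost repeats every session.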

This is the “trust the model” bet. Maximum agency, minimum harness. If the model is good enough, the harness should get out of the way.

The tradeoff axis

These three systems sit on a spectrum.

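[Figure: a spectrum running from Safety / Control to Agent Agency. Claude Code (structured extensions) sits at the control end, NanoClaw (container isolation) in the middle, and Pi (agent self-extends) at the agency end.]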

Claude Code gives you the most structure. Pi gives the agent the most freedom. NanoClaw splits the difference - OS-level container isolation for safety, but radical malleability in the wrapper.

| | Claude Code | NanoClaw | Pi |
| --- | --- | --- | --- |
| Extension model | Skills + MCP + Hooks + Plugins | Fork and modify wrapper source | Agent writes TypeScript at runtime |
| Safety approach | Sandboxing + permissions + hooks | OS-level containers | Trust the agent |
| Context strategy | Progressive disclosure | Wrapper manages context | Progressive disclosure + agent decides what it needs |

The convergence

Here’s what’s interesting: all three have package ecosystems. Claude Code has a plugin marketplace with integrations from Stripe, Figma, and Sentry; Pi has packages on npm and pi.dev/packages; NanoClaw has skills. Yet they all converge on the same underlying architecture: files and CLIs. Not frameworks, not dependency injection. Files you can read with cat and tools you can run from bash.

Claude Code uses files as the universal interface. NanoClaw uses filesystem IPC. Pi forces the agent to create its own tools as files.

The extension philosophies differ, but the substrate is the same. Reduce harness complexity, increase agent surface area. The winning architecture looks like Unix, not like a framework.

For more on this philosophy, see how tool design affects agent flow.

The question isn’t which approach is “right.” It’s which tradeoff matches your trust model. Are you building a tool for engineers who want control? A personal assistant that adapts to one user? A research platform that pushes model capabilities?

The harness should reflect that answer. Nothing more.