Build, Break, Repeat

The LLM app spectrum

There’s a spectrum of what LLMs can build for people who don’t code, and nobody’s really mapped it. So let me try.

Single-file HTML

One file. Open in browser. A unit converter, a countdown timer, a color picker. The LLM produces everything - markup, styles, logic. No deployment, no dependencies, no build step. You save a .html file and double-click it.

This is the most underrated tier. It works almost every time because there’s nowhere for things to go wrong. No server, no state, no configuration. The entire application is the output.
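To make the tier concrete, here is a sketch of the kind of thing an LLM emits at this level: a miles-to-kilometers converter in one file. Everything (markup, styles, logic) is illustrative, not taken from any particular tool:

```html
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>Miles to km</title>
  <style>
    body { font-family: sans-serif; max-width: 20rem; margin: 2rem auto; }
  </style>
</head>
<body>
  <label>Miles <input id="miles" type="number" value="1"></label>
  <p id="out"></p>
  <script>
    const miles = document.getElementById("miles");
    const out = document.getElementById("out");
    function convert() {
      out.textContent = (miles.value * 1.60934).toFixed(2) + " km";
    }
    miles.addEventListener("input", convert);
    convert(); // render once on load
  </script>
</body>
</html>
```

Save it as converter.html, double-click, done. There is no step at which deployment, dependencies, or configuration can fail, because none of those exist.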

Simon Willison calls these “HTML tools” and has built over 150 of them, almost all written by LLMs. That’s not a toy count. That’s a whole productivity layer built on the simplest possible format.

SPAs

Still client-side, but with real state management. A budget tracker with localStorage. A markdown editor with multiple tabs. A habit tracker that remembers your streaks. The LLM produces more code, but it’s still self-contained - no backend, no deployment pipeline.

The failure rate goes up here. Not because the code is harder, but because the LLM has to make more decisions. State shape, component structure, data persistence. More decisions, more places to break.
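Here’s a sketch of that decision surface: a habit-tracker streak store with an injectable storage backend. In a browser the backend would be localStorage; the shim and all the names here are illustrative, not from any real app:

```javascript
// Minimal habit-streak store. `storage` is anything with getItem/setItem,
// so the same code works with localStorage in a browser or a plain shim here.
function createHabitStore(storage) {
  const KEY = "habits"; // where the serialized state lives

  function load() {
    return JSON.parse(storage.getItem(KEY) || "{}");
  }
  function save(state) {
    storage.setItem(KEY, JSON.stringify(state));
  }

  return {
    // Record a completion for `name` on ISO date `day` and update the streak.
    check(name, day) {
      const state = load();
      const habit = state[name] || { lastDay: null, streak: 0 };
      const prev = habit.lastDay ? new Date(habit.lastDay) : null;
      const cur = new Date(day);
      const oneDay = 24 * 60 * 60 * 1000;
      // Consecutive day extends the streak; a gap (or first check) resets to 1.
      habit.streak = prev && cur - prev === oneDay ? habit.streak + 1 : 1;
      habit.lastDay = day;
      state[name] = habit;
      save(state);
      return habit.streak;
    },
    streak(name) {
      return (load()[name] || { streak: 0 }).streak;
    },
  };
}

// In-memory shim standing in for localStorage.
const mem = {
  data: {},
  getItem(k) { return this.data[k] ?? null; },
  setItem(k, v) { this.data[k] = v; },
};

const store = createHabitStore(mem);
store.check("stretch", "2024-01-01");
store.check("stretch", "2024-01-02");
console.log(store.streak("stretch")); // 2
```

Every choice in there - the key name, the state shape, the streak rule, the serialization format - is a decision the LLM had to make, and each one is a place where two generated pieces can silently disagree.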

Constrained runtimes

This is the Artifacts model. A pre-built platform provides the runtime, the component library, auth, persistence, security - and the LLM’s job shrinks to producing a single component that runs inside it.

Google Apps Script is a constrained runtime. So are Artifacts. So is Val Town. The LLM doesn’t need to think about deployment, routing, or infrastructure. It fills a box. The box handles the rest.

This tier is more powerful than it looks, because everything the platform provides is stuff the LLM doesn’t have to get right. Every capability you bake into the runtime - a database, a KV store, file storage, auth - is a capability the LLM gets for free without having to wire it up.
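What “filling a box” looks like in practice: the LLM writes one handler, and the platform injects everything else. This sketch assumes a hypothetical runtime that authenticates requests and passes in a key-value store - none of these names come from Artifacts, Apps Script, or Val Town:

```javascript
// The platform owns routing, auth, and persistence; the LLM only writes
// this handler. `kv` is a key-value store the runtime injects (hypothetical API).
async function handler(req, kv) {
  // The platform has already authenticated the request and parsed the URL.
  const name = req.params.name || "anonymous";
  // Persistence for free: no database setup, no schema, no connection string.
  const count = (await kv.get("visits:" + name)) ?? 0;
  await kv.set("visits:" + name, count + 1);
  return { status: 200, body: `Hello ${name}, visit #${count + 1}` };
}

// A trivial in-memory stand-in for the runtime, just to exercise the handler.
function makeKv() {
  const m = new Map();
  return {
    get: async (k) => (m.has(k) ? m.get(k) : null),
    set: async (k, v) => { m.set(k, v); },
  };
}

const kv = makeKv();
handler({ params: { name: "ada" } }, kv)
  .then((res) => console.log(res.body)); // Hello ada, visit #1
```

Notice how little surface area is left for the LLM to get wrong: no auth code, no storage wiring, no deployment. The handler is the entire deliverable.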

Full-stack vibe-coded apps

Lovable, Bolt, Replit Agent. The term vibe coding - coined by Andrej Karpathy - captures it well: you describe what you want and the LLM scaffolds the entire application. Backend, database, auth, deployment. Maximum freedom, maximum surface area for failure.

This works surprisingly often for simple apps. It falls apart when things need to interact in ways the LLM didn’t anticipate. A webhook that needs to hit an API that needs auth that needs a secret that needs to be stored somewhere. The LLM can produce each piece, but the wiring between pieces is where it breaks.
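The wiring problem in miniature: each commented line below is a seam where separately generated pieces must agree with each other. The secret name, URL, and payload shape are all invented for illustration:

```javascript
// A webhook handler whose correctness depends on four separately generated
// pieces agreeing: the secret's name, the auth header format, the API URL,
// and the payload shape. Any mismatch breaks the whole chain.
async function onWebhook(event, env, fetchFn) {
  const secret = env.PAYMENTS_API_KEY;  // seam 1: secret must be stored under this exact name
  if (!secret) throw new Error("missing PAYMENTS_API_KEY");

  const res = await fetchFn("https://api.example.com/charges", { // seam 2: URL must match the API
    method: "POST",
    headers: { Authorization: `Bearer ${secret}` },              // seam 3: auth scheme must match
    body: JSON.stringify({ orderId: event.orderId }),            // seam 4: payload shape must match
  });
  return res.ok;
}

// Exercising the chain with a stubbed fetch that checks two of the seams.
const fakeFetch = async (url, opts) => ({
  ok: url.includes("/charges") && opts.headers.Authorization === "Bearer sk-test",
});

onWebhook({ orderId: "42" }, { PAYMENTS_API_KEY: "sk-test" }, fakeFetch)
  .then((ok) => console.log(ok)); // true
```

An LLM can generate any one of those four pieces correctly in isolation. The failure mode is that they’re generated at different moments, in different files, and nothing forces them to agree.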

The interesting part

The spectrum isn’t really about complexity tiers. It’s about how much of the stack is pre-solved versus LLM-generated.

A constrained runtime with a KV store, a database, and auth baked in is more powerful than a vibe-coded full-stack app - because the LLM doesn’t have to make architectural decisions. It just uses what’s there.

The move isn’t up the spectrum. It’s pulling capabilities down into the constrained runtime tier. Pre-solve more, generate less. Every piece of infrastructure you give the LLM for free is a piece it doesn’t have to get right from scratch.

The best LLM apps won’t come from models getting better at building full-stack applications. They’ll come from runtimes getting richer while keeping the LLM’s job simple.