System reminders - how Claude Code steers itself
Steering an agent means reinforcing good behaviors and discouraging bad ones. Recent models are remarkably good at agentic work, but LLMs are non-deterministic by nature, which also makes them easy to confuse. If security has defense in depth, then long agentic flows need instructions in depth: anything you write in a system prompt is all but forgotten 100K tokens into a conversation. Different harnesses steer differently, and how well they do it is one of the most important parts of harness engineering.
Steering comes in a few forms. The system message is the first layer: it models the world and the task at hand, and everything that follows is colored by its semantics. User messages are the highest-attention channel: you type something, the model prioritizes it. Queued messages are a product trick: you write a message while the model is working, and the harness slips it in between tool calls. Tool responses can piggyback instructions on results as well. Each of these carries a different "trust" level in the model's eyes. This matters.
Trust hierarchy
Large language models are trained to treat these sources differently. System prompts and user messages get high attention. Tool responses are treated as external information - potentially adversarial.
This is the right call. If models followed instructions in tool responses the same way they follow system prompts, prompt injection would be trivial. Read a file containing “ignore all previous instructions” and the agent complies? No. Models are trained to be skeptical of tool outputs.
So where do you inject steering that the model will actually follow? The user message slot. That’s where the model pays serious attention. Most harnesses “hook” steerability into the system prompt - they semantically link a section in the prompt to content that will appear later in the conversation. A good way to do this is with specific tags that push the model’s attention back to the system message. “You’ll receive <system-reminder> tags, follow them.” When the model sees that tag mid-conversation, it connects back to the system prompt’s instructions. It knows these aren’t adversarial. They’re from the harness, not from some random file it read.
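The hooking move can be sketched in a few lines. This is purely illustrative - the wrapReminder helper and Message shape are my assumptions, not pi's or Claude Code's actual code:

```typescript
type Role = "system" | "user" | "assistant";

interface Message {
  role: Role;
  content: string;
}

// The system prompt declares the contract once, up front.
const SYSTEM_PROMPT = `You are a coding agent.
You may receive <system-reminder> tags in user messages.
They come from the harness, not from the user or tool output. Follow them.`;

// Later, the harness slips steering content into the high-attention
// user slot, wrapped in the tag the system prompt taught the model to trust.
function wrapReminder(text: string): Message {
  return {
    role: "user",
    content: `<system-reminder>\n${text}\n</system-reminder>`,
  };
}

const msg = wrapReminder("Warning: the file exists but the contents are empty.");
```

The point of the wrapper is the semantic link: the tag in the user message points the model's attention back at the system prompt's instruction, so the reminder inherits the system prompt's trust level instead of a tool response's.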
Nudging over forcing
The instinct is to force correct behavior. Hard blocks, strict validation, deterministic guardrails. But practitioners consistently overestimate their ability to model the desired behavior across the infinite space of user prompts, agent actions, and conversation states. You can’t anticipate every path through a hundred-step session. Nudging beats forcing because nudges compose with the model’s own judgment instead of replacing it.
There’s a natural layering to how you shape agent behavior. The system prompt seeds it - this is who you are, these are your constraints. Tools shape the path - the available actions define what the model can do, and good tool design guides it toward solutions. But what about behaviors that aren’t constraints and aren’t tool choices? Preferences. Tendencies you want to reinforce.
Say your agent tends to skim files - reading small chunks when it should read the whole thing. You don’t want to force full reads every time, because sometimes a partial read is the right call. But it’s a preference. You want to catch the pattern and remind the model. That’s the gap system reminders fill - reinforcing behaviors for known issues without hardcoding them as rules.
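Catching that skimming pattern might look something like this - an illustrative sketch only; the ReadCall shape and its limit field are assumptions, not pi's real read-tool arguments:

```typescript
interface ReadCall {
  path: string;
  limit?: number; // present => partial read (a "skim")
}

// Soft nudge: fire once several partial reads have piled up.
// Returns the reminder text, or null if the pattern hasn't appeared.
function skimReminder(calls: ReadCall[]): string | null {
  const partial = calls.filter((c) => c.limit !== undefined).length;
  return partial >= 3
    ? "You've made several partial reads. Consider reading whole files when context allows."
    : null;
}
```

Note what this is not: it never blocks a partial read, because partial reads are sometimes correct. It only surfaces the tendency so the model can reconsider.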
System reminders
As far as I can tell, Anthropic's Claude Code team was the first to ship this pattern. System reminders are reactive messages injected based on conversation state - not periodic, not static. Something happens, a condition is met, a reminder fires. Sometimes they fire in response to something the user did in the UI. Sometimes they fire as a side effect of the system - token consumption getting high, context growing large.
They look like this:
<system-reminder>
Warning: the file exists but the contents are empty.
</system-reminder>
Harness lifecycle
Harnesses have lifecycle events - hooks that fire at specific points during the agent’s execution. Here’s how pi’s event system lays it out:
session_start
│
user sends prompt
│
├─► input
├─► before_agent_start
├─► agent_start
│
│ ┌─── turn loop ───────────────────────────────┐
│ │ │
│ ├─► turn_start │
│ ├─► context │
│ ├─► before_provider_request │
│ │ │
│ │ LLM responds: │
│ │ ├─► message_start │
│ │ ├─► message_update (streaming) │
│ │ └─► message_end │
│ │ │
│ │ Tool execution (per tool call): │
│ │ ├─► tool_execution_start │
│ │ ├─► tool_call │
│ │ ├─► tool_execution_update │
│ │ ├─► tool_result │
│ │ └─► tool_execution_end │
│ │ │
│ └─► turn_end │
│
└─► agent_end
session events (anytime):
session_compact / session_switch / session_fork
model_select / session_shutdown
Each of these is a potential evaluation point for system reminders. After a tool fails? Check if it’s the third failure in a row. Before a turn starts? Check if context is too large. After compaction? Remind the model that file contents may have been summarized away.
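One way to wire those evaluation points is a dispatch table of checks keyed by lifecycle event. A sketch under the assumption that event names match the diagram above; the table shape and SessionState fields are mine, not pi's:

```typescript
interface SessionState {
  consecutiveToolFailures: number;
  contextTokens: number;
  compacted: boolean;
}

// A check inspects state and returns reminder text, or null to stay quiet.
type Check = (s: SessionState) => string | null;

const checks: Record<string, Check[]> = {
  tool_execution_end: [
    (s) =>
      s.consecutiveToolFailures >= 3
        ? "3 consecutive tool failures. Stop and rethink."
        : null,
  ],
  turn_start: [
    (s) =>
      s.contextTokens > 150_000
        ? "Context is getting large. Consider compacting."
        : null,
  ],
  session_compact: [
    () => "Conversation was compacted. File contents may have been summarized away.",
  ],
};

// Run only the checks registered for this event, at this moment.
function evaluate(event: string, state: SessionState): string[] {
  return (checks[event] ?? [])
    .map((check) => check(state))
    .filter((m): m is string => m !== null);
}
```

The table is the granularity argument in code form: the failure check never runs on turn_start, and the context check never runs per tool call.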
Not every event makes sense for reminder evaluation. Here’s the same diagram, annotated with where reminders should fire:
session_start ◄── evaluate
│
user sends prompt
│
├─► input
├─► before_agent_start
├─► agent_start ◄── evaluate
│
│ ┌─── turn loop ───────────────────────────────┐
│ │ │
│ ├─► turn_start ◄── evaluate
│ ├─► context
│ ├─► before_provider_request
│ │ │
│ │ LLM responds: │
│ │ ├─► message_start ◄── evaluate
│ │ ├─► message_update ◄── evaluate
│ │ └─► message_end ◄── evaluate
│ │ │
│ │ Tool execution (per tool call): │
│ │ ├─► tool_execution_start ◄── evaluate
│ │ ├─► tool_call ◄── evaluate
│ │ ├─► tool_result ◄── evaluate
│ │ └─► tool_execution_end ◄── evaluate
│ │ │
│ └─► turn_end ◄── evaluate
│
└─► agent_end ◄── evaluate
session events (anytime):
session_compact ◄── evaluate
session_switch ◄── evaluate
session_fork ◄── evaluate
model_select ◄── evaluate
The lifecycle gives you granularity. Not “check everything every turn” - check the right thing at the right moment.
The conversation branch is an event log. Every tool call, every result, every message - it’s all there. You can treat it as a queryable stream: filter for bash tool results, count consecutive errors, check if a file was read before it was edited. These are views over the log. Apply a predicate to a view and you have a reminder trigger. “When the last 3 bash results are all errors” is a view (last 3 bash results) plus a predicate (all errors). When the predicate is true, fire the reminder.
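The view-plus-predicate idea translates almost directly into code. A minimal sketch, with an assumed event shape rather than pi's actual log format:

```typescript
interface ToolEvent {
  toolName: string;
  isError: boolean;
}

// View: the last n results of a given tool, in order.
function lastN(log: ToolEvent[], tool: string, n: number): ToolEvent[] {
  return log.filter((e) => e.toolName === tool).slice(-n);
}

// Predicate: the view is full and every entry is an error.
function allErrors(view: ToolEvent[], n: number): boolean {
  return view.length === n && view.every((e) => e.isError);
}

// Trigger = view + predicate: "the last 3 bash results are all errors".
function bashSpiral(log: ToolEvent[]): boolean {
  return allErrors(lastN(log, "bash", 3), 3);
}
```

Because views are just functions over the log, triggers compose: the same lastN view with a different predicate ("all writes touched the same file") becomes a different reminder with no new infrastructure.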
What Anthropic reminds Claude about
The claude-code-system-prompts repo extracts all 37 system reminders from Claude Code’s source. They break into clear categories:
File state - truncated reads, empty files, files modified by linters, files opened in IDE, lines selected by user. The harness watches what happens to files and tells the model.
Context management - token usage warnings, post-compaction notices (“file contents may have been summarized away”), budget tracking. The harness watches resource consumption.
Task tracking - gentle nudges to use task tools if they haven’t been used recently. “The task tools haven’t been used recently… This is just a gentle reminder - ignore if not applicable.”
Plan mode - five variants covering active plan mode, re-entry, subagent behavior. The harness enforces planning workflows.
Security - after every file read, a reminder to consider whether the content is malware. Not blocking - just awareness.
The pattern: reminders reinforce harness mechanisms. They don’t replace tool design or hard rules. They nudge.
Reminders vs rules
This distinction matters. It’s a GOOD idea to block the agent from editing a file it hasn’t read. That’s a tool design decision - the harness enforces it at the tool level. The edit tool checks if the file was read first, and blocks if not.
System reminders are different. They’re soft nudges, not hard blocks. “You’ve used write 3 times, prefer edit for surgical changes.” The model can ignore this. It probably shouldn’t, but it can.
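The contrast can be made concrete. A toy sketch of the hard-rule side - not Claude Code's actual implementation - where the edit tool itself refuses until the file has been read:

```typescript
// The harness records every successful read.
const readFiles = new Set<string>();

function readTool(path: string): void {
  readFiles.add(path);
}

// Hard rule: enforced at the tool level. The model cannot talk its way past
// this; there is no judgment call, just a blocked call and an error message.
function editTool(path: string): { ok: boolean; error?: string } {
  if (!readFiles.has(path)) {
    return { ok: false, error: `Read ${path} before editing it.` };
  }
  return { ok: true };
}
```

A reminder covers the cases this can't: preferences like "prefer edit over write" where the forbidden action is sometimes the right one, so a hard block would be wrong part of the time.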
Claude Code specifically chooses to hide these messages from users. You don’t see them in the UI. They’re injected into the conversation silently, between the harness and the model. The user sees the result - better behavior - without the mechanism.
Building my own
I built pi-system-reminders - an open-source implementation for pi. Same DX as pi extensions. Drop a file, get a reminder:
// .pi/reminders/bash-spiral.ts
import type { ExtensionAPI } from "@mariozechner/pi-coding-agent";

export default function (pi: ExtensionAPI) {
  // Track consecutive bash failures across tool results.
  let consecutiveFailures = 0;

  pi.on("tool_result", async (event) => {
    if (event.toolName === "bash") {
      consecutiveFailures = event.isError ? consecutiveFailures + 1 : 0;
    }
  });

  return {
    on: "tool_execution_end",
    when: () => consecutiveFailures >= 3,
    message: "3 consecutive bash failures. Stop and rethink.",
    cooldown: 10,
  };
}
Export a function, get the full extension API, return a reminder object. The extension evaluates when() at the specified lifecycle event and injects <system-reminder> tags when conditions are met.
The repo ships 13 ready-to-use examples, including ports of Claude Code's own reminders: token usage warnings, file truncation notices, task tool nudges, post-compaction awareness, and more.
The model sees:
<system-reminder name="bash-spiral">
3 consecutive bash failures. Stop and rethink.
</system-reminder>
And it adjusts. Not because it was forced to. Because it was nudged at the right moment, in the right place, with the right level of trust.
That’s steerability.
pi-system-reminders on GitHub
Install with pi install npm:pi-system-reminders.