11 April 2026

The four kinds of AI agents (and what's actually behind Claude Code)

"Agent" currently means everything from an FAQ widget to a system that refactors your codebase overnight. The word isn't wrong, exactly. They do share an anatomy: a model in a loop, a manifest of tools, a system prompt, some memory, some guardrails. But treating them as one thing is how teams ship the wrong one — I've watched it happen.

What actually separates the kinds is three questions: what tools does it hold, what may it write to, and how long does it run without a human? Four answers cover almost everything I've built or used.

1. The grounded answerer

Tools: none, or nearly none. Writes: nothing. Autonomy: one reply at a time.

This is the website agent, the docs assistant, the support deflector. Its entire power source is a curated knowledge base compiled into its prompt — ours, on our consultancy's site, is a versioned folder of markdown with a hard token budget that fails the build when exceeded. No live systems, no side effects; the engineering effort goes into refusals and grounding, because its only real risk is saying something wrong in public.

Don't let the simplicity fool you — for "answer visitors honestly", this is the correct amount of agent.

2. The tool-wielding assistant

Tools: dozens, spanning live systems. Writes: drafts and templates, human pulls the trigger. Autonomy: one conversation turn at a time.

This is Cassie's drawer agent — 46 typed tools across case data, calendar, email, documents, analytics, accounting and explicit memory. It collapses "which app do I open?" into a sentence, but every consequential action lands as a reviewable draft. The human stays the executor; the agent is a researcher-secretary with perfect patience.

Most business value today lives in this kind. It's also the kind most teams should build first: read-heavy, write-light, trust accumulates with use.

3. The coding agent — what's actually behind Claude Code

Tools: the most dangerous ones — a filesystem and a shell. Writes: your repo. Autonomy: minutes to hours, gated by permissions.

From the outside, Claude Code looks like a chat about code. I live inside it daily, so let me describe what's actually back there: the same anatomy, with a radically different manifest. Read and edit files. Run shell commands. Search the codebase and the web. Spawn subagents for parallel work. The "application" it integrates is your entire development environment, because git, test runners, linters and deploy CLIs are all reachable through one tool — the shell. That single tool is why its capability feels unbounded.

What makes it shippable is the governance bolted around the loop: permission modes deciding which calls need a human yes, hooks that intercept tool calls deterministically, skills that inject process, context compaction so long sessions survive. The model proposes; the harness disposes. That division, not model quality alone, is the product.

I lean on this daily: my AppMaker project is precisely a governance layer on top of this kind of agent — specs, traceable acceptance criteria and gates, because an agent holding a shell deserves the same discipline as an engineer holding production access.

4. The autonomous worker

Tools: any of the above. Writes: for real. Autonomy: the defining feature — it runs while you're gone.

Scheduled routines, background fix-it loops, overnight migrations. The honest version of this kind is defined by its budgets: iteration caps, cost caps, mandatory human review before anything irreversible. In AppMaker the autonomous runner refuses slices that aren't explicitly marked safe for it, and reports what it did to a file a human reads in the morning.

The hierarchy of trust is unforgiving: nobody should ship kind 4 before their kind 3 sessions stop surprising them, and nobody gets kind 3 right without the tool discipline of kind 2.

The lesson across all four

The model is the same in every kind. Often literally the same weights. What changes is the manifest, the write access and the leash — which is why "is the AI good enough?" is usually the wrong question. The right one: is the system around it designed for the blast radius of its tools?

Pick the smallest kind that solves your problem. You can always hand the agent another tool tomorrow. Taking one back after an incident is a much worse afternoon.