You already hand-roll agent loops against the raw API. A workflow is that instinct, lifted into the harness — and the whole skill of using it well is one judgment plus one default.
You've written this loop in Python: call the model, read the tool call, run the tool, feed the result back, repeat. You've paid per token and felt the cost. Hold that picture. Everything below is a relabelling of it — the only new ideas are where the determinism lives and that each step gets a fresh context window.
A Claude Code workflow is a deterministic JavaScript orchestrator that spawns non-deterministic subagents. You write a plain JS script; the harness runs it in the background. Two layers, and keeping them straight is the entire mental model:
The script. Loops, ifs, fan-out, counting, dedup, budget checks. Ordinary JS. Runs the same way every time. This is where you put control flow you don't want an LLM to improvise.
agent() calls. Each one spawns a fresh subagent (its own context window, its own tools), which does a task and returns its result to your script. This is the LLM call in your hand-rolled loop.
So the mapping from your Python loop:
| Your hand-rolled loop | A Claude Code workflow |
|---|---|
| Your for / while orchestration code | The workflow script (deterministic JS) |
| One messages.create() API call + its tool loop | One agent(prompt) call (a whole subagent, not one turn) |
| Same conversation, growing context | Fresh, isolated context per agent() — they don't see each other |
| You parse the model's text yourself | agent(prompt, {schema}) returns a validated object |
| Your token meter ticking up | A shared budget across the whole run + the main loop |
The crucial difference from a single subagent (one Agent call) is the deterministic glue. A lone subagent decides its own steps. A workflow lets you dictate the structure (fan out exactly 8 readers, verify every finding with 3 skeptics, loop until two dry rounds), and the model only fills in the leaf tasks. That's the orchestrator-workers pattern from Anthropic's agent taxonomy in "Building Effective Agents", made concrete.
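To make the two layers concrete, here is a minimal sketch of a script that fixes the structure and leaves only the leaf task to the model. The `agent` function below is a local stand-in (an assumption, so the snippet runs anywhere); in a real workflow the harness provides it. The helper name `readInFixedFanOut` and the file names are hypothetical.

```javascript
// Stand-in for the harness-provided agent(): an assumption so this runs locally.
// A real agent() call would spawn a subagent with a fresh context window.
async function agent(prompt) {
  return `summary of: ${prompt}`;
}

// Deterministic layer: the script, not the model, caps the fan-out at 8 readers.
async function readInFixedFanOut(files) {
  const READERS = 8;
  const chunkSize = Math.ceil(files.length / READERS);
  const pending = [];
  for (let i = 0; i < READERS; i++) {
    const chunk = files.slice(i * chunkSize, (i + 1) * chunkSize);
    if (chunk.length === 0) continue; // never spawn a reader with nothing to read
    pending.push(agent(`Summarize these files: ${chunk.join(", ")}`));
  }
  return Promise.all(pending);
}

// Hypothetical file list, for illustration only.
const exampleFiles = Array.from({ length: 24 }, (_, i) => `src/file${i}.js`);
readInFixedFanOut(exampleFiles).then(s => console.log(s.length, "summaries"));
```

The loop, the chunking and the empty-chunk check run the same way every time; only the text each reader returns would vary between runs.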
… agent() call. Structure you care about belongs in the deterministic layer.
Using workflows well is mostly knowing when not to. A workflow can spawn dozens of subagents and burn a lot of tokens, so it has to buy something a single agent can't. There are three reasons it does: breadth, confidence (independent verification), and scale. If none apply, don't reach for it.
Workflows are an explicit opt-in tool. In normal Claude Code use, you (or the harness) launch one only when you ask for it: the keyword ultracode, an explicit "use a workflow / fan out agents," or a skill that calls it. That gate exists because the cost is real. So your private test before authoring one is blunt:
"Would a single capable agent do this about as well? Then don't fan out."
A refactor, a bugfix, a one-file edit, a quick lookup — solo. Breadth, independent verification, or scale — workflow.
For each task, decide solo agent or workflow before you open the answer. Say why in one word (breadth / confidence / scale / none).
… getUser to fetchUser across the repo."
… publish.sh script deploy?"
1 · Solo: none. A scripted rename or one agent handles it; fanning out buys nothing. (A purist could pipeline per-file edits in worktrees, but for a rename that's ceremony.)
2 · Workflow: scale and confidence. 400 files exceed one context, and "double-check each finding" is the adversarial-verify pattern: find → verify, every finding attacked independently.
3 · Solo: none. One Read answers it. Reaching for a workflow here is the classic over-reach.
4 · Workflow: confidence. The judge-panel pattern: generate N independent attempts, score with parallel judges, synthesize the winner. Three rivalrous designs beat one design iterated.
Almost everything is built from three hooks. Learn these and you can read most workflow scripts.
agent(prompt, opts?): spawn one subagent
The atom. Returns the subagent's final text as a string or, with a schema, a validated object. Fresh context every time.
const summary = await agent("Read src/auth.ts and summarize the login flow.")
// with structure:
const bugs = await agent("Find auth bugs in src/auth.ts.", { schema: BUGS_SCHEMA })
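The lesson names BUGS_SCHEMA without showing it. As a rough sketch, assuming the harness accepts a JSON-Schema-style object (an assumption, not something the text confirms), it might look like the following. The field names `bugs`, `title` and `line` and the helper `matchesBugsShape` are hypothetical, and the checker only illustrates what "validated" buys you downstream.

```javascript
// Hypothetical schema shape: JSON-Schema style is an assumption here.
const BUGS_SCHEMA = {
  type: "object",
  required: ["bugs"],
  properties: {
    bugs: {
      type: "array",
      items: {
        type: "object",
        required: ["title", "line"],
        properties: { title: { type: "string" }, line: { type: "number" } },
      },
    },
  },
};

// Hand-written check for the shape above. It covers only these fields and is
// not a general JSON Schema validator.
function matchesBugsShape(value) {
  return (
    value !== null &&
    typeof value === "object" &&
    Array.isArray(value.bugs) &&
    value.bugs.every(
      b =>
        b !== null &&
        typeof b === "object" &&
        typeof b.title === "string" &&
        typeof b.line === "number"
    )
  );
}

console.log(matchesBugsShape({ bugs: [{ title: "Token never expires", line: 42 }] })); // true
console.log(matchesBugsShape({ bugs: "none found" })); // false
```

With a schema, the script can loop over `bugs` directly instead of parsing free text and guessing at its layout.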
parallel(thunks): fan out, then wait for all (a barrier)
Runs tasks concurrently and blocks until every one finishes before returning the array. Use it only when the next step genuinely needs all results at once.
const reviews = await parallel(
  DIMENSIONS.map(d => () => agent(d.prompt, { schema: FINDINGS }))
)
// nothing past this line runs until the SLOWEST review returns
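You can reproduce the barrier behaviour locally to get a feel for it. The sketch below uses stand-ins for `parallel` and `agent` built on `Promise.allSettled`; they follow the semantics described in this lesson and are not the harness's implementation. The latencies are made up.

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Stand-in for parallel(): start every thunk, then wait for all of them.
// An assumption that mirrors the described behaviour, not the real code.
async function parallel(thunks) {
  const settled = await Promise.allSettled(
    thunks.map(t => Promise.resolve().then(t))
  );
  return settled.map(r => (r.status === "fulfilled" ? r.value : null));
}

// Stand-in agent with a made-up latency per task.
async function agent(name, ms) {
  await sleep(ms);
  return `${name} done`;
}

async function demo() {
  const finished = []; // records completion order
  const results = await parallel([
    () => agent("slow", 60).then(r => (finished.push(r), r)),
    () => agent("fast", 10).then(r => (finished.push(r), r)),
  ]);
  // results keeps input order; it only arrives once the slow task is done.
  return { results, finished };
}

demo().then(out => console.log(out));
```

The fast task finishes first, but the caller sees nothing until the slow one returns. That waiting is the cost of the barrier.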
pipeline(items, stage1, stage2, …): each item flows through all stages, no barrier
Each item runs through every stage independently. Item A can be in stage 3 while item B is still in stage 1. Wall-clock ≈ the slowest single chain, not the sum of slowest-per-stage.
const results = await pipeline(
  DIMENSIONS,
  d => agent(d.prompt, { schema: FINDINGS, phase: 'Review' }),
  review => parallel(review.findings.map(f => () =>
    agent(`Verify: ${f.title}`, { schema: VERDICT, phase: 'Verify' })))
)
// dimension "bugs" verifies while dimension "perf" is still being reviewed
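The "no barrier" claim is easy to check with a local stand-in. The `pipeline` below is a sketch of the semantics described above (each item walks its own chain of stages), not the harness's implementation, and the latencies are made up.

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Stand-in for pipeline(): every item runs through all stages on its own clock.
// An assumption that mirrors the described behaviour, not the real code.
async function pipeline(items, ...stages) {
  const chains = items.map(async item => {
    let value = item;
    for (const stage of stages) value = await stage(value);
    return value;
  });
  const settled = await Promise.allSettled(chains);
  return settled.map(r => (r.status === "fulfilled" ? r.value : null));
}

async function demo() {
  const log = []; // records the order in which stage work completes
  const dimensions = [
    { name: "bugs", ms: 10 },
    { name: "perf", ms: 60 },
  ];
  const results = await pipeline(
    dimensions,
    async d => { await sleep(d.ms); log.push(`review ${d.name}`); return d; },
    async d => { await sleep(10); log.push(`verify ${d.name}`); return d.name; }
  );
  return { results, log };
}

demo().then(out => console.log(out.log));
```

In the log, "verify bugs" appears before "review perf": the fast item has already cleared stage two while the slow one is still in stage one.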
Default to pipeline(). Reach for a parallel() barrier only when stage N truly needs every result from stage N-1 at once.
This is the single highest-leverage habit. A barrier makes your fast workers sit idle waiting for the slowest one. If five finders run and the slowest takes 3× the fastest, a barrier wastes two-thirds of the fast finders' time. A pipeline lets each finding move on the instant it's ready.
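The arithmetic behind that claim can be worked through in a few lines. This is a back-of-envelope sketch with hypothetical durations, and it assumes there is no concurrency cap, so every item can run at once.

```javascript
// Hypothetical durations (in minutes) for each item's two stages.
const timings = [
  { name: "A", find: 3, verify: 1 },
  { name: "B", find: 1, verify: 3 },
  { name: "C", find: 2, verify: 2 },
];

// Barrier between stages: each stage lasts as long as its slowest item.
function barrierWallClock(items) {
  const slowestFind = Math.max(...items.map(i => i.find));
  const slowestVerify = Math.max(...items.map(i => i.verify));
  return slowestFind + slowestVerify;
}

// Pipeline: each item runs its own chain; the run lasts as long as the slowest chain.
function pipelineWallClock(items) {
  return Math.max(...items.map(i => i.find + i.verify));
}

// Share of the first stage that one item spends idle, waiting at the barrier.
function idleFractionAtBarrier(items, name) {
  const slowestFind = Math.max(...items.map(i => i.find));
  const item = items.find(i => i.name === name);
  return (slowestFind - item.find) / slowestFind;
}

console.log(barrierWallClock(timings)); // 6
console.log(pipelineWallClock(timings)); // 4
console.log(idleFractionAtBarrier(timings, "B").toFixed(2)); // 0.67
```

Item B finds in 1 minute while the slowest takes 3, so under a barrier it idles for two of those three minutes. The pipeline finishes in 4 minutes instead of 6 because no chain waits on another.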
A barrier is justified only when stage N needs cross-item context from all of stage N-1, such as deduplicating findings across the full set of reviewers' output.
If you wrote this:
const a = await parallel(...)
const b = transform(a) // flatten / map / filter — no cross-item dependency
const c = await parallel(b.map(...))
…that middle transform doesn't need the barrier. Rewrite as a pipeline with the transform inside a stage. When in doubt: pipeline.
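One way that rewrite might look, as a runnable sketch. `agent` and `pipeline` here are local stand-ins (assumptions, so the snippet runs without the harness), the inner `Promise.all` stands in for a nested fan-out, and the "nit" filter is a hypothetical transform.

```javascript
// Stand-in agent: returns canned output so the sketch runs locally.
async function agent(prompt) {
  if (prompt.startsWith("Review")) {
    return { findings: [`${prompt}: issue 1`, `${prompt}: nit`] };
  }
  return { verified: true, prompt };
}

// Stand-in pipeline: each item runs through every stage independently.
async function pipeline(items, ...stages) {
  const chains = items.map(async item => {
    let value = item;
    for (const stage of stages) value = await stage(value);
    return value;
  });
  const settled = await Promise.allSettled(chains);
  return settled.map(r => (r.status === "fulfilled" ? r.value : null));
}

async function reviewAll(dimensions) {
  return pipeline(
    dimensions,
    // Stage 1: what the first parallel() used to do.
    d => agent(`Review ${d}`),
    // Stage 2: the per-item transform, now inside the pipeline.
    review => review.findings.filter(f => !f.endsWith("nit")),
    // Stage 3: what the second parallel() used to do, per item.
    kept => Promise.all(kept.map(f => agent(`Verify ${f}`)))
  );
}

reviewAll(["bugs", "perf"]).then(r => console.log(JSON.stringify(r)));
```

Because the filter only looks at one review at a time, nothing forces it to wait for the other dimensions, which is why it can live inside a stage.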
… parallel). Dedup-across-the-full-set is the textbook cross-item dependency: you literally can't dedupe finding #1 until you've seen all six reviewers' output. This is one of the few cases that earns the wait. Contrast: if you just verified each finding on its own, that's a pipeline, because no reviewer's findings depend on another's.
… parallel called "a barrier" but pipeline isn't?
parallel() awaits all thunks before returning, so the slowest one gates the rest. pipeline() has no such point between stages; each item advances on its own clock. Same total concurrency cap; completely different wall-clock when item durations vary.
… null
A thunk that throws inside parallel/pipeline becomes null in the results; the call never rejects. Always .filter(Boolean) before using results, or a single failed subagent silently corrupts your downstream logic.
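Here is a small sketch of the null-handling habit. The `parallel` below is a local stand-in that turns a thrown thunk into `null`, following the behaviour described above; the findings are made up.

```javascript
// Stand-in for parallel(): failures become null instead of rejecting.
// An assumption that mirrors the described behaviour, not the real code.
async function parallel(thunks) {
  const settled = await Promise.allSettled(
    thunks.map(t => Promise.resolve().then(t))
  );
  return settled.map(r => (r.status === "fulfilled" ? r.value : null));
}

async function collectFindings() {
  const raw = await parallel([
    async () => ({ findings: ["unchecked null"] }),
    async () => { throw new Error("subagent failed"); },
    async () => ({ findings: ["race in cache"] }),
  ]);
  // Drop the null left by the failed thunk before touching .findings;
  // reading .findings on the null entry would throw a TypeError.
  const ok = raw.filter(Boolean);
  return {
    raw,
    findings: ok.flatMap(r => r.findings),
    failed: raw.length - ok.length, // keep a count so failures stay visible
  };
}

collectFindings().then(out => console.log(out.findings, "failed:", out.failed));
```

Counting the dropped entries is a design choice worth keeping: filtering alone hides how many subagents failed.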
Workflow scripts are plain JS: TypeScript type annotations, interfaces, and generics fail to parse. Date.now(), Math.random(), and argless new Date() all throw, because they'd break run-resume. Vary by index for "randomness"; pass timestamps in via args.
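A minimal sketch of both workarounds follows. The lesson doesn't show how args reach a script, so the `args.startedAt` field, the `ANGLES` list and the helper names are all assumptions for illustration.

```javascript
// Hypothetical list of review angles.
const ANGLES = ["security", "performance", "readability"];

// Instead of Math.random(): derive variety from the index, so a resumed run
// makes exactly the same choices.
function angleFor(index) {
  return ANGLES[index % ANGLES.length];
}

// Instead of Date.now(): take the timestamp from outside the script.
// `args.startedAt` is an assumed field name.
function buildPrompts(files, args) {
  return files.map(
    (file, i) => `Review ${file} for ${angleFor(i)} (run started ${args.startedAt}).`
  );
}

const examplePrompts = buildPrompts(
  ["a.js", "b.js", "c.js", "d.js"],
  { startedAt: "2024-01-01T00:00:00Z" }
);
console.log(examplePrompts[3]);
// Review d.js for security (run started 2024-01-01T00:00:00Z).
```

Calling `buildPrompts` twice with the same inputs gives identical prompts, which is the property run-resume depends on.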
You now have the load-bearing model:
agent() = fresh context
Earn the cost: breadth / confidence / scale
agent · parallel(barrier) · pipeline(no barrier)
Pipeline by default
Think of one real task from your own work that passes the cost gate (breadth, confidence, or scale). Don't write the script — just name the task and which of the three reasons it qualifies under. Next session we'll turn it into a real pipeline with structured schema output, and meet the patterns that make workflows trustworthy: adversarial-verify and loop-until-dry.