Inside Ultracode: How One Setting Launches Dozens of Agents and Orchestrates Them Automatically
Ultracode is the top setting on Claude Code's effort slider, and it is not just a more powerful version of the levels below it. Activated with /effort ultracode, it combines xhigh reasoning with automatic workflow orchestration — and it fundamentally changes who is doing the coordinating.
Here is what actually happens when ultracode is on. You give Claude a large task. Instead of answering turn by turn, Claude writes a JavaScript orchestration script on the fly, specific to that task. That script then fans the work out across many parallel subagents: tens to hundreds in a single run, with a documented ceiling of 1,000 total agents per workflow and at most 16 running concurrently. Each agent is assigned a specific, bounded job. Then a separate layer of adversarial agents tries to refute their findings, and the run keeps iterating until the results converge under that scrutiny. Only then does a single verified answer reach you.
The breakthrough is where the coordination lives. In standard Claude Code, Claude itself is the orchestrator, making every decision turn by turn while intermediate results pile up in its context window until it runs out of room. With dynamic workflows, the generated JavaScript script is the orchestrator. The model spends its reasoning on the judgment inside each agent; the script handles the order, the routing, the looping, and the merging — and that coordination costs zero model tokens. In one documented run, 113 agents consumed 1.95 million tokens of actual work, while the JavaScript coordinating all of them spent nothing. Claude's context window ends up holding only the final verified answer, not the exhaust of every intermediate step.
This is what lets ultracode take on jobs that used to be measured in quarters. The flagship proof: Bun's creator used dynamic workflows to port Bun from Zig to Rust — roughly 750,000 lines of Rust, with 99.8% of the existing test suite passing, in eleven days from first commit to merge.
The shift underneath all of this is simple and large. Orchestration became a model decision instead of a developer decision. You used to write the harness that coordinated the agents. Now you write the goal, the constraints, and the trust boundaries — and Claude writes the harness to match.
Ultracode Is Not a Stronger Setting — It Is a Different One
It is tempting to read ultracode as simply the highest notch on a slider, a little more effort than max. That misreads what it does. Ultracode, activated with /effort ultracode, combines xhigh reasoning effort with automatic workflow orchestration, and that second half is the real change. With the levels below it, Claude answers your request directly, however deeply it thinks. With ultracode on, Claude first evaluates whether the task is large enough to warrant a workflow. If it is, Claude does not answer in the usual sense: it plans and launches an orchestration of many agents working in parallel. The same capability can also be triggered for a single task, without changing the session's effort level, by including the word "workflow" in your prompt. The distinction matters because it changes what you are doing when you turn it on. You are not asking one very smart model to think harder; you are asking it to manage a temporary team it assembles for the job.
How Dozens of Agents Get Launched — and What Each One Does
When a workflow triggers, Claude breaks the task into subtasks and dispatches a fleet of agents to work in parallel: tens to hundreds in a single run, with a documented hard ceiling of 1,000 total agents per workflow and at most 16 running concurrently to keep local resource use bounded. Each agent is handed a specific, bounded job rather than a vague share of the work. A common pattern is an understand-change-verify loop: one cluster of agents maps the architecture, a second cluster executes the changes, and a third cluster verifies the results, looping until verification passes. What makes this trustworthy rather than just fast is the adversarial layer. After the working agents produce their findings, a separate set of agents actively tries to refute those findings, and the run keeps iterating until the results converge under that scrutiny. Only then does a single, verified answer reach you. This is the difference between many agents producing a lot of plausible output and many agents producing one answer that has already survived an internal attempt to break it.
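The capped fan-out described above can be pictured with a small sketch. It is written in Python purely for illustration and is not the JavaScript that Claude Code generates. The names run_agent and fan_out and the job strings are hypothetical, and the agent is simulated with a short sleep instead of a model call. Only the two limits (1,000 total agents, 16 concurrent) come from the text.

```python
import asyncio

MAX_TOTAL_AGENTS = 1000  # per-workflow ceiling quoted in the text
MAX_CONCURRENT = 16      # concurrency cap quoted in the text


async def run_agent(job: str) -> str:
    """Hypothetical stand-in for one subagent; a real one would call a model."""
    await asyncio.sleep(0.01)  # simulate work
    return f"done: {job}"


async def fan_out(jobs: list[str], limit: int = MAX_CONCURRENT) -> tuple[list[str], int]:
    """Run every job, never more than `limit` at once.

    Returns the results in job order plus the peak concurrency observed.
    """
    if len(jobs) > MAX_TOTAL_AGENTS:
        raise ValueError(f"{len(jobs)} jobs exceeds the {MAX_TOTAL_AGENTS}-agent ceiling")

    semaphore = asyncio.Semaphore(limit)
    running = 0
    peak = 0

    async def bounded(job: str) -> str:
        nonlocal running, peak
        async with semaphore:  # waits here while `limit` agents are active
            running += 1
            peak = max(peak, running)
            try:
                return await run_agent(job)
            finally:
                running -= 1

    results = await asyncio.gather(*(bounded(job) for job in jobs))
    return list(results), peak


if __name__ == "__main__":
    jobs = [f"audit src/routes/file_{i}" for i in range(50)]
    results, peak = asyncio.run(fan_out(jobs))
    print(len(results), "results, peak concurrency", peak)
```

The semaphore is what keeps a large job list from starting every agent at once. Each job still gets its own bounded unit of work, and results come back in the order the jobs were submitted.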
The Real Breakthrough: Coordination Moves Out of the Model
The most important detail is the least visible one. In standard Claude Code, Claude itself is the orchestrator: it makes every decision turn by turn, and the intermediate results of each step accumulate inside its context window until the window fills and quality degrades. Dynamic workflows move the orchestration out of the model entirely. When a workflow triggers, Claude writes a short JavaScript program on the fly — a real script that holds all the loops, the branching logic, the routing decisions, and the intermediate variables. That script is the orchestrator. The model spends its expensive reasoning only on the judgment inside each agent; the cheap code decides the order, which stage runs when, and what carries forward — and that coordination costs zero model tokens. The numbers make this concrete: in one documented run, 113 agents consumed 1.95 million tokens doing actual work, while the JavaScript coordinating all of them spent nothing. The payoff is that Claude's context window ends up holding only the final verified answer, not the accumulated exhaust of every intermediate step — which is exactly why a workflow can sustain work far larger than any single context window could ever hold.
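To make the division of labor concrete, here is a minimal sketch of an orchestrator that is plain control flow. It is written in Python for illustration only; the script described in the text is JavaScript, and its real shape is not shown in the source. The function call_agent, its roles, and every token figure below are invented placeholders. The single "verify" stub stands in for both the verification and the adversarial review stages.

```python
from dataclasses import dataclass


@dataclass
class AgentResult:
    output: str
    tokens_used: int  # tokens spent inside the agent (model work)


def call_agent(role: str, payload: str, attempt: int) -> AgentResult:
    """Hypothetical stub for a model-backed agent.

    The token counts are made-up numbers for illustration only.
    """
    if role == "verify":
        # Pretend verification fails on the first attempt and passes afterwards.
        verdict = "pass" if attempt >= 2 else "fail"
        return AgentResult(output=verdict, tokens_used=500)
    return AgentResult(output=f"{role}({payload})", tokens_used=1000)


def run_workflow(task: str, max_rounds: int = 5) -> tuple[str, int, int]:
    """Understand, then change and verify in a loop until verification passes.

    Everything in this function is ordinary control flow: it routes outputs
    between agents and counts their tokens, but spends none itself.
    Returns (final_answer, total_agent_tokens, rounds_taken).
    """
    total_tokens = 0

    understanding = call_agent("understand", task, attempt=1)
    total_tokens += understanding.tokens_used

    for attempt in range(1, max_rounds + 1):
        change = call_agent("change", understanding.output, attempt)
        verdict = call_agent("verify", change.output, attempt)
        total_tokens += change.tokens_used + verdict.tokens_used
        if verdict.output == "pass":
            # Only the final answer is returned; intermediate outputs are dropped.
            return change.output, total_tokens, attempt

    raise RuntimeError(f"verification did not pass within {max_rounds} rounds")


if __name__ == "__main__":
    answer, tokens, rounds = run_workflow("port module")
    print(answer, tokens, rounds)
```

The loop, the branch on the verdict, and the running total all live in the script. Under these assumptions the only token spend is inside call_agent, and the caller receives just the final answer rather than each intermediate output.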
The Proof, and Knowing When to Reach for It
The flagship demonstration is concrete enough to settle any doubt about scale. The creator of Bun used dynamic workflows to port the entire project from Zig to Rust: roughly 750,000 lines of Rust, with 99.8% of the existing test suite passing, completed in eleven days from first commit to merge — work that would traditionally be planned in quarters. The workflow mapped correct Rust lifetimes for every struct field in the original codebase, then ran hundreds of parallel agents with independent reviewers checking the output. But ultracode is not a default to leave running. It costs the most tokens and time of any setting, so it earns its place only when three conditions are present together: the task is too large for a single context window, the right way to split it is not known in advance, and result quality matters more than token economy. That describes service-wide bug hunts, large migrations, security audits, performance reviews, and architecture analysis — not simple edits or single-file changes. The deeper shift is worth sitting with: orchestration used to be the developer's job. You wrote the harness that coordinated the agents. Now you write the success criteria, the constraints, and the trust boundaries, and Claude writes the harness to match. The human moved up a level — from coordinating the work to defining what success means.
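The three-condition rule above is simple enough to write down as a check. This is only an illustrative sketch: the Task fields and the should_use_workflow helper are hypothetical names, not part of Claude Code, and judging each condition for a real task remains a human call.

```python
from dataclasses import dataclass


@dataclass
class Task:
    fits_in_one_context: bool     # can a single context window hold the work?
    split_known_in_advance: bool  # is the decomposition already obvious?
    quality_over_cost: bool       # does result quality outweigh token spend?


def should_use_workflow(task: Task) -> bool:
    """True only when all three conditions from the text hold together."""
    return (
        not task.fits_in_one_context
        and not task.split_known_in_advance
        and task.quality_over_cost
    )


if __name__ == "__main__":
    migration = Task(fits_in_one_context=False, split_known_in_advance=False, quality_over_cost=True)
    small_edit = Task(fits_in_one_context=True, split_known_in_advance=True, quality_over_cost=False)
    print(should_use_workflow(migration), should_use_workflow(small_edit))
```

The conditions are joined with "and" on purpose: a task that is large but cost-sensitive, or important but small, fails the check.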
Prompt
# ULTRACODE & DYNAMIC WORKFLOWS: HOW TO USE
# Requires Claude Code v2.1.154+ (check: claude --version)
# Plans: Max & Team on by default · Pro manual · Enterprise admin enables

# ─── THREE WAYS TO TRIGGER A WORKFLOW ───
# 1. Turn ultracode ON for the whole session (Claude decides per task):
/effort ultracode
# With this on, Claude plans a workflow for every substantive task.

# 2. Trigger ONE workflow for a single task (just include the word workflow):
"Run a workflow to find every endpoint under src/routes/ missing authentication."

# 3. Run a bundled or saved workflow command:
/deep-research

# ─── WHAT HAPPENS UNDER THE HOOD ───
# Claude writes a JavaScript orchestration script on the fly for the task.
# - The script fans work across tens to hundreds of parallel subagents
# - Hard ceiling: up to 1,000 total agents per run, 16 concurrent
# - Each agent gets a specific, bounded job
# - A separate layer of ADVERSARIAL agents tries to refute the findings
# - The run iterates until results converge → one verified answer
# - The coordinating JavaScript costs ZERO model tokens

# ─── THE UNDERSTAND-CHANGE-VERIFY LOOP (automatic with ultracode) ───
# Cluster 1 → map the architecture
# Cluster 2 → execute the changes
# Cluster 3 → verify the results
# Loops until verification passes.

# ─── BUILT-IN SAFEGUARDS ───
# Token budget       → cap how much the whole run can spend
# Worktree isolation → each agent gets its own copy of the repo (no clobbering)
# Concurrency cap    → capped automatically; a run cannot fork-bomb your machine
# Resumability       → interrupt a run; finished agents return cached results,
#                      only the remaining ones run live
# Structured output  → hand agents a JSON Schema; invalid returns are retried
# Progress view      → each phase shows agent counts, token totals, elapsed time

# ─── WHEN TO USE IT (all three conditions together) ───
# 1. The task is too large for a single context window
# 2. The split strategy is unknown in advance
# 3. Result quality matters more than token economy
# Good fits: service-wide bug hunts, large migrations, security audits,
#            performance reviews, architecture analysis
# Bad fits:  simple edits, single-file changes, anything speed/cost-sensitive

# ─── ULTRACODE vs AGENT TEAMS ───
# Agent Teams                   → you define a FIXED, named roster of roles yourself
# Dynamic Workflows (ultracode) → Claude writes a FRESH harness on demand
# You write the success criteria; Claude writes the orchestration.

# ─── PRO TIPS ───
# Do NOT leave ultracode on for everyday work; it spends the most
# Reserve it for big, parallelizable jobs with a clear pass/fail bar (tests)
# Always set a token budget on large runs
# Use the progress view to watch agent counts and spend in real time
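The resumability safeguard listed in the prompt (finished agents return cached results, only the remaining ones run live) can be approximated with a small on-disk cache. The sketch below is an assumption about the general idea, not a description of how Claude Code implements it. It is in Python for illustration, and run_resumable, the JSON cache file, and the uppercase "agent" are all hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def run_resumable(
    jobs: list[str],
    run_agent: Callable[[str], str],
    cache_path: Path,
) -> tuple[dict[str, str], list[str]]:
    """Run each job once, persisting results so an interrupted run can resume.

    Returns (all_results, jobs_that_ran_live_this_time).
    """
    cache: dict[str, str] = {}
    if cache_path.exists():
        cache = json.loads(cache_path.read_text())

    ran_live: list[str] = []
    for job in jobs:
        if job in cache:
            continue  # finished in an earlier run: reuse the cached result
        cache[job] = run_agent(job)
        ran_live.append(job)
        # Persist after every job so an interruption loses at most one result.
        cache_path.write_text(json.dumps(cache))

    return {job: cache[job] for job in jobs}, ran_live


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache_file = Path(tmp) / "run_cache.json"

        def agent(job: str) -> str:
            return job.upper()  # placeholder for real agent work

        # First run stops after two jobs, as if it had been interrupted.
        run_resumable(["a", "b"], agent, cache_file)
        # Second run covers the full list; only the unfinished job runs live.
        results, live = run_resumable(["a", "b", "c"], agent, cache_file)
        print(results, live)
```

Writing the cache after every job trades a little I/O for a simple guarantee: whatever finished before the interruption is never paid for twice.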