From Doer to Team Leader: How to Run a Coordinated Network of AI Agents

This is for non-programmers before it is for programmers — pay attention, because the whole idea lives in how you organize your data, not in whether you can code.

Big projects and continuous work take long hours of focus and late nights. The way to compress that effort is to stop doing every task yourself and start running a network of AI agents that hand work off to each other — turning you from a person who does the task into a leader who directs an entire team of agents. Whether your field is coding or something completely different — management, marketing, or even a regular job — this system is the future, not just a solution for developers.

Four pieces make it work, and they fit together. First, Claude Cowork is the entry point for anyone, technical or not: give it a heavy task and it automatically splits the work into smaller pieces and runs several sub-agents in parallel. You do not configure anything; Claude decides when to fan out. Second, a Second Brain gives those agents the context to work well: your notes, decisions, and client history organized in simple files so the agents can actually find things instead of guessing. Third, when a job is genuinely huge, the ultracode setting in Claude Code plans the task, launches hundreds of sub-agents, and verifies its own output, so codebase-scale work gets done from one instruction. Fourth, GLM, the open-source model family from Z.ai, makes all of this affordable. It delivers most of the capability of the top models, its coding plan starts at a fraction of the price of the premium tiers, and its API runs at roughly half the per-token cost of comparable models, so the barrier stops being money.

The through-line is simple: the person moves up a level. You stop being the one grinding through every step and become the one who defines the goal, organizes the context, and directs the agents. That shift is the real skill of 2026 — and it belongs to everyone, not just engineers.

    The Real Shift: You Stop Doing, You Start Directing

    The instinct most people have with AI is to use it as a faster way to do their own work: a better search, a quicker draft, a smarter answer. That keeps you as the worker. You are still the one grinding through every step, just slightly faster. The shift that actually changes your output is different: you stop being the person who does the tasks and become the person who directs a team of agents that do them. This is not a coding concept. A marketer can run agents that research trends, draft posts, and update a content calendar. An operations person can run agents that process feedback, update records, and prepare reports. An employee in almost any role has repeatable work that can be handed off. The reason this matters now, and did not a year ago, is that the tools finally exist to make agents hand work to each other reliably. And the opening claim holds literally: the whole thing lives in how you organize your data, not in whether you can code. Get the organization right and the agents can find what they need; get it wrong and they guess, hallucinate, and waste your time. The rest of this piece covers the four parts that turn that idea into a working system.

    Cowork and the Second Brain: The Foundation Anyone Can Build

    The entry point for everyone is Claude Cowork, because it requires no technical setup. It runs inside the Claude Desktop app: you give it access to a folder and a heavy task, and it does something most people do not notice at first. It automatically breaks the work into smaller pieces and runs several sub-agents in parallel to get through them. You do not name them, assign them, or configure anything. Claude itself decides when a task is complex enough to fan out, and on a substantial job you will see multiple sub-agents working at once in the progress view. That is already a team of agents, handed to you with zero configuration. But agents are only as good as the context they work from, which is where the Second Brain comes in. A Second Brain is nothing exotic: it is your notes, decisions, and client history organized as simple markdown files and folders, with a router file at the top that tells the agent where everything lives. The discipline is to ingest the evergreen material you will still want in a year, such as locked decisions and durable notes, and to keep weekly-changing noise like emails and chat threads out, giving the agent access to fetch that on demand instead. Do this and your agents stop guessing: they know where to look, they find the right thing, and the work they hand each other is grounded in your real data rather than invented. Cowork gives you the team; the Second Brain gives the team its memory.
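
    The folder layout described above can be set up by hand in a few minutes. For anyone who prefers a script, here is a minimal sketch in Python. The file names follow the router example in this article, and the function name `scaffold_second_brain` and the starter text in each file are illustrative assumptions, not part of any Claude tooling.

```python
from pathlib import Path

# Router text modeled on the layout this article describes. The file names
# are illustrative; rename them to fit your own material.
ROUTER_TEXT = """# WHERE THINGS LIVE
- About me / how I work -> /context/about-me.md
- Locked decisions (dated log) -> /context/decisions.md
- Projects & clients -> /projects/<name>.md
Always read /context first. Do not search everything blindly.
"""


def scaffold_second_brain(root: Path) -> list[Path]:
    """Create the folder skeleton and router file; return the files created.

    Existing files are left untouched, so the function is safe to re-run.
    """
    files = {
        root / "CLAUDE.md": ROUTER_TEXT,
        root / "context" / "about-me.md": "# About me\n",
        root / "context" / "decisions.md": "# Decisions (dated log)\n",
    }
    # One file per project or client goes in here later.
    (root / "projects").mkdir(parents=True, exist_ok=True)

    created = []
    for path, text in files.items():
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():  # never overwrite notes you already wrote
            path.write_text(text, encoding="utf-8")
            created.append(path)
    return created
```

    Calling `scaffold_second_brain(Path("second-brain"))` creates the router and two starter files on the first run and returns an empty list on later runs. Only evergreen material belongs in these files; emails and chat threads stay outside and are fetched on demand.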

    Ultracode: When the Job Needs Hundreds of Agents at Once

    Cowork handles everyday heavy tasks, but some jobs are on a different scale entirely, and that is where ultracode comes in. It is the top setting on Claude Code's effort slider, defined as xhigh reasoning plus dynamic workflows. When you turn it on and hand it a large task, Claude does not answer in the usual sense. It writes an orchestration plan and launches hundreds of sub-agents in a single session, up to a documented ceiling of a thousand agents per run with sixteen working at any one time. It assigns each one a specific piece, then runs a separate layer of adversarial agents that try to disprove the findings, iterating until the results hold up. Only a single verified answer reaches you. This is what makes codebase-scale work possible: migrations across hundreds of thousands of lines and service-wide audits, done from one instruction. The caveat is that ultracode is not a default. It consumes the most tokens and time of any setting, so you reach for it deliberately when a task is genuinely large, decomposable, and worth the spend, and you stay on the lighter levels for everything else. It sits at the top of the same ladder as Cowork's automatic sub-agents: the same core idea of agents working in parallel, scaled up to its most powerful form for the rare jobs that truly need it.
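
    To make "a thousand agents per run with sixteen working at any one time" concrete, here is a small Python sketch of the general pattern: many queued pieces, a fixed concurrency limit, and a check on each result. It is not how ultracode is implemented. `run_agent` and `verify` are hypothetical stand-ins, and the single-pass check is much simpler than the iterative adversarial layer described above.

```python
import asyncio

MAX_CONCURRENT = 16   # concurrency figure quoted in this article
MAX_AGENTS = 1000     # per-run ceiling quoted in this article


async def run_agent(piece: str) -> str:
    """Hypothetical stand-in for one sub-agent; a real one would call a model."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return piece.upper()


async def verify(piece: str, result: str) -> bool:
    """Hypothetical stand-in for a checker that tries to reject a result."""
    await asyncio.sleep(0)
    return result == piece.upper()


async def fan_out(pieces: list[str]) -> dict[str, str]:
    """Run one worker per piece, with at most MAX_CONCURRENT active at once."""
    if len(pieces) > MAX_AGENTS:
        raise ValueError(f"{len(pieces)} pieces exceeds the {MAX_AGENTS}-agent cap")
    gate = asyncio.Semaphore(MAX_CONCURRENT)

    async def worker(piece: str) -> tuple[str, str, bool]:
        async with gate:  # blocks while MAX_CONCURRENT workers are already inside
            result = await run_agent(piece)
            ok = await verify(piece, result)
        return piece, result, ok

    outcomes = await asyncio.gather(*(worker(p) for p in pieces))
    # Keep only the results that passed verification.
    return {piece: result for piece, result, ok in outcomes if ok}
```

    A call such as `asyncio.run(fan_out(["file-a", "file-b"]))` returns one verified result per piece. The point of the sketch is the shape of the work: the task has to split into independent pieces before this kind of fan-out pays for itself.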

    GLM: The Model That Makes the Whole Thing Affordable

    The piece that removes the last barrier is the model itself. Running teams of agents on the most premium models gets expensive fast, and that cost is exactly what stops most non-programmers and small teams from starting. GLM is the answer to that. It is the open-source model family from Z.ai (the Chinese lab formerly called Zhipu AI, which became the first publicly traded foundation-model company after its Hong Kong IPO in early 2026), released under a permissive MIT license and tuned specifically for agentic, tool-using coding work. Crucially, GLM runs inside the same tools this whole system uses: Claude Code, Cline, Cursor, OpenCode, and more. Its GLM Coding Plan starts around eighteen dollars a month, a fraction of the hundred-to-two-hundred-dollar Claude Max tier, and its API runs at roughly half the per-token cost of comparable models. The top Claude models still lead on the hardest multi-file reasoning, so the pattern most serious builders adopt is not either-or but both: use the strongest model for the genuinely hard architectural work, and GLM for the high-volume everyday execution where its cost advantage compounds. For an Arabic-speaking creator or a small business watching every dollar, GLM is what turns this from an enterprise-only idea into something you can actually run today, by matching the right model to the right task and paying accordingly.

    Put the four pieces together (Cowork for the team, a Second Brain for context, ultracode for the huge jobs, and GLM to keep it affordable) and you have stopped being a person doing tasks and become someone directing a coordinated network of agents. That is the future the opening promised, and it is available now, to everyone.
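
    The "match the model to the task" pattern is easy to reason about with a little arithmetic. The Python sketch below is illustrative only. The GLM prices are the approximate per-million-token figures quoted in this article's prompt, while the `PREMIUM` prices are a placeholder assumption that does not come from the article and should be replaced with the current rate card. The names `Price`, `pick_model`, and `total_cost` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """US dollars per million tokens."""
    input_per_m: float
    output_per_m: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Dollar cost of one job with the given token counts."""
        return (input_tokens * self.input_per_m
                + output_tokens * self.output_per_m) / 1_000_000


# Approximate GLM API figures quoted in this article.
GLM = Price(input_per_m=1.40, output_per_m=4.40)
# ASSUMPTION: placeholder premium-model price, not taken from the article.
PREMIUM = Price(input_per_m=3.00, output_per_m=15.00)


def pick_model(hard_multi_file_reasoning: bool) -> str:
    """Send the hardest work to the premium model and everything else to GLM."""
    return "premium" if hard_multi_file_reasoning else "glm"


def total_cost(jobs: list[tuple[bool, int, int]]) -> float:
    """Sum the cost of jobs given as (is_hard, input_tokens, output_tokens)."""
    total = 0.0
    for is_hard, tokens_in, tokens_out in jobs:
        price = PREMIUM if pick_model(is_hard) == "premium" else GLM
        total += price.cost(tokens_in, tokens_out)
    return total
```

    With the GLM figures above, a job that reads one million tokens and writes one million tokens costs about $5.80. Swapping in your own prices and a realistic mix of hard and routine jobs shows how much of the monthly bill the routine work accounts for.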

    Prompt

    # THE AGENT-TEAM STACK — FOUR PIECES THAT FIT TOGETHER
    # For everyone, not just developers.
    
    # ─── PIECE 1: CLAUDE COWORK (the entry point — automatic, non-technical) ───
    # Claude Desktop app (macOS/Windows), paid plans. Give it a folder + a heavy task.
    # It AUTOMATICALLY splits the work and runs sub-agents in parallel.
    # You do NOT configure agents — Claude decides when to fan out.
    "Go through all 40 files in /client-feedback, group complaints into themes,
     pull 3 quotes per theme, and write a prioritized summary to feedback.md."
    # You will see multiple sub-agents running at once in the progress view.
    
    # ─── PIECE 2: A SECOND BRAIN (context so the agents work well) ───
    # Just markdown files + folders, organized so agents can FIND things.
    # A CLAUDE.md at the root acts as a ROUTER:
    """
    # WHERE THINGS LIVE
    - About me / how I work → /context/about-me.md
    - Locked decisions (dated log) → /context/decisions.md
    - Projects & clients → /projects/<name>.md
    Always read /context first. Do not search everything blindly.
    """
    # Rule: INGEST evergreen data (decisions, durable notes).
    #       Do NOT ingest weekly-changing noise (emails, Slack) — give the
    #       agent ACCESS to fetch it on demand (route to CRM/ClickUp via MCP).
    # Start at the lowest level that removes your pain; level up only when it hurts.
    
    # ─── PIECE 3: ULTRACODE (Claude Code — for genuinely huge jobs) ───
    # The top effort level: /effort ultracode  = xhigh + dynamic workflows.
    # Claude plans the task, launches HUNDREDS of sub-agents (up to
    # 1,000 per run, 16 running at a time), then verifies its own output.
    # Use it for codebase-scale work: large migrations, service-wide audits.
    # Do NOT leave it on for everyday tasks — it spends the most.
    
    # ─── PIECE 4: GLM (Z.ai) — the affordable engine ───
    # Open-source model family (MIT license) from Z.ai (formerly Zhipu AI).
    # Runs INSIDE the same harnesses: Claude Code, Cline, Cursor, OpenCode, etc.
    # GLM Coding Plan: from ~$18/mo (Lite) — a fraction of Claude Max ($100-200).
    # API: ~$1.40 / M input, ~$4.40 / M output — roughly half of Sonnet.
    # Trade-off: top Claude models still lead on the hardest multi-file work;
    #   GLM is the value pick for high-volume, everyday agentic coding.
    # Practical combo used by many: Claude for the hardest reasoning,
    #   GLM for the bulk execution — match the model to the task.
    
    # ─── HOW THE PIECES COMBINE (the workflow) ───
    # 1. Organize your data as a Second Brain so agents have real context
    # 2. Hand everyday heavy tasks to Cowork — automatic parallel sub-agents
    # 3. Escalate huge, decomposable jobs to ultracode in Claude Code
    # 4. Point the harness at GLM to keep the volume affordable
    # You direct; the agents execute and hand off to each other.
    
    # ─── THE MINDSET SHIFT ───
    # Old: you are the worker doing every task in sequence.
    # New: you define the goal + organize the context + direct the agents.
    # This applies to marketing, admin, ops — not only code.