Cross-framework enhancement overlay for choosing a multi-agent topology BEFORE writing any agent. A binary-question rubric — is single-agent + tools enough?...
SKILL.md

---
name: agentsop-agent-topology-selection
version: 0.1.0
description: >-
  Cross-framework enhancement overlay for choosing a multi-agent topology BEFORE writing any
  agent. A binary-question rubric — is single-agent + tools enough? do agents need to know
  about each other? does the output need one voice? — maps the answer to single-agent /
  supervisor / swarm / sequential / hierarchical. Activates when a coder agent is tempted to
  "split the work into roles" or reaches for a multi-agent framework. Encodes the *selection
  rubric* that the per-framework skills assume but never surface. Search keywords: when to
  use multi-agent, single vs multi agent, do I need multiple agents, supervisor vs swarm,
  multi-agent vs single agent, agent team design.
overlay: true
cross_links: [crewai, langgraph, bounded-loop]
---

# Multi-Agent Topology Selection · SOP (ENHANCE overlay)

> Overlay posture: this skill decides *whether and which* topology. It does not
> teach the API — descend to `[[crewai]]` or `[[agentsop-langgraph]]` for that. Every
> load-bearing claim carries an inline source tag resolving in
> `references/R1-source-evidence.md`.

---

## 1. 何时激活 (When to Activate)

Activate when **any** of the following fire:

- The task description contains "team of agents", "researcher + writer + reviewer",
  "manager agent", "agents that hand off", "split this into roles", or "multi-agent".
- A coder agent is about to instantiate ≥2 agents (CrewAI `Agent(...)` × N,
  LangGraph supervisor/swarm, OpenAI Swarm handoffs) and has **not yet** justified
  why a single agent with tools is insufficient.
- Someone is choosing between CrewAI `Process.sequential` vs `Process.hierarchical`,
  or LangGraph supervisor vs swarm vs hierarchical-teams, and wants the *rubric*,
  not the syntax.
- A multi-agent system is over budget on tokens/latency and the question is "can we
  collapse agents back into one?".

Do **not** activate for: a single LLM call, a one-shot RAG query, or a fixed
tool-call pipeline with no role separation. Those are the single-agent baseline
this skill defends.

> Mental check: *"An agent needs agency, otherwise it's just another script."*
> — João Moura, CrewAI founder `[[crewai · §1.3]]`. If you can write the control
> flow in `if/else`, you do not need multiple agents — you need one agent (or a
> graph) with explicit edges.

---

## 2. 核心心智模型 (Core Mental Model)

**Most "multi-agent" problems are single-agent + tools.** Add agents only when
*context isolation* or *parallel expertise* genuinely demands it.

> "Single-agent is right for approximately 80% of cases; the trap is reaching for
> multi-agent because it sounds more capable." `[[crewai · DC-1]]`

Two — and only two — forces justify a second agent:

1. **Context isolation.** One agent's working context would pollute another's
   (a critic that must not see its own draft's rationalisations; a tool-heavy
   sub-task whose 40 intermediate tool calls should not bloat the main thread).
   Splitting gives each agent a clean, bounded prompt.
2. **Parallel expertise.** Two *genuinely different* skills run concurrently or in
   strict sequence (research → write → review), where a single prompt provably
   cannot hold both jobs without quality collapse `[[crewai · DC-1]]`.

If neither force is present, **a single agent with the union of tools wins** —
fewer hops, fewer tokens, no handoff failures. This is the baseline the rubric
must beat, not the default to escape.

### The selection rubric (three binary questions)

```
Q0  Is single-agent + tools enough?
      (no context-isolation need, no parallel-expertise need)
        YES → single-agent + tools. STOP. Do not add agents.
        NO  → ↓

Q1  Do the agents need to KNOW ABOUT EACH OTHER (peer handoff)?
        NO  → one funnels through a coordinator → SUPERVISOR
              (or static order → SEQUENTIAL, if order is fixed)
        YES → ↓

Q2  Must the OUTPUT speak with ONE VOICE / single audit funnel?
        YES → SUPERVISOR (single user-facing persona, one funnel)
        NO  → SWARM (dynamic peer handoff, last-active agent remembered)

Scaling override: ≥6 specialists that group into teams → HIERARCHICAL
(supervisor-of-supervisors). Use only for grouping, not for routing.
```

The two questions that actually separate the patterns: **(a) can sub-agents know
each other, (b) is one user-facing voice mandated.** Everything else is tuning
`[[langgraph · Case 2]]`.

---

## 3. SOP 工作流 (Selection Protocol)

Walk top-down. Each gate can send you *back down* the ladder — collapsing agents
is as valid an answer as adding them.

### Step 1 · Defend the single-agent baseline first
Ask Q0. Enumerate the would-be roles. For each, ask: *would merging it into one
agent's prompt + toolset actually degrade output?* If you cannot point to a
concrete failure mode (style drift, missed checklist, context bloat, parallel
latency), the honest answer is single-agent + tools. Exit here ~80% of the time
`[[crewai · DC-1]]`.

### Step 2 · If splitting, decide static vs dynamic routing
- **Order is fixed and known at design time** (research always precedes write
  precedes review) → **SEQUENTIAL**. Cheapest, most debuggable, 1× token baseline
  `[[crewai · §2.3]]`. In CrewAI this is `Process.sequential`; in LangGraph it is
  static edges A→B→C.
- **Routing must be decided at runtime** (which specialist handles *this* query) →
  you need a coordinator or peer handoff → continue to Step 3.

### Step 3 · Coordinator (supervisor) vs peers (swarm)
Ask Q1 then Q2.
- Agents that do **not** know each other and funnel through one router →
  **SUPERVISOR**. Sub-agents are effectively tools the supervisor calls; the
  supervisor "translates" their output back to the user — which is *exactly* why
  it costs the most tokens `[[langgraph · Step 4]]`.
- Agents that **do** know each other and **no** single voice is mandated →
  **SWARM**. Dynamic handoff, last-active agent stays active across turns, no
  translation step → fewer tokens, slightly higher accuracy on the τ-bench retest
  `[[langgraph · §SOP Step 4]]`.

### Step 4 · Apply the supervisor-default caveat (read this twice)
LangChain's *own* benchmark found swarm "slightly outperformed supervisor across
all scenarios" and supervisor "consistently uses more tokens than swarm" — yet
they **still ship supervisor as the recommended default** `[[langgraph · Step 4]]`.
Why the nuance matters:
- Supervisor is the **safest with third-party / untrusted agents** (single funnel,
  single audit log, single place to enforce policy) `[[langgraph · Step 4]]`.
- Swarm is a **bad fit for third-party agents** — peers handing off to peers means
  no central control point `[[langgraph · Step 4]]`.
- So: **do not copy "default = supervisor" blindly.** If your agents are internal
  and trusted, swarm is often the better pick the default hides. Pick on the two
  questions, not on the framework's default.

### Step 5 · Verify the chosen topology can terminate
Any topology with runtime handoff (swarm, hierarchical, CrewAI delegation) can
loop. Bound it before shipping — cross-link `[[agentsop-bounded-loop]]`. Concretely:
default `allow_delegation=False` on workers, set per-agent `max_iter`, wrap an
outer timeout, and bake an explicit exit counter into state rather than trusting
the LLM to stop `[[crewai · DC-5]]` `[[agentsop-bounded-loop]]`.

### Step 6 · Reconsider before scaling agents up
≥6 specialists → group into **HIERARCHICAL teams** purely for *navigability*, not
to get free routing (see Anti-patterns). At >5 agents CrewAI starts hitting
coordination failure `[[crewai · §6.1]]`; that is a signal to group or collapse,
not to add more.

---

## 4. 操作模型 (Operation Models)

Format: **Trigger → Action → Output → Evidence**.

### OP-1 · Defend the single-agent baseline
- **Trigger**: a task is being described as "a team of agents".
- **Action**: list the proposed roles; for each, name the concrete failure that
  merging into one agent would cause (style drift / missed checklist / context
  bloat / required parallelism). No nameable failure ⇒ single agent.
- **Output**: single-agent + tools, OR a justified list of must-split roles.
- **Evidence**: `[[crewai · DC-1]]` "single-agent right for ~80% of cases".

### OP-2 · Single-vs-multi gate
- **Trigger**: baseline defended and at least one role has a real split-justifying
  failure mode.
- **Action**: confirm the force is *context isolation* or *parallel expertise* —
  not "it sounds more capable". If only the latter, stay single.
- **Output**: a yes/no on multi-agent with the force named in one sentence.
- **Evidence**: `[[crewai · §2.2]]`, `[[crewai · DC-1]]`.

### OP-3 · Static-order gate (→ sequential)
- **Trigger**: multi-agent confirmed; the order of work is fixed at design time.
- **Action**: choose SEQUENTIAL — list agents in execution order; pass dependencies
  explicitly (CrewAI `context=[...]`), do not rely on implicit transfer.
- **Output**: an ordered task list; CrewAI `Process.sequential` or LangGraph static
  edges.
- **Evidence**: `[[crewai · §2.3]]` (sequential = 1× tokens, low debug cost).

### OP-4 · Peer-awareness gate (Q1 → supervisor vs swarm branch)
- **Trigger**: routing must be decided at runtime.
- **Action**: ask "do sub-agents know about each other?" — NO ⇒ supervisor branch;
  YES ⇒ continue to OP-5.
- **Output**: chosen branch.
- **Evidence**: `[[langgraph · Step 4]]` decision tree.

### OP-5 · Single-voice gate (Q2 → supervisor vs swarm)
- **Trigger**: peers know each other (Q1=YES).
- **Action**: ask "must output be one voice / single audit funnel?" — YES ⇒
  SUPERVISOR; NO ⇒ SWARM.
- **Output**: SUPERVISOR or SWARM.
- **Evidence**: `[[langgraph · Case 2]]` (compliance ⇒ supervisor; UX continuity ⇒ swarm).

### OP-6 · Apply the supervisor-default caveat
- **Trigger**: supervisor was selected, OR you are about to accept a framework
  default.
- **Action**: check trust. Third-party/untrusted agents ⇒ supervisor is correct.
  Internal/trusted ⇒ re-test whether swarm's lower token cost wins; if so, switch.
- **Output**: a topology chosen on trust + the two questions, not on the default.
- **Evidence**: `[[langgraph · Step 4]]` — swarm beats supervisor on bench, yet
  supervisor remains the shipped default *for third-party safety*.

### OP-7 · Tune-before-switch
- **Trigger**: chosen topology is over token/latency budget.
- **Action**: before changing paradigm, apply the published fixes to the *current*
  one. For supervisor: remove handoff messages, add a forwarding-messages tool,
  optimise tool naming — LangChain measured "nearly 50% increase in performance"
  `[[langgraph · Step 4]]`. Switch paradigm only if still over budget.
- **Output**: a tuned topology or a justified migration.
- **Evidence**: `[[langgraph · Case 2]]`.

### OP-8 · Bound the topology
- **Trigger**: any runtime-handoff topology before ship.
- **Action**: `allow_delegation=False` on workers, per-agent `max_iter`, outer
  timeout, explicit state-based exit counter.
- **Output**: a topology that provably terminates.
- **Evidence**: `[[crewai · DC-5]]` (delegation ping-pong), `[[agentsop-bounded-loop]]`.

---

## 5. 困境决策案例 (Dilemma Cases)

### Case 1 · "Swarm beats supervisor on the benchmark — why ship supervisor?"
- **困境**: LangChain's own multi-agent benchmark shows swarm "slightly
  outperformed supervisor across all scenarios" and supervisor "consistently uses
  more tokens" (the telephone-game translation overhead). Yet LangChain's
  *recommended default* is still supervisor `[[langgraph · Step 4]]`. A coder
  agent copying the default would leave accuracy and tokens on the table.
- **约束**:
  - Want the benchmark's efficiency (swarm).
  - But may integrate third-party / untrusted agents later.
  - Need a single auditable funnel for tool calls (compliance).
- **决策步骤**:
  1. Read the default's *reason*, not the default. Supervisor wins on **safety with
     third-party agents** and **single audit funnel** — not on accuracy `[[langgraph · Step 4]]`.
  2. If all agents are internal and trusted **and** no single-voice mandate →
     pick **swarm**; the default does not apply to you.
  3. If a single auditable funnel is mandated → keep supervisor, then apply the
     three fixes (remove handoff messages, forwarding-messages tool, tool-name
     tuning) for the measured ~50% bump *before* concluding it is too slow
     `[[langgraph · Case 2]]`.
  4. Re-measure tokens; migrate to swarm only if still over budget and audit is
     tolerant.
- **结果**: Topology chosen on trust + the two questions. The benchmark's "swarm >
  supervisor" is true *and* the supervisor default is rational — for a *different*
  constraint (third-party safety) than the one the benchmark measured (accuracy/cost).
- **可提取的操作**: OP-6. **A framework default encodes the framework author's
  worst-case constraint, not yours. Decode the reason; re-derive for your case.**

### Case 2 · "CrewAI hierarchical for simple routing is structurally broken"
- **困境**: 5 agents, order roughly fixed, but the system should *route* — skip
  irrelevant specialists based on the query. `Process.hierarchical` looks like it
  "should auto-route". In practice the manager **executes all tasks** and the last
  task's output overwrites the rest — it does not skip on triage `[[crewai · DC-2]]`.
  ```
  Query: "Why is my laptop overheating?" (pure technical)
  Expected:  triage → technical_agent → done
  Hierarchical reality: triage → technical → billing → … → last output wins
  ```
- **约束**: routing is required; default `manager_llm` is unreliable; latency/cost
  matter.
- **决策步骤**:
  1. Recognise hierarchical-for-routing is the wrong tool: CrewAI hierarchical is a
     *coordination* topology, not a *router* `[[crewai · DC-2]]`.
  2. If routing logic fits in ~5 lines of Python → use a **CrewAI Flow** (`@router`
     / `@listen`) or a **LangGraph conditional edge** — explicit branch, then call
     one small crew / single agent per branch `[[crewai · DC-2]]`.
  3. If routing genuinely needs LLM semantic judgement → hierarchical **with a
     custom `manager_agent`** carrying an explicit branching backstory — *never*
     the bare default `manager_llm` `[[crewai · OP-4]]`.
  4. Bound it: workers `allow_delegation=False`, outer timeout `[[agentsop-bounded-loop]]`.
- **结果**: Routing handled by explicit control flow; hierarchical reserved for
  true coordination of trusted teams, never for if/else routing.
- **可提取的操作**: **Hierarchical ≠ router. Routing is control flow — make it
  explicit (Flow / conditional edge), don't delegate it to a manager LLM that will
  run everything.** Compare with `d-query-routing-skill` for the routing-specific rubric.

---

## 6. 反模式与边界 (Anti-patterns & Boundaries)

- **Multi-agent theater.** Splitting into roles because it "sounds more capable"
  with no context-isolation or parallel-expertise force. Symptom: agents that just
  pass a string along, each adding a paragraph. Fix: collapse to one agent + tools
  `[[crewai · DC-1]]`.
- **Hierarchical for simple routing.** Using `Process.hierarchical` (or a
  supervisor) to get free if/else routing. It runs everything; routing must be
  explicit control flow (Flow / conditional edge) `[[crewai · DC-2]]`.
- **Copying the supervisor default blindly.** The benchmark says swarm is better on
  accuracy *and* tokens; supervisor is the default for *third-party safety*. Internal
  trusted agents should reconsider swarm `[[langgraph · Step 4]]`.
- **Agent-count explosion (>5).** Coordination failure, token blow-up, debug pain.
  ≥6 ⇒ group into hierarchical teams *for navigability*, or merge near-duplicate
  roles `[[crewai · §6.1]]`.
- **Unbounded delegation.** `allow_delegation=True` on every agent ⇒ ping-pong
  loops. Default off on workers; bound with `max_iter` + timeout `[[crewai · DC-5]]`,
  `[[agentsop-bounded-loop]]`.
- **Picking on aesthetics.** Choosing supervisor/swarm/sequential by which "feels
  cleaner" instead of the two binary questions (peer-awareness, single-voice)
  `[[langgraph · Case 2]]`.

**Hard boundaries (this rubric does NOT decide):**
- *Which framework* — see `[[crewai]]` vs `[[agentsop-langgraph]]` ecosystem sections.
- *Routing by query kind* — that is `d-query-routing-skill`.
- *Bounding the loop* — that is `[[agentsop-bounded-loop]]`.
- *Latency < 200ms / single LLM call* — no multi-agent framework at all
  `[[langgraph · 反模式]]`.

---

## 7. 跨框架对照 (Cross-Framework Mapping)

| Topology | CrewAI | LangGraph | OpenAI Swarm | Choose when |
|---|---|---|---|---|
| **Single-agent + tools** | one `Agent` + tools (skip Crew) | `create_react_agent` | one routine | Q0=YES — ~80% of cases `[[crewai · DC-1]]` |
| **Sequential** | `Process.sequential` + `context=[...]` | static edges A→B→C | linear handoffs | order fixed at design time `[[crewai · §2.3]]` |
| **Supervisor** | `Process.hierarchical` + custom `manager_agent` | supervisor pattern (sub-agents as tools) | central routine dispatching | peers don't know each other; one voice / third-party safety `[[langgraph · Step 4]]` |
| **Swarm** | (no native; Flow + handoff funcs) | swarm pattern (dynamic handoff) | `handoff` between agents | peers know each other; no single-voice mandate; internal/trusted `[[langgraph · Step 4]]` |
| **Hierarchical teams** | nested crews via Flow | supervisor-of-supervisors / subgraphs | n/a | ≥6 specialists needing grouping `[[crewai · §6.1]]` |

Notes:
- **CrewAI** frames agents as role-playing teammates; `hierarchical` is coordination,
  *not* routing — the default `manager_llm` runs all tasks `[[crewai · DC-2]]`.
- **LangGraph** frames topology as routing logic over typed state; supervisor is the
  shipped default for third-party safety despite swarm winning the bench
  `[[langgraph · Step 4, Case 2]]`.
- **OpenAI Swarm** is the minimal handoff baseline — OpenAI labels it experimental;
  use as reference, not production `[[langgraph · 生态对照]]`.
- **Single-agent + tools** is the baseline every topology above must beat. Defend it
  first (OP-1).

> Pick the smallest topology that fits the two questions; promote upward only when a
> named force demands it, and collapse back down when the force disappears.

---

## 附录: 引用 (Citations)

Inline tags resolve to source-skill sections in `references/R1-source-evidence.md`:
- `[[crewai]]` = `/Users/5imp1ex/Desktop/Skill-Workplace/output/crewai-sop-skill/SKILL.md`
- `[[agentsop-langgraph]]` = `/Users/5imp1ex/Desktop/Skill-Workplace/output/langgraph-sop-skill/SKILL.md`
- The benchmark: LangChain, *Benchmarking Multi-Agent Architectures*
  (`www.langchain.com/blog/benchmarking-multi-agent-architectures`), surfaced via
  `[[langgraph · Step 4 / Case 2]]`.
Agentsop Agent Topology Selection

SKILL.md

related skills