flowchart TD
A["🤖 Multi-Agent Patterns"] --> B["1. Sequential / Pipeline"]
A --> C["2. Orchestrator-Worker"]
A --> D["3. Parallel Fan-Out → Fan-In"]
A --> E["4. Reflection / Self-Critique"]
A --> F["5. Router / Dispatch"]
A --> G["6. Planning + Execution"]
A --> H["7. Handoff"]
A --> I["8. Evaluator-Optimizer Loop"]
style A fill:#f0ebe4,stroke:#0d7c5f,color:#1a1a1a
style B fill:#faf6f1,stroke:#0d7c5f,color:#1a1a1a
style C fill:#faf6f1,stroke:#0d7c5f,color:#1a1a1a
style D fill:#faf6f1,stroke:#0d7c5f,color:#1a1a1a
style E fill:#faf6f1,stroke:#0d7c5f,color:#1a1a1a
style F fill:#faf6f1,stroke:#0d7c5f,color:#1a1a1a
style G fill:#faf6f1,stroke:#0d7c5f,color:#1a1a1a
style H fill:#faf6f1,stroke:#0d7c5f,color:#1a1a1a
style I fill:#faf6f1,stroke:#0d7c5f,color:#1a1a1a
Multi-Agent AI Patterns: A Developer’s Field Guide
The Gist
We solved the monolith problem in software engineering decades ago — we broke systems into microservices, each doing one thing well, communicating through clean interfaces. The same evolution is now happening in AI. When a single agent tries to research, write, code, review, and deploy all at once, it loses context, hallucinates, and forgets what it said three steps ago. The fix is the same: decompose.
Suman Das’s April 2026 article provides the most practical taxonomy I’ve seen of multi-agent design patterns — the architectural templates that define how multiple AI agents coordinate to solve problems that overwhelm any single agent. He identifies eight core patterns and three emerging ones, each with clear when-to-use / when-not-to-use guidance and framework support across LangGraph, CrewAI, AutoGen, Google ADK, and the Anthropic Agent SDK.
The key insight isn’t whether to use multiple agents — it’s which pattern fits your problem and when. Das provides a decision framework: start simple, add complexity only when you need it.
Why It Matters Now
Every major AI framework now supports multi-agent orchestration, but the documentation tends to show you how to wire agents together without telling you which wiring pattern to choose. This article fills that gap. If you’re building anything beyond a single chatbot — a research pipeline, a code generation system, a customer support platform — you’re implicitly choosing one of these patterns whether you know it or not. Knowing the taxonomy means making that choice deliberately.
The timing is also significant. As of early 2026, LangGraph, CrewAI, AutoGen, Google ADK, and the Anthropic Agent SDK have all matured their multi-agent APIs. The patterns Das describes aren’t theoretical — they map directly to production-ready framework primitives.
The Eight Core Patterns
1. Sequential / Pipeline
Agents in a straight line — each does its job and passes the result to the next. The assembly line of multi-agent systems.
When to use: Clear, ordered stages where each step has a distinct responsibility and you need auditability at every step.
When NOT to use: Stages are independent (use Parallel instead), or latency matters. Stages run one after another, so end-to-end latency is the sum of every agent's time, and one slow stage holds up everything behind it.
Sample Use Case: A content publishing pipeline — Agent A extracts key points from raw research, Agent B transforms them into a blog draft, Agent C validates facts and grammar, Agent D generates the final summary and SEO metadata.
# Sequential Pipeline with LangGraph
from typing import TypedDict

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END

# Assumption: the snippet needs an `llm`; any LangChain chat model works here.
llm = ChatOpenAI(model="gpt-4o-mini")


class PipelineState(TypedDict, total=False):
    """Shared state handed from one stage to the next."""
    raw_research: str
    key_points: str
    draft: str
    validated_draft: str
    final_output: str


def extract_agent(state: PipelineState) -> dict:
    """Agent A: Extract key points from research."""
    prompt = f"Extract the 5 key findings from: {state['raw_research']}"
    return {"key_points": llm.invoke(prompt).content}


def draft_agent(state: PipelineState) -> dict:
    """Agent B: Transform key points into a blog draft."""
    prompt = f"Write a blog post based on these points: {state['key_points']}"
    return {"draft": llm.invoke(prompt).content}


def validate_agent(state: PipelineState) -> dict:
    """Agent C: Fact-check and grammar review."""
    prompt = f"Review this draft for accuracy and grammar: {state['draft']}"
    return {"validated_draft": llm.invoke(prompt).content}


def summarize_agent(state: PipelineState) -> dict:
    """Agent D: Generate summary and SEO metadata."""
    prompt = f"Create a summary and SEO tags for: {state['validated_draft']}"
    return {"final_output": llm.invoke(prompt).content}


# Wire the pipeline: each node returns only the keys it updates
graph = StateGraph(PipelineState)
graph.add_node("extract", extract_agent)
graph.add_node("draft", draft_agent)
graph.add_node("validate", validate_agent)
graph.add_node("summarize", summarize_agent)
graph.add_edge("extract", "draft")
graph.add_edge("draft", "validate")
graph.add_edge("validate", "summarize")
graph.add_edge("summarize", END)
graph.set_entry_point("extract")
pipeline = graph.compile()
result = pipeline.invoke({"raw_research": "..."})

| Framework | Implementation |
|---|---|
| LangGraph | Native linear graph edges |
| CrewAI | Process.sequential |
| AutoGen | initiate_chats with sequential carryover |
| Google ADK | SequentialAgent (native workflow agent) |
| Anthropic Agent SDK | Claude chains steps through its reasoning loop |
2. Orchestrator-Worker (Hierarchical)
A smart manager agent that understands the big picture, dynamically decides what sub-tasks to create, delegates to specialists, monitors progress, and stitches results together. The key word is dynamic — the orchestrator reasons about the task and may change its plan mid-execution.
How is this different from Parallel Fan-Out? In Fan-Out, you know all sub-tasks upfront. Here, the orchestrator figures out the sub-tasks at runtime, may run some in sequence and others in parallel, and can reassign or retry if something fails.
When to use: Sub-tasks aren’t known upfront, workers may have dependencies on each other’s output, you need centralized coordination and adaptive replanning.
When NOT to use: The task is simple enough for a single agent, or all sub-tasks are independent and known upfront (use Parallel instead).
Sample Use Case: An e-commerce order system — the orchestrator delegates to Inventory, Payment, and Shipping agents. If Inventory reports “out of stock,” the orchestrator adapts by asking a Recommendation agent to suggest alternatives.
# Orchestrator-Worker with CrewAI
from crewai import Agent, Task, Crew, Process

# inventory_lookup_tool, payment_tool, and shipping_tool are assumed to be
# CrewAI tools defined elsewhere in your codebase.

orchestrator = Agent(
    role="Project Manager",
    goal="Coordinate the team to fulfill customer orders",
    backstory="Expert at breaking down complex orders and delegating.",
    llm="gpt-4o",
    allow_delegation=True,
)
inventory_agent = Agent(
    role="Inventory Specialist",
    goal="Check stock availability and report status",
    backstory="Has access to the warehouse database.",
    tools=[inventory_lookup_tool],
    llm="gpt-4o-mini",
)
payment_agent = Agent(
    role="Payment Processor",
    goal="Process payments securely",
    backstory="Handles all payment gateway interactions.",
    tools=[payment_tool],
    llm="gpt-4o-mini",
)
shipping_agent = Agent(
    role="Shipping Coordinator",
    goal="Calculate delivery options and schedule dispatch",
    backstory="Manages logistics and carrier APIs.",
    tools=[shipping_tool],
    llm="gpt-4o-mini",
)

# CrewAI's hierarchical process lets the manager delegate dynamically.
# The manager is passed as manager_agent and is not listed among the workers;
# passing manager_llm="gpt-4o" instead lets CrewAI create a default manager.
crew = Crew(
    agents=[inventory_agent, payment_agent, shipping_agent],
    tasks=[
        Task(
            description="Process order #{order_id}",
            expected_output="Order status covering stock, payment, and shipping",
        )
    ],
    process=Process.hierarchical,
    manager_agent=orchestrator,
)
# {order_id} in the task description is filled from inputs; "12345" is an example value
result = crew.kickoff(inputs={"order_id": "12345"})

| Framework | Implementation |
|---|---|
| LangGraph | Supervisor pattern with sub-graphs |
| CrewAI | Process.hierarchical with manager_llm |
| AutoGen | GroupChat with GroupChatManager |
| Google ADK | LlmAgent with sub_agents and transfer_to_agent |
| Anthropic Agent SDK | Parent agent spawns subagents via Agent tool |
3. Parallel / Fan-Out → Fan-In
The speed pattern. When independent sub-tasks don’t depend on each other, fire them all at once, then merge results. Unlike Orchestrator-Worker, there’s no smart manager — a simple splitter distributes pre-defined tasks, and a simple aggregator combines results.
When to use: All sub-tasks are known upfront and independent, latency is critical, you’re gathering data from multiple sources.
When NOT to use: Tasks have dependencies on each other, or you need dynamic task creation.
Sample Use Case: A competitive analysis tool — Agent 1 scrapes websites, Agent 2 pulls financial data, Agent 3 gathers social sentiment, Agent 4 searches news. None need each other’s output. Total time = slowest agent, not the sum of all four.
# Parallel Fan-Out with Python asyncio
import asyncio

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")


async def scrape_websites(query: str) -> str:
    """Agent 1: Scrape competitor websites."""
    response = await llm.ainvoke(f"Summarize competitor info for: {query}")
    return response.content


async def pull_financials(query: str) -> str:
    """Agent 2: Pull financial data from APIs."""
    response = await llm.ainvoke(f"Get financial summary for: {query}")
    return response.content


async def gather_sentiment(query: str) -> str:
    """Agent 3: Analyze social media sentiment."""
    response = await llm.ainvoke(f"Analyze social sentiment for: {query}")
    return response.content


async def search_news(query: str) -> str:
    """Agent 4: Search recent news articles."""
    response = await llm.ainvoke(f"Find recent news about: {query}")
    return response.content


async def competitive_analysis(query: str) -> str:
    """Fan-out to all agents, fan-in to aggregate."""
    # Fan-out: all agents run concurrently
    results = await asyncio.gather(
        scrape_websites(query),
        pull_financials(query),
        gather_sentiment(query),
        search_news(query),
    )
    # Fan-in: each agent already returned plain text, so join directly
    combined = "\n\n".join(results)
    final = await llm.ainvoke(
        f"Synthesize this competitive analysis:\n{combined}"
    )
    return final.content


report = asyncio.run(competitive_analysis("Tesla Q1 2026"))

| Framework | Implementation |
|---|---|
| LangGraph | Parallel branches with fan-in node, Send API |
| CrewAI | async_execution=True on tasks |
| AutoGen | a_initiate_chats with concurrent execution |
| Google ADK | ParallelAgent (native workflow agent) |
| Anthropic Agent SDK | Multiple Agent tool calls in a single message |
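The latency claim in the sample use case (total time is the slowest agent, not the sum) can be checked with simple arithmetic. The sketch below uses made-up per-agent latencies purely for illustration and ignores the cost of the final merge step.

```python
# Hypothetical per-agent latencies in seconds (illustrative numbers only).
latencies = {
    "scrape_websites": 2.0,
    "pull_financials": 3.5,
    "gather_sentiment": 1.5,
    "search_news": 4.0,
}


def sequential_latency(stage_seconds):
    """A pipeline runs stages back to back, so the delays add up."""
    return sum(stage_seconds)


def parallel_latency(stage_seconds):
    """Fan-out waits only for the slowest branch (merge overhead ignored)."""
    return max(stage_seconds)


seq = sequential_latency(latencies.values())
par = parallel_latency(latencies.values())
print(f"sequential: {seq:.1f}s, parallel: {par:.1f}s")
```

With these numbers the same four calls take 11.0 seconds in a pipeline and 4.0 seconds when fanned out, which is why the pattern only pays off when the sub-tasks are truly independent.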
4. Reflection / Self-Critique
One agent generates, another reviews, and they iterate until the output meets a quality bar. This gives your AI a built-in editor.
When to use: Code generation (write → test → fix cycles), content creation where first drafts aren’t good enough, any task with clear quality criteria.
When NOT to use: Real-time responses needed (iteration adds latency), first pass is usually correct, or no clear “good enough” criteria.
Sample Use Case: An automated code review system — the Generator writes a function, the Reviewer runs tests and checks for edge cases. If anything fails, feedback goes back. Repeat until all tests pass.
# Reflection / Self-Critique Loop
import json

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o")
MAX_ITERATIONS = 5


def generate(task: str, feedback: str = "") -> str:
    """Generator agent: produce or revise code."""
    context = f"Previous feedback: {feedback}\n" if feedback else ""
    return llm.invoke(
        f"{context}Write a Python function that: {task}"
    ).content


def review(code: str) -> dict:
    """Reviewer agent: evaluate code quality."""
    response = llm.invoke(
        f"""Review this code. Return only JSON with:
- "approved": true/false
- "score": 1-10
- "feedback": specific improvement suggestions
Code:
```python
{code}
```"""
    ).content
    # json.loads raises if the model wraps its answer in prose or a code fence
    return json.loads(response)


def reflection_loop(task: str) -> str:
    """Run the generate-review loop until quality threshold."""
    code = generate(task)
    for i in range(MAX_ITERATIONS):
        review_result = review(code)
        if review_result["approved"] and review_result["score"] >= 8:
            print(f"✅ Approved after {i+1} iteration(s)")
            return code
        print(f"🔄 Iteration {i+1}: Score {review_result['score']}/10")
        code = generate(task, feedback=review_result["feedback"])
    return code  # Return best effort after max iterations


final_code = reflection_loop("sort a list using merge sort")

| Framework | Implementation |
|---|---|
| LangGraph | Cycles with conditional stop |
| CrewAI | guardrail for validation, max_iter for retries |
| AutoGen | register_nested_chats for inner critic |
| Google ADK | LoopAgent with max_iterations and escalate=True |
| Anthropic Agent SDK | Claude self-evaluates through its reasoning loop |
5. Router / Dispatch
A lightweight router classifies the input once and sends it to the best-fit specialist. That’s it — the router’s job is done. It’s a traffic cop, not a project manager.
How is this different from Orchestrator-Worker? The router makes one decision (which agent?) and steps aside. It doesn’t break tasks into sub-tasks, wait for results, or aggregate anything. Use Router when the task goes to one specialist. Use Orchestrator when the task needs to be split across multiple specialists.
When to use: Diverse input types each needing a single specialist, cost optimization (route simple queries to cheaper models), customer support and helpdesks.
When NOT to use: Task needs splitting across multiple agents, all queries need the same processing, or you only have one specialist.
Sample Use Case: A SaaS customer support system — the Router classifies tickets as billing, technical, or feature-request, and each goes entirely to the appropriate specialist.
# Router / Dispatch Pattern
from enum import Enum

from langchain_openai import ChatOpenAI


class TicketType(Enum):
    BILLING = "billing"
    TECHNICAL = "technical"
    SALES = "sales"


# Lightweight router: uses a small, fast model
router_llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Specialist agents: can use different models/tools
billing_llm = ChatOpenAI(model="gpt-4o-mini")
technical_llm = ChatOpenAI(model="gpt-4o")  # Harder problems, better model
sales_llm = ChatOpenAI(model="gpt-4o-mini")


def route(query: str) -> TicketType:
    """Classify the query and return the ticket type."""
    response = router_llm.invoke(
        f"""Classify this customer query into exactly one category:
- billing: payment, invoices, subscriptions, pricing
- technical: bugs, errors, how-to, integrations
- sales: upgrades, new features, enterprise plans
Query: {query}
Category:"""
    ).content.strip().lower()
    try:
        return TicketType(response)
    except ValueError:
        # Unexpected label from the model: fall back to the strongest specialist
        return TicketType.TECHNICAL


def dispatch(query: str) -> str:
    """Route to the appropriate specialist agent."""
    ticket_type = route(query)
    agents = {
        TicketType.BILLING: billing_llm,
        TicketType.TECHNICAL: technical_llm,
        TicketType.SALES: sales_llm,
    }
    specialist = agents[ticket_type]
    return specialist.invoke(
        f"You are a {ticket_type.value} specialist. Help with: {query}"
    ).content


answer = dispatch("My API keeps returning 429 errors")

| Framework | Implementation |
|---|---|
| LangGraph | Conditional edges with routing functions |
| CrewAI | @router decorator in Flows, ConditionalTask |
| AutoGen | Custom speaker_selection_method in GroupChat |
| Google ADK | LlmAgent with sub_agents and routing instructions |
| Anthropic Agent SDK | LLM-driven dispatch via Agent tool descriptions |
6. Planning + Execution
Separate the thinking from the doing. A planner agent creates the full roadmap upfront, then executor agents carry out each step. The planner steps back and only re-engages if something fails and requires replanning.
How is this different from Orchestrator-Worker? The Orchestrator is a micromanager — involved at every step. Plan + Execute is more like an architect and builders — the architect draws the blueprint, hands it to builders, and only comes back if the foundation cracks. Planning and execution are clearly separated phases.
When to use: Complex multi-step goals, you need the ability to replan when intermediate results change the approach, research tasks and coding projects.
When NOT to use: Workflow is fixed (use Pipeline), need real-time decision-making at every step (use Orchestrator), or task is too simple to justify a planning step.
Sample Use Case: An automated research assistant — the Planner breaks down “write a market analysis report on EV batteries” into: (1) identify top 5 companies, (2) gather financial data, (3) analyze patents, (4) compare market share, (5) draft report. Executors handle each step. If Step 2 reveals a major company was missed, the Planner revises.
# Planning + Execution Pattern
import json

from langchain_openai import ChatOpenAI

planner_llm = ChatOpenAI(model="gpt-4o", temperature=0)
executor_llm = ChatOpenAI(model="gpt-4o-mini")


def create_plan(goal: str) -> list[dict]:
    """Planner agent: decompose goal into ordered steps."""
    response = planner_llm.invoke(
        f"""Break this goal into 3-7 concrete steps.
Return JSON array of objects with "step" and "description".
Goal: {goal}"""
    ).content
    return json.loads(response)


def execute_step(step: dict, context: str = "") -> str:
    """Executor agent: carry out a single step."""
    return executor_llm.invoke(
        f"""Previous context: {context}
Execute this step: {step['description']}
Be thorough and specific."""
    ).content


def should_replan(step_result: str, remaining_steps: list) -> bool:
    """Check if results require replanning."""
    response = planner_llm.invoke(
        f"""Given this result: {step_result}
And remaining plan: {json.dumps(remaining_steps)}
Should we replan? Reply YES or NO with brief reason."""
    ).content
    # Check the leading word so a "NO" whose reason contains "yes" is not misread
    return response.strip().upper().startswith("YES")


def plan_and_execute(goal: str, max_replans: int = 3) -> str:
    """Full plan-and-execute loop with bounded replanning."""
    steps = create_plan(goal)
    context = ""
    replans = 0
    i = 0
    while i < len(steps):
        result = execute_step(steps[i], context)
        context += f"\nStep {i+1} result: {result}"
        has_remaining = i < len(steps) - 1
        # The replan cap keeps a planner that always says YES from looping forever
        if has_remaining and replans < max_replans and should_replan(result, steps[i+1:]):
            print(f"🔄 Replanning after step {i+1}")
            replans += 1
            steps = steps[:i+1] + create_plan(
                f"Continue from: {context}\nOriginal goal: {goal}"
            )
        i += 1
    return context


report = plan_and_execute("Write a market analysis of EV batteries")

| Framework | Implementation |
|---|---|
| LangGraph | Plan-and-execute pattern (well-documented) |
| CrewAI | Task context dependencies + Flows for planning |
| AutoGen | Planner + executor via two-agent or GroupChat |
| Google ADK | Compose with SequentialAgent + LoopAgent |
| Anthropic Agent SDK | Claude naturally plans and executes through its loop |
7. Handoff
Sometimes an agent knows it’s out of its depth. Instead of hallucinating, it hands off the conversation — with full context — to a more capable agent or a human.
When to use: Customer service with escalation tiers, multi-domain assistants where scope changes mid-conversation, human-in-the-loop workflows.
When NOT to use: A single agent can handle everything, or you prefer routing upfront rather than mid-conversation handoffs.
Sample Use Case: A healthcare triage bot — Agent A handles general wellness questions. When it detects symptoms needing medical advice, it hands off to Agent B (medically-trained, with access to clinical guidelines). If urgent, Agent B escalates to Agent C — a human doctor — with the full conversation history.
# Handoff Pattern
from dataclasses import dataclass

from langchain_openai import ChatOpenAI

# Assumption: the snippet needs an `llm`; any LangChain chat model works here.
llm = ChatOpenAI(model="gpt-4o")


@dataclass
class ConversationState:
    messages: list
    current_agent: str
    metadata: dict


def general_agent(state: ConversationState) -> ConversationState:
    """Agent A: Handle general queries, detect escalation needs."""
    response = llm.invoke(
        f"""You are a general wellness assistant.
If the query involves symptoms, medications, or urgent health
concerns, respond with HANDOFF: medical_agent.
If you can handle it, respond normally.
Conversation: {state.messages}"""
    ).content
    if "HANDOFF:" in response:
        target = response.split("HANDOFF:")[1].strip()
        state.current_agent = target
        state.messages.append({
            "role": "system",
            "content": f"Handed off to {target} with full context"
        })
    else:
        state.messages.append({"role": "assistant", "content": response})
        state.current_agent = "done"  # answered: end this turn
    return state


def medical_agent(state: ConversationState) -> ConversationState:
    """Agent B: Specialized medical guidance with escalation."""
    response = llm.invoke(
        f"""You are a medical advisor with access to clinical guidelines.
Review the FULL conversation history for context.
If the situation is urgent, respond with HANDOFF: human_doctor.
Full history: {state.messages}"""
    ).content
    if "HANDOFF:" in response:
        state.current_agent = "human_doctor"
        state.messages.append({
            "role": "system",
            "content": "Escalated to human doctor (urgent case)"
        })
    else:
        state.messages.append({"role": "assistant", "content": response})
        state.current_agent = "done"  # answered: end this turn
    return state


# Dispatch loop
agents = {
    "general": general_agent,
    "medical_agent": medical_agent,
}


def run_conversation(query: str):
    state = ConversationState(
        messages=[{"role": "user", "content": query}],
        current_agent="general",
        metadata={}
    )
    # Stops once an agent answers ("done") or hands off to a non-automated
    # target such as "human_doctor"; without "done" a normal answer would
    # re-invoke the same agent forever.
    while state.current_agent in agents:
        state = agents[state.current_agent](state)
    return state

| Framework | Implementation |
|---|---|
| LangGraph | Conditional edges with full state transfer |
| CrewAI | allow_delegation=True for autonomous delegation |
| AutoGen | register_hand_off in Swarm pattern (v0.4+) |
| Google ADK | transfer_to_agent with session state context |
| Anthropic Agent SDK | Subagents return to parent (no direct peer handoff) |
8. Evaluator-Optimizer Loop
Generate multiple candidates, score them, and use the feedback to generate better ones. It’s evolution in action — each iteration gets closer to optimal.
When to use: You can define a clear scoring function, prompt optimization, query refinement, code optimization where you can measure performance.
When NOT to use: No clear evaluation criteria, you need a single quick answer, or generation cost is too high for multiple candidates.
Sample Use Case: An ad copy optimizer — the Generator creates 10 variations, the Evaluator scores each on readability, brand alignment, and CTA strength. The top 3 go back to the Generator with feedback. Repeat for 3 rounds and pick the winner.
# Evaluator-Optimizer Loop
import json

from langchain_openai import ChatOpenAI

generator_llm = ChatOpenAI(model="gpt-4o", temperature=0.9)
evaluator_llm = ChatOpenAI(model="gpt-4o", temperature=0)


def generate_candidates(brief: str, n: int = 5, feedback: str = "") -> list:
    """Generator: produce N candidate outputs."""
    context = f"Previous feedback: {feedback}\n" if feedback else ""
    response = generator_llm.invoke(
        f"""{context}Generate {n} different ad copy variations for:
{brief}
Return as JSON array of strings."""
    ).content
    return json.loads(response)


def evaluate(candidates: list, criteria: dict) -> list[dict]:
    """Evaluator: score each candidate on multiple criteria."""
    response = evaluator_llm.invoke(
        f"""Score each candidate on these criteria (1-10 each):
{json.dumps(criteria)}
Candidates: {json.dumps(candidates)}
Return JSON array of objects with "text", "scores", "total", "feedback"
Sorted by total score descending."""
    ).content
    return json.loads(response)


def evaluator_optimizer(brief: str, rounds: int = 3) -> str:
    """Run the full eval-optimize loop."""
    criteria = {
        "readability": "Clear, concise, easy to understand",
        "brand_voice": "Matches professional but friendly tone",
        "cta_strength": "Compelling call to action",
        "emotional_appeal": "Creates urgency or desire"
    }
    feedback = ""
    best_candidate = None
    best_score = float("-inf")
    for round_num in range(rounds):
        candidates = generate_candidates(brief, n=5, feedback=feedback)
        scored = evaluate(candidates, criteria)
        # Do not rely on the model having sorted its output
        best = max(scored, key=lambda item: item["total"])
        # Keep the best candidate across all rounds, not just the latest round
        if best["total"] > best_score:
            best_score = best["total"]
            best_candidate = best["text"]
        feedback = f"Top scorer ({best['total']}/40): {best['text']}\n"
        feedback += f"Improve on: {best['feedback']}"
        print(f"Round {round_num+1}: Best score = {best['total']}/40")
    return best_candidate


winner = evaluator_optimizer("Launch ad for an AI-powered code review tool")

| Framework | Implementation |
|---|---|
| LangGraph | Cycles with scoring nodes and conditional exit |
| CrewAI | Task output validation + guardrail + max_iter |
| AutoGen | register_nested_chats for evaluation before reply |
| Google ADK | LoopAgent with evaluator LlmAgent |
| Anthropic Agent SDK | Custom scoring tool, Claude iterates through loop |
Bonus: Emerging Patterns
Three patterns that are powerful in specific scenarios but less commonly deployed in production today.
Debate / Adversarial: Set up agents on opposing sides and let them challenge each other. A judge agent listens to both and makes the final call. Best for high-stakes decisions, fact verification, and red-teaming. Sample use case: an investment system where a Bull agent argues for, a Bear agent argues against, and a Judge synthesizes.
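A minimal sketch of the Bull/Bear/Judge setup described above, assuming each agent is just a callable that maps a prompt to text. The `Agent` alias, the `debate` function, and the stub agents are hypothetical names for illustration; in a real system each callable would wrap an LLM call.

```python
from typing import Callable

Agent = Callable[[str], str]  # hypothetical stand-in: prompt in, text out


def debate(question: str, bull: Agent, bear: Agent, judge: Agent, rounds: int = 2) -> str:
    """Alternate opposing arguments, then let a judge read the whole transcript."""
    transcript: list[str] = []
    for r in range(1, rounds + 1):
        # Each side sees the question plus everything argued so far.
        history = "\n".join(transcript)
        transcript.append(f"[Bull r{r}] " + bull(f"{question}\n{history}"))
        history = "\n".join(transcript)
        transcript.append(f"[Bear r{r}] " + bear(f"{question}\n{history}"))
    return judge(f"Question: {question}\n" + "\n".join(transcript))


# Stub agents so the sketch runs offline.
def bull_stub(prompt: str) -> str:
    return "Demand is growing."


def bear_stub(prompt: str) -> str:
    return "Margins are shrinking."


def judge_stub(prompt: str) -> str:
    bulls = prompt.count("[Bull")
    bears = prompt.count("[Bear")
    return f"Verdict after {bulls} bull and {bears} bear turns: hold."


verdict = debate("Should we buy ACME?", bull_stub, bear_stub, judge_stub)
print(verdict)
```

Keeping the judge outside the argument loop means it sees both sides in full before deciding, rather than being swayed turn by turn.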
Multi-Agent Group Chat: Multiple agents sit in a shared conversation, each contributing from their expertise. Best for brainstorming and simulating cross-functional team discussions. Expensive — every agent reads every message. AutoGen’s GroupChat + GroupChatManager has the best native support here.
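To make the cost remark concrete, here is a rough sketch of a round-robin group chat that counts how many messages get re-read. It assumes speakers are plain callables receiving the shared transcript; `group_chat` and the stub speakers are illustrative names and not AutoGen's API.

```python
from typing import Callable

Speaker = Callable[[list[str]], str]  # hypothetical: transcript in, reply out


def group_chat(topic: str, speakers: dict[str, Speaker], turns: int):
    """Round-robin chat over one shared transcript; also tallies messages read."""
    transcript = [f"moderator: {topic}"]
    messages_read = 0
    names = list(speakers)
    for turn in range(turns):
        name = names[turn % len(names)]    # simple round-robin speaker selection
        messages_read += len(transcript)   # the speaker re-reads the full history
        transcript.append(f"{name}: {speakers[name](transcript)}")
    return transcript, messages_read


# Stub speakers so the sketch runs offline.
speakers = {
    "engineer": lambda history: "We can ship it in two sprints.",
    "designer": lambda history: "The onboarding flow needs a rework first.",
    "marketer": lambda history: "Launch should line up with the conference.",
}

transcript, messages_read = group_chat("Plan the Q3 launch", speakers, turns=6)
print(len(transcript), messages_read)
```

Six turns already mean 21 message reads here, and the total grows roughly quadratically with the number of turns, which is where the expense comes from.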
Mixture of Agents (MoA): Use multiple different models to generate diverse responses, then refine and aggregate. Different models have different strengths — together, they’re better than any one alone. Best for maximum accuracy when cost and latency aren’t constraints. Sample use case: a legal contract review where Claude, GPT-4, and Gemini each analyze from different angles, and an aggregator synthesizes.
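A single-layer sketch of the idea, assuming each model is a callable from prompt to text. The `mixture_of_agents` function and the stub models are hypothetical; a real setup would wrap different LLM providers and might stack several refinement layers before aggregating.

```python
from typing import Callable

Model = Callable[[str], str]  # hypothetical stand-in: prompt in, text out


def mixture_of_agents(task: str, proposers: dict[str, Model], aggregator: Model) -> str:
    """One proposer layer followed by a single aggregation step."""
    # Each model answers the same task independently.
    proposals = {name: model(task) for name, model in proposers.items()}
    # The aggregator sees every proposal, labeled by its source.
    numbered = "\n".join(
        f"{i}. ({name}) {text}"
        for i, (name, text) in enumerate(proposals.items(), start=1)
    )
    return aggregator(
        f"Task: {task}\nCandidate answers:\n{numbered}\nSynthesize one answer."
    )


# Stub models so the sketch runs offline.
proposers = {
    "model_a": lambda task: "Clause 4 caps liability too low.",
    "model_b": lambda task: "The termination notice period is missing.",
    "model_c": lambda task: "Governing law is ambiguous.",
}


def aggregator_stub(prompt: str) -> str:
    # Count the numbered proposal lines to show the aggregator received them all.
    count = sum(1 for line in prompt.splitlines() if line[:1].isdigit())
    return f"Synthesized from {count} proposals"


summary = mixture_of_agents("Review this contract", proposers, aggregator_stub)
print(summary)
```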
The Decision Framework
Das closes with a simple decision framework that maps your core need to the right pattern:
| Need | Pattern |
|---|---|
| Can one agent handle it? | Don’t use multi-agent. Keep it simple. |
| Specialization? | Orchestrator-Worker or Router |
| Speed? | Parallel Fan-Out |
| Accuracy? | Reflection, Debate, or Mixture of Agents |
| Adaptability? | Plan + Execute |
| Graceful escalation? | Handoff |
Combining Patterns
In practice, production systems rarely use a single pattern in isolation. Das highlights common combinations: Router + Reflection (route to specialists, each with a quality loop), Orchestrator + Parallel + Reflection (decompose, fan out, each worker self-critiques, then aggregate), Plan + Execute + Debate (plan, execute, and use debate at critical decision points), and Router + Handoff + Human-in-the-Loop (classify, handle, escalate to humans if confidence is low).
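As one illustration of composing patterns, the sketch below wires Router + Reflection together: a router picks one specialist, and that specialist's draft goes through a bounded review loop. The keyword router, the word-count reviewer, and the stub specialists are all stand-ins chosen so the example runs offline; they are assumptions, not code from the article.

```python
from typing import Callable

Specialist = Callable[[str, str], str]         # (query, feedback) -> draft
Reviewer = Callable[[str], tuple[bool, str]]   # draft -> (approved, feedback)


def route(query: str) -> str:
    """Keyword router standing in for an LLM classifier."""
    return "billing" if "invoice" in query.lower() else "technical"


def with_reflection(specialist: Specialist, reviewer: Reviewer,
                    query: str, max_rounds: int = 3) -> str:
    """Draft, review, and revise until approved or the round limit is hit."""
    feedback = ""
    draft = ""
    for _ in range(max_rounds):
        draft = specialist(query, feedback)
        approved, feedback = reviewer(draft)
        if approved:
            break
    return draft


def handle(query: str, specialists: dict[str, Specialist], reviewer: Reviewer) -> str:
    """Router decides once; the chosen specialist runs its own quality loop."""
    return with_reflection(specialists[route(query)], reviewer, query)


# Stub agents so the sketch runs offline.
def technical(query: str, feedback: str) -> str:
    return "Retry later and use exponential backoff." if feedback else "Retry later."


def billing(query: str, feedback: str) -> str:
    return "Your invoice has been resent."


def reviewer(draft: str) -> tuple[bool, str]:
    # Toy quality bar: require at least four words.
    return (len(draft.split()) >= 4, "Add concrete steps.")


specialists = {"technical": technical, "billing": billing}
tech_answer = handle("My API keeps returning 429 errors", specialists, reviewer)
billing_answer = handle("Where is my invoice?", specialists, reviewer)
print(tech_answer)
print(billing_answer)
```

The router stays a single decision, as in the standalone pattern; only the selected specialist pays the extra latency of the review loop.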
Rubber-Ducking the Jargon
Orchestrator: A manager agent that decomposes tasks, delegates to workers, monitors progress, and reassembles results. Unlike a simple splitter, it reasons and adapts.
Fan-Out / Fan-In: Scatter-gather. Fan-out = split a task into parallel sub-tasks. Fan-in = collect and merge results. Borrowed from distributed systems terminology.
Reflection: An agent loop where output is reviewed and iteratively improved. The “reviewer” can be the same agent with a different prompt, or a separate specialized agent.
Router / Dispatch: A classification step that sends each input to exactly one specialist. Makes one decision and exits — no coordination, no aggregation.
Handoff: Mid-conversation transfer from one agent to another, including full context/state. Distinguished from routing (which happens upfront) by occurring after an agent realizes it can’t complete the task.
Mixture of Agents (MoA): Ensemble approach using multiple different LLMs, inspired by “Mixture-of-Agents Enhances Large Language Model Capabilities” (2024). Each model contributes unique strengths.
What to Watch Out For
This is a practitioner’s taxonomy, not an academic one. Das is writing from production experience, not from a formal analysis of multi-agent theory. The patterns are presented as clean categories, but real systems are messy — and as the “Combining Patterns” section acknowledges, you’ll usually blend two or three.
The framework comparisons are useful but will date quickly — all five frameworks are under active development. The article also doesn’t address cost analysis or failure modes in depth. Running multiple agents means multiplying API calls, and coordination failures (agents misunderstanding each other, infinite loops in reflection, routing errors) are real production concerns.
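One simple mitigation for runaway loops and multiplying API calls is a shared call budget. The sketch below is an assumption about how you might add one, not something the article prescribes; `CallBudget` and `BudgetExceeded` are hypothetical names, and the wrapped agent is a stub.

```python
class BudgetExceeded(RuntimeError):
    """Raised when the agents have used up their allotted calls."""


class CallBudget:
    """Counts agent calls across a run and refuses calls past a hard limit."""

    def __init__(self, max_calls: int):
        self.max_calls = max_calls
        self.used = 0

    def wrap(self, agent):
        """Return a guarded version of `agent` that draws from this budget."""
        def guarded(prompt: str) -> str:
            if self.used >= self.max_calls:
                raise BudgetExceeded(f"limit of {self.max_calls} calls reached")
            self.used += 1
            return agent(prompt)
        return guarded


# A reviewer stub that never approves, simulating an endless reflection loop.
budget = CallBudget(max_calls=5)
never_satisfied = budget.wrap(lambda prompt: "needs work")

stopped = False
try:
    while True:
        never_satisfied("review the draft")
except BudgetExceeded:
    stopped = True

print(budget.used, stopped)
```

Sharing one budget object across every agent in a run gives a single place to cap spend, independent of which pattern is doing the looping.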
Finally, the article is a Medium blog post, not a peer-reviewed paper. It’s well-sourced and practical, but treat it as an experienced engineer’s field guide rather than a formal reference.
So What?
If you’re building AI systems beyond a single chatbot, this taxonomy gives you the vocabulary and decision framework to choose your architecture deliberately. The practical takeaways: start with the simplest pattern that could work (often a single agent), escalate to Pipeline or Router when you need structure, and only reach for Orchestrator-Worker or Plan + Execute when you genuinely need dynamic coordination. The most common mistake is over-engineering — reaching for multi-agent when good tool use and a clean single-agent flow would suffice.
The code examples for each pattern give you a starting template you can adapt to your framework of choice. And the decision framework table above is worth printing out and sticking on your monitor.
Reproduction & Implementation
Environment Setup
# Core dependencies
pip install langchain langchain-openai langgraph

# Framework-specific (pick your stack)
pip install crewai      # CrewAI
pip install pyautogen   # AutoGen
pip install google-adk  # Google ADK

# Set your API key
export OPENAI_API_KEY="sk-..."

Pattern Selection Pseudo-Code
def select_pattern(requirements: dict) -> str:
    """Decision framework from the article."""
    if requirements.get("single_agent_sufficient"):
        return "No multi-agent needed"
    if requirements.get("fixed_ordered_stages"):
        return "Sequential / Pipeline"
    if requirements.get("diverse_input_types"):
        if requirements.get("one_specialist_per_input"):
            return "Router / Dispatch"
    if requirements.get("independent_subtasks"):
        if requirements.get("latency_critical"):
            return "Parallel Fan-Out → Fan-In"
    if requirements.get("dynamic_decomposition"):
        if requirements.get("realtime_coordination"):
            return "Orchestrator-Worker"
        else:
            return "Planning + Execution"
    if requirements.get("quality_iteration"):
        if requirements.get("clear_scoring_function"):
            return "Evaluator-Optimizer Loop"
        else:
            return "Reflection / Self-Critique"
    if requirements.get("escalation_tiers"):
        return "Handoff"
    if requirements.get("high_stakes_decisions"):
        return "Debate / Adversarial"
    return "Start with single agent + good tools"

Resources & Links
Original Article:
Framework Documentation:
Pattern Deep Dives:
- LangGraph Multi-Agent Patterns
- Google ADK Multi-Agent Systems
- AutoGen Conversation Patterns
- Anthropic: Building Effective Agents
Research Papers: