Session 1: When to Use Multiple Agents
Synopsis
Explores the motivations for multi-agent design, including specialization, scalability, modularity, and parallelism. Learners understand that multi-agent systems are useful only when they solve problems better than simpler alternatives.
Session Content
Session Overview
Duration: ~45 minutes
Audience: Python developers with basic programming knowledge and beginner-level familiarity with GenAI
Session Goal: Understand when a multi-agent design is useful, when it is unnecessary, and how to implement a simple multi-agent workflow using the OpenAI Responses API with Python.
Learning Objectives
By the end of this session, learners will be able to:
- Explain what an AI agent is in practical software terms
- Distinguish between single-agent and multi-agent designs
- Identify situations where multiple agents improve clarity, reliability, or maintainability
- Recognize cases where multiple agents add needless complexity
- Build a small Python prototype with specialized agents using the OpenAI Responses API
- Evaluate trade-offs such as cost, latency, coordination overhead, and debugging complexity
1. Introduction: What Is an Agent?
In GenAI applications, an agent is typically an LLM-powered component that is given:
- A role
- A goal
- Instructions
- Sometimes tools, memory, or access to external systems
An agent can:
- Interpret user requests
- Plan actions
- Generate content
- Evaluate outputs
- Route work to other components
- Use tools or APIs
Simple Mental Model
Think of an agent as a specialized software worker.
Examples:
- A Research Agent gathers relevant facts
- A Writer Agent turns facts into readable prose
- A Reviewer Agent checks quality and consistency
- A Router Agent decides which path to use
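In code, this mental model can be as plain as a small record type. The sketch below is illustrative only: `AgentSpec`, its fields, and `system_prompt()` are hypothetical names chosen for this example, not part of any SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical description of one agent: a role, a goal, and instructions."""
    name: str
    goal: str
    instructions: str
    tools: tuple[str, ...] = ()  # names of tools the agent may use, if any

    def system_prompt(self) -> str:
        """Render the role, goal, and instructions as a single system prompt."""
        return f"You are the {self.name}.\nGoal: {self.goal}\n{self.instructions}"


research_agent = AgentSpec(
    name="Research Agent",
    goal="Gather relevant facts",
    instructions="List facts as short bullet points. Do not speculate.",
)
print(research_agent.system_prompt())
```

Treating an agent as data like this makes it easier to see that "adding an agent" mostly means adding another prompt and another model call.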
Single-Agent Pattern
A single LLM handles the entire task.
Example:
“Read this support request, classify urgency, draft a reply, and summarize the issue.”
This is often the best place to start.
Multi-Agent Pattern
The task is split across multiple specialized LLM-powered components.
Example:
- Agent 1: classify the support issue
- Agent 2: draft the customer-facing reply
- Agent 3: check for policy compliance
This can improve modularity, but it also introduces coordination costs.
2. Why Not Use Multiple Agents by Default?
Multi-agent systems are appealing because they mirror human teams. But in software, more moving parts usually mean more complexity.
Costs of Multi-Agent Design
1. Higher Latency
If three agents run in sequence, the user waits for three model calls instead of one.
2. Higher Cost
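Each agent call consumes tokens, and each handoff typically resends context (such as the original request), so total usage grows with every agent added.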
3. More Failure Modes
Problems can occur due to:
- Miscommunication between agents
- Poor intermediate outputs
- Conflicting instructions
- Context loss across handoffs
4. Harder Debugging
A bad final result may come from:
- the planner
- the researcher
- the writer
- the evaluator
- the orchestration logic
5. Over-Engineering Risk
Many tasks do not need multiple agents. A well-prompted single agent often performs better than a poorly coordinated team.
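A back-of-the-envelope calculation makes the latency and cost points concrete. The numbers below are invented for illustration (they are not measured latencies or real prices), and `estimate_sequential` is a hypothetical helper.

```python
def estimate_sequential(
    latencies_s: list[float],
    tokens: list[int],
    price_per_1k_tokens: float,
) -> tuple[float, float]:
    """Return (total latency in seconds, total cost) for calls run one after another."""
    total_latency = sum(latencies_s)  # sequential calls add up
    total_cost = sum(tokens) / 1000 * price_per_1k_tokens
    return total_latency, total_cost


# Illustrative numbers only: one call versus a three-agent chain.
single = estimate_sequential([2.0], [800], price_per_1k_tokens=0.5)
chain = estimate_sequential([2.0, 2.0, 2.0], [600, 900, 1100], price_per_1k_tokens=0.5)

print("single agent (latency, cost):", single)
print("three agents (latency, cost):", chain)
```

The chain is slower because sequential latencies add, and more expensive because every agent pays for its own prompt and output tokens.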
Default Recommendation
Build up in this order:
- Start with a single-agent solution
- Add structured output
- Add tools
- Add retrieval
- Only then consider multiple agents, and only if there is a clear benefit
3. When Multiple Agents Are Useful
Use multiple agents when specialization creates meaningful value.
A. Tasks Have Distinct Sub-Problems
If the work naturally decomposes into different skills, multiple agents can help.
Example: Competitive analysis
- Research Agent gathers market facts
- Summarizer Agent condenses findings
- Strategy Agent proposes actions
These are related but distinct cognitive tasks.
B. You Need Separation of Concerns
If different steps should be independently testable or replaceable, agent separation can help.
Example: Content pipeline
- Outline Agent creates structure
- Draft Agent writes content
- Review Agent checks style and policy
This makes each component easier to improve over time.
C. Different Instructions Improve Reliability
Sometimes one prompt trying to do everything creates conflicts.
Example:
A single prompt saying:
- be creative
- be concise
- be compliant
- be critical
- be persuasive
This can lead to muddled outputs.
Instead, separate the roles:
- Creator Agent writes
- Critic Agent reviews
- Editor Agent refines
D. You Need Explicit Verification
A dedicated evaluator or reviewer agent can catch issues before final output.
Example:
For a code explanation app:
- Explainer Agent explains the code
- QA Agent checks whether the explanation matches the source
This is especially useful when correctness matters.
E. Parallel Work Is Valuable
If tasks can run independently, multiple agents may reduce total time.
Example:
A report generator that separately analyzes:
- financial trends
- product metrics
- customer sentiment
These can later be merged into one report.
F. Agent Roles Map Cleanly to Business Logic
If your product already has natural workflow stages, multiple agents may align well with the domain.
Examples:
- triage → diagnose → recommend
- research → draft → review
- detect → investigate → escalate
4. When a Single Agent Is Usually Better
A single agent is often enough when:
- The task is short and direct
- The task needs only a few reasoning steps
- One consistent style is desired
- There is little benefit from specialization
- Low latency matters
- Budget is constrained
- The intermediate outputs are not useful on their own
Examples Better as Single-Agent Tasks
- Summarize a meeting transcript
- Draft a polite email response
- Classify support tickets into a few categories
- Convert notes into a blog outline
- Extract structured data from text
Rule of Thumb
If you cannot clearly explain:
- what each agent does,
- why it needs a separate role,
- and how that improves results,
then you probably do not need multiple agents.
5. Common Multi-Agent Patterns
Pattern 1: Sequential Pipeline
One agent’s output becomes the next agent’s input.
Flow:
Research → Draft → Review
Best for:
Structured workflows with clear stages
Pros: Easy to understand
Cons: Slow if many steps are chained
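The sequential pipeline can be sketched without any model calls. In this minimal example the three stages are stand-in functions, and `run_pipeline` and `Stage` are hypothetical names; in a real system each stage would wrap a model call.

```python
from typing import Callable

# A stage takes text in and returns text out.
Stage = Callable[[str], str]


def run_pipeline(stages: list[tuple[str, Stage]], initial_input: str) -> dict[str, str]:
    """Run stages in order, feeding each output into the next stage.

    Returns every intermediate output keyed by stage name, so each
    handoff can be logged or inspected.
    """
    outputs: dict[str, str] = {}
    current = initial_input
    for name, stage in stages:
        current = stage(current)
        outputs[name] = current
    return outputs


# Stand-in stages (no model is called).
def research(topic: str) -> str:
    return f"facts about {topic}"


def draft(facts: str) -> str:
    return f"draft based on {facts}"


def review(text: str) -> str:
    return f"reviewed: {text}"


results = run_pipeline(
    [("research", research), ("draft", draft), ("review", review)],
    "pricing",
)
print(results["review"])
```

Keeping the intermediate outputs is a deliberate choice: it is what makes a pipeline easier to debug than a single opaque call.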
Pattern 2: Router Pattern
A router agent decides which specialist agent should handle the task.
Flow:
Router → Billing Agent / Technical Agent / Sales Agent
Best for:
Mixed-intent systems
Pros: Efficient specialization
Cons: Routing mistakes can hurt quality
Pattern 3: Generator-Critic Pattern
One agent creates, another evaluates.
Flow:
Writer → Critic → Writer (revision)
Best for:
Quality-sensitive outputs
Pros: Better reliability
Cons: More calls and orchestration
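One way to keep the generator-critic loop from running forever is to cap the number of rounds. The sketch below assumes a writer that accepts feedback and a critic that returns an approval flag plus feedback; `generate_with_critic` and both stand-in agents are hypothetical.

```python
from typing import Callable


def generate_with_critic(
    write: Callable[[str, str], str],
    critique: Callable[[str], tuple[bool, str]],
    task: str,
    max_rounds: int = 2,
) -> tuple[str, int]:
    """Draft, critique, and revise until approved or max_rounds is reached.

    write(task, feedback) returns a draft; critique(draft) returns
    (approved, feedback). Returns the last draft and the number of drafts written.
    """
    feedback = ""
    draft = ""
    for round_number in range(1, max_rounds + 1):
        draft = write(task, feedback)
        approved, feedback = critique(draft)
        if approved:
            return draft, round_number
    return draft, max_rounds


# Stand-in agents: the critic rejects any draft that lacks a greeting.
def write(task: str, feedback: str) -> str:
    prefix = "Hello, " if "greeting" in feedback else ""
    return f"{prefix}here is a reply about {task}."


def critique(draft: str) -> tuple[bool, str]:
    if draft.startswith("Hello"):
        return True, "ok"
    return False, "add a greeting"


final_draft, rounds = generate_with_critic(write, critique, "billing")
print(rounds, final_draft)
```

The `max_rounds` cap bounds the extra latency and cost that each revision round adds.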
Pattern 4: Parallel Specialists + Merger
Multiple agents work independently, then another agent combines the results.
Flow:
Agent A + Agent B + Agent C → Synthesizer
Best for:
Independent analysis tracks
Pros: Good decomposition
Cons: Requires strong synthesis step
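A minimal sketch of this pattern, using a thread pool from the standard library. The specialist and synthesizer here are stand-ins (the `time.sleep` call simulates a model call), and the function names are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def analyze(track: str) -> str:
    """Stand-in for one specialist agent; the sleep simulates a model call."""
    time.sleep(0.2)
    return f"{track}: summary"


def synthesize(parts: list[str]) -> str:
    """Stand-in for the synthesizer agent: merge the specialist outputs."""
    return " | ".join(parts)


tracks = ["financial trends", "product metrics", "customer sentiment"]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(tracks)) as pool:
    # map() preserves input order, so results line up with tracks.
    parts = list(pool.map(analyze, tracks))
elapsed = time.perf_counter() - start

report = synthesize(parts)
print(report)
print(f"elapsed: {elapsed:.2f}s")
```

Threads suit this case because model calls spend most of their time waiting on the network. The specialists run concurrently, so the wait is closer to the slowest call than to the sum of all three, but the synthesis step still runs afterwards and adds its own call.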
6. Decision Framework: Should You Use Multiple Agents?
Use this checklist before introducing more agents.
Ask These Questions
1. Is the task naturally separable?
If yes, multi-agent may help.
2. Do the subtasks require meaningfully different instructions?
If yes, specialization may improve performance.
3. Would intermediate outputs be useful on their own?
If yes, a pipeline may be valuable.
4. Do you need an explicit review or safety layer?
If yes, add a reviewer or evaluator agent.
5. Can you afford extra latency and cost?
If no, keep it simpler.
6. Will orchestration complexity outweigh benefits?
If yes, avoid multi-agent design.
Practical Guideline
Use multiple agents when you need at least one of these:
- specialization
- routing
- verification
- modularity
- parallel analysis
Otherwise, prefer one agent.
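The checklist can also be written down as a small function, which forces each answer to be explicit. This is only one possible reading of the questions above: `TaskProfile` and `recommend` are hypothetical names, and treating questions 5 and 6 as vetoes is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Answers to the six checklist questions for one task."""
    naturally_separable: bool
    needs_different_instructions: bool
    intermediate_outputs_useful: bool
    needs_review_layer: bool
    can_afford_latency_and_cost: bool
    orchestration_outweighs_benefit: bool


def recommend(profile: TaskProfile) -> str:
    """Return 'single-agent' or 'multi-agent' based on the checklist."""
    # Questions 5 and 6 act as vetoes in this sketch.
    if not profile.can_afford_latency_and_cost or profile.orchestration_outweighs_benefit:
        return "single-agent"
    # Questions 1-4: at least one clear benefit is needed.
    benefits = [
        profile.naturally_separable,
        profile.needs_different_instructions,
        profile.intermediate_outputs_useful,
        profile.needs_review_layer,
    ]
    return "multi-agent" if any(benefits) else "single-agent"


meeting_summary = TaskProfile(False, False, False, False, True, False)
legal_intake = TaskProfile(True, True, True, True, True, False)
print(recommend(meeting_summary))
print(recommend(legal_intake))
```

A function like this does not replace judgment, but it makes the reasoning behind an architecture choice easy to review.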
7. Architecture Example: Customer Support Assistant
Let’s compare two designs.
Option A: Single Agent
Input: Customer message
Output: Category, urgency, and response draft
This is simple and often sufficient.
Option B: Multi-Agent
- Classifier Agent: category + urgency
- Reply Agent: draft response based on category
- Policy Reviewer Agent: check tone and policy alignment
When Option B Is Better
Use Option B if:
- categories strongly affect response strategy
- policy compliance matters
- you want to audit intermediate decisions
- each step changes frequently and should be updated independently
When Option A Is Better
Use Option A if:
- support volume is low
- stakes are moderate
- speed matters more than modularity
- a single prompt already works well
8. Hands-On Exercise 1: Build a Single-Agent Baseline
Goal
Create a simple single-agent support assistant that:
- classifies the support request
- estimates urgency
- drafts a customer reply
This baseline helps us compare against a multi-agent design later.
Prerequisites
Install the OpenAI Python SDK and python-dotenv:
pip install openai python-dotenv
Create a .env file:
OPENAI_API_KEY=your_api_key_here
Code: Single-Agent Baseline
import json
import os
from dotenv import load_dotenv
from openai import OpenAI
# Load environment variables from .env
load_dotenv()
# Create an OpenAI client using the API key from environment variables
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
# Example customer message to process
customer_message = """
Hi team,
I've been charged twice for my monthly subscription.
I need this fixed today because our finance team is closing books.
Thanks.
""".strip()
# Define a system prompt with a clear role and expected output format
system_prompt = """
You are a helpful customer support assistant.
Your job:
1. Classify the support issue into one of these categories:
- billing
- technical
- account
- general
2. Estimate urgency as one of:
- low
- medium
- high
3. Draft a short, polite response to the customer.
Return valid JSON with exactly these keys:
- category
- urgency
- reply
""".strip()
# Call the Responses API
response = client.responses.create(
    model="gpt-5.4-mini",
    input=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": customer_message},
    ],
)
# The SDK provides a convenience property for the text output
raw_text = response.output_text
# Parse the returned JSON
result = json.loads(raw_text)
print("Single-Agent Result:")
print(json.dumps(result, indent=2))
Example Output
{
"category": "billing",
"urgency": "high",
"reply": "Hi, thanks for reaching out. I’m sorry to hear you were charged twice for your subscription. We understand the urgency and will review the duplicate charge as quickly as possible. Please share any relevant invoice or billing details if available, and our team will work to resolve this today."
}
Discussion
This works well because the task is compact and the outputs are tightly related.
Why this is a good baseline:
- simple implementation
- one API call
- easy to debug
- low latency compared to a multi-step workflow
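The baseline passes the model's text straight to `json.loads`, which raises an exception if the model wraps its JSON in a Markdown code fence or omits a key. A more defensive parser might look like the sketch below; `parse_json_reply` is a hypothetical helper, not part of the OpenAI SDK, and it needs no API call to try out.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker


def parse_json_reply(raw_text: str, required_keys: set[str]) -> dict:
    """Parse model text as a JSON object and check that required keys exist.

    Tolerates a surrounding Markdown code fence. Raises ValueError with a
    readable message if the text is not a JSON object or a key is missing.
    """
    text = raw_text.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines)
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model did not return valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object")
    missing = required_keys - set(data)
    if missing:
        raise ValueError(f"Missing keys: {sorted(missing)}")
    return data


raw = '{"category": "billing", "urgency": "high", "reply": "Thanks for reaching out."}'
result = parse_json_reply(raw, {"category", "urgency", "reply"})
print(result["category"])
```

In the baseline, this would replace the direct `json.loads(raw_text)` call. The same helper becomes more valuable in a multi-agent workflow, where every handoff is another place for parsing to fail.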
9. Hands-On Exercise 2: Build a Multi-Agent Version
Goal
Refactor the support assistant into multiple specialized agents:
- Classifier Agent
- Reply Agent
- Reviewer Agent
This exercise demonstrates when multiple agents may be useful.
Code: Multi-Agent Support Workflow
import json
import os
from dotenv import load_dotenv
from openai import OpenAI
# Load environment variables
load_dotenv()
# Initialize the OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
customer_message = """
Hi team,
I've been charged twice for my monthly subscription.
I need this fixed today because our finance team is closing books.
Thanks.
""".strip()
def run_agent(system_prompt: str, user_input: str) -> str:
    """
    Utility function to send a prompt to the Responses API
    and return the text output.
    """
    response = client.responses.create(
        model="gpt-5.4-mini",
        input=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_input},
        ],
    )
    return response.output_text.strip()
# -----------------------------
# Agent 1: Classifier
# -----------------------------
classifier_prompt = """
You are a support ticket classifier.
Classify the customer issue into one of:
- billing
- technical
- account
- general
Also assign urgency:
- low
- medium
- high
Return valid JSON with exactly these keys:
- category
- urgency
- reasoning
Keep reasoning to one sentence.
""".strip()
classification_text = run_agent(classifier_prompt, customer_message)
classification = json.loads(classification_text)
print("Classifier Output:")
print(json.dumps(classification, indent=2))
# -----------------------------
# Agent 2: Reply Writer
# -----------------------------
reply_prompt = """
You are a customer support reply assistant.
Write a short, polite, professional response to the customer.
Acknowledge the issue and urgency.
Do not promise a refund unless explicitly confirmed.
Do not invent policy details.
Return valid JSON with exactly this key:
- reply
""".strip()
reply_input = f"""
Customer message:
{customer_message}
Classification:
{json.dumps(classification, indent=2)}
""".strip()
reply_text = run_agent(reply_prompt, reply_input)
reply_result = json.loads(reply_text)
print("\nReply Agent Output:")
print(json.dumps(reply_result, indent=2))
# -----------------------------
# Agent 3: Reviewer
# -----------------------------
review_prompt = """
You are a support quality reviewer.
Review the drafted reply for:
- politeness
- clarity
- consistency with the classification
- avoidance of unsupported promises
Return valid JSON with exactly these keys:
- approved
- feedback
- revised_reply
If the draft is already good, set approved to true and keep revised_reply the same.
If not, set approved to false and provide an improved revised_reply.
""".strip()
review_input = f"""
Customer message:
{customer_message}
Classification:
{json.dumps(classification, indent=2)}
Draft reply:
{reply_result["reply"]}
""".strip()
review_text = run_agent(review_prompt, review_input)
review_result = json.loads(review_text)
print("\nReviewer Output:")
print(json.dumps(review_result, indent=2))
# Final reply selection
final_reply = review_result["revised_reply"]
print("\nFinal Customer Reply:")
print(final_reply)
Example Output
Classifier Output:
{
"category": "billing",
"urgency": "high",
"reasoning": "The customer reports a duplicate subscription charge and indicates the issue is time-sensitive due to finance deadlines."
}
Reply Agent Output:
{
"reply": "Hi, thank you for reaching out, and I’m sorry to hear about the duplicate charge on your subscription. We understand this is urgent, especially with your finance team closing books today. We’ll review the issue as quickly as possible and follow up with next steps shortly."
}
Reviewer Output:
{
"approved": true,
"feedback": "The reply is polite, clear, and does not make unsupported promises.",
"revised_reply": "Hi, thank you for reaching out, and I’m sorry to hear about the duplicate charge on your subscription. We understand this is urgent, especially with your finance team closing books today. We’ll review the issue as quickly as possible and follow up with next steps shortly."
}
Final Customer Reply:
Hi, thank you for reaching out, and I’m sorry to hear about the duplicate charge on your subscription. We understand this is urgent, especially with your finance team closing books today. We’ll review the issue as quickly as possible and follow up with next steps shortly.
10. Exercise Debrief: Was Multi-Agent Better?
Benefits Observed
- The classification logic is isolated
- The reply generation can be improved independently
- The reviewer adds a verification step
- Intermediate outputs are visible and auditable
Costs Observed
- Three model calls instead of one
- More code and orchestration
- More parsing and error handling needed
- More latency
Main Lesson
The multi-agent version is not automatically “better.” It is better only if the added modularity and review step are worth the cost.
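The "more parsing and error handling" cost shows up at each handoff. One way to contain it is to validate the classifier's output before the Reply Agent sees it. The sketch below assumes the categories and urgency levels from the classifier prompt; the `Classification` class is a hypothetical contract, not something the workflow above defines.

```python
from dataclasses import dataclass

ALLOWED_CATEGORIES = {"billing", "technical", "account", "general"}
ALLOWED_URGENCY = {"low", "medium", "high"}


@dataclass(frozen=True)
class Classification:
    """Typed handoff between the Classifier Agent and the Reply Agent."""
    category: str
    urgency: str
    reasoning: str

    @classmethod
    def from_dict(cls, data: dict) -> "Classification":
        """Build a Classification, rejecting values outside the allowed sets."""
        category = str(data.get("category", "")).strip().lower()
        urgency = str(data.get("urgency", "")).strip().lower()
        if category not in ALLOWED_CATEGORIES:
            raise ValueError(f"Unexpected category: {category!r}")
        if urgency not in ALLOWED_URGENCY:
            raise ValueError(f"Unexpected urgency: {urgency!r}")
        return cls(category, urgency, str(data.get("reasoning", "")))


checked = Classification.from_dict(
    {"category": "Billing", "urgency": "high", "reasoning": "Duplicate charge."}
)
print(checked)
```

Failing fast at the handoff keeps a bad classification from silently shaping the reply and the review that follow it.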
11. Hands-On Exercise 3: Add a Router Agent
Goal
Create a small router that sends incoming messages to the right specialist:
- Billing Agent
- Technical Agent
- General Agent
This demonstrates one of the most common real-world multi-agent patterns.
Code: Router-Based Workflow
import json
import os
from dotenv import load_dotenv
from openai import OpenAI
load_dotenv()
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
customer_message = """
Hello,
Our team cannot log in to the dashboard after resetting passwords.
Can you help us regain access quickly?
""".strip()
def run_agent(system_prompt: str, user_input: str) -> str:
    """
    Send input to the model and return text output.
    """
    response = client.responses.create(
        model="gpt-5.4-mini",
        input=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_input},
        ],
    )
    return response.output_text.strip()
# -----------------------------
# Router Agent
# -----------------------------
router_prompt = """
You are a router for support requests.
Choose exactly one route:
- billing
- technical
- general
Return valid JSON with exactly these keys:
- route
- reason
""".strip()
route_result = json.loads(run_agent(router_prompt, customer_message))
print("Router Decision:")
print(json.dumps(route_result, indent=2))
# -----------------------------
# Specialist Agents
# -----------------------------
billing_prompt = """
You are a billing support assistant.
Draft a concise response for billing issues.
Return valid JSON with exactly this key:
- reply
""".strip()
technical_prompt = """
You are a technical support assistant.
Draft a concise response for login, access, or product malfunction issues.
Return valid JSON with exactly this key:
- reply
""".strip()
general_prompt = """
You are a general support assistant.
Draft a concise response for general inquiries.
Return valid JSON with exactly this key:
- reply
""".strip()
specialist_prompts = {
    "billing": billing_prompt,
    "technical": technical_prompt,
    "general": general_prompt,
}
selected_route = route_result["route"]
selected_prompt = specialist_prompts[selected_route]
specialist_reply = json.loads(run_agent(selected_prompt, customer_message))
print("\nSpecialist Reply:")
print(json.dumps(specialist_reply, indent=2))
Example Output
Router Decision:
{
"route": "technical",
"reason": "The user reports login and access issues after password resets."
}
Specialist Reply:
{
"reply": "Hi, thanks for contacting us. I’m sorry your team is having trouble logging in after resetting passwords. This appears to be an access issue, and we recommend verifying the affected account emails and any error messages you are seeing so we can help restore access as quickly as possible."
}
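The workflow above looks up `specialist_prompts[selected_route]` directly, so a router reply such as "Technical" or "account" would raise a `KeyError`. A small guard with a fallback route is one way to handle that. In this sketch the prompts are shortened placeholders, and `select_prompt` and the choice of "general" as the default are assumptions.

```python
SPECIALIST_PROMPTS = {
    "billing": "You are a billing support assistant.",
    "technical": "You are a technical support assistant.",
    "general": "You are a general support assistant.",
}


def select_prompt(route_result: dict, default_route: str = "general") -> tuple[str, str]:
    """Return (route, prompt), falling back to default_route for unknown routes."""
    route = str(route_result.get("route", "")).strip().lower()
    if route not in SPECIALIST_PROMPTS:
        route = default_route
    return route, SPECIALIST_PROMPTS[route]


print(select_prompt({"route": "Technical", "reason": "Login issue"}))
print(select_prompt({"route": "account", "reason": "Not a known route"}))
```

Whether to fall back silently or to escalate to a human is a product decision; logging every fallback helps reveal how often the router misses.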
Reflection Questions
- Is the router genuinely useful here?
- Would one support agent have been enough?
- At what scale would specialist routing become worth the complexity?
12. Best Practices for Multi-Agent Design
1. Start Simple
Build a single-agent baseline first.
2. Give Each Agent a Clear, Narrow Role
Avoid overlapping responsibilities.
3. Define Structured Inputs and Outputs
Use JSON where possible so agents hand off predictable data.
4. Keep Prompts Focused
Each agent should have one job.
5. Log Intermediate Outputs
This is essential for debugging.
6. Measure Cost and Latency
Do not assume better architecture without evidence.
7. Add Review Agents Only Where They Add Value
Review loops can improve quality, but they can also slow down the system.
8. Avoid Anthropomorphizing Too Much
Agents are software components, not coworkers. Design them with interfaces, contracts, and tests.
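Practices 5 and 6 can share one small wrapper that logs each agent's output and how long the call took. This is a minimal sketch using only the standard library; `timed_agent` is a hypothetical helper and the classifier is a stand-in that calls no model.

```python
import logging
import time
from typing import Callable

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("agents")


def timed_agent(name: str, agent: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap an agent so each call logs its latency and a preview of its output."""
    def wrapper(user_input: str) -> str:
        start = time.perf_counter()
        output = agent(user_input)
        elapsed = time.perf_counter() - start
        logger.info("%s took %.3fs, output=%r", name, elapsed, output[:80])
        return output
    return wrapper


# Stand-in agent; a real one would call the model.
def classifier(message: str) -> str:
    return '{"category": "billing", "urgency": "high"}'


timed_classifier = timed_agent("classifier", classifier)
result = timed_classifier("I was charged twice.")
```

Wrapping every agent the same way gives per-step latency numbers to compare against the single-agent baseline.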
13. Common Mistakes
Mistake 1: Splitting Tasks Too Early
Developers often create many agents before understanding the baseline problem.
Mistake 2: Vague Agent Boundaries
If two agents could do the same job, your architecture is unclear.
Mistake 3: Unstructured Handoffs
Free-form text between agents makes systems brittle.
Mistake 4: No Evaluation Criteria
If you cannot measure whether multi-agent improved the workflow, the extra complexity may not be justified.
Mistake 5: Too Many Sequential Calls
Long chains can become expensive and slow.
14. Quick Comparison Table
| Dimension | Single Agent | Multiple Agents |
|---|---|---|
| Simplicity | High | Lower |
| Latency | Lower | Higher |
| Cost | Lower | Higher |
| Specialization | Limited | Stronger |
| Modularity | Limited | Better |
| Debuggability | Easier at small scale | Better only if well-instrumented |
| Review/Verification | Harder to isolate | Easier to add explicitly |
| Best Use Case | Compact tasks | Decomposable workflows |
15. Mini Practice Prompts
Use the following scenarios and decide whether you would use one agent or multiple agents.
Scenario A
“Summarize a 20-minute meeting transcript into bullet points.”
Suggested answer: Single agent
Scenario B
“Analyze three competitor websites, compare pricing, and produce a final strategy memo.”
Suggested answer: Often multiple agents
Scenario C
“Draft a welcome email for a new customer.”
Suggested answer: Single agent
Scenario D
“Classify legal intake messages, redact sensitive information, and produce a safe internal summary.”
Suggested answer: Often multiple agents, especially if compliance and review matter
16. Session Wrap-Up
Key Takeaways
- Multiple agents are useful when tasks are naturally separable and specialization adds real value.
- A single agent is often the best default.
- Multi-agent systems add cost, latency, and orchestration complexity.
- Good reasons to use multiple agents include:
- routing
- verification
- modularity
- parallel work
- specialized instruction sets
- Always compare against a simple baseline.
Recommended Design Mindset
Start with the simplest thing that works.
Only add agents when there is a clear architectural reason.
17. Useful Resources
- OpenAI Responses API guide: https://developers.openai.com/api/docs/guides/migrate-to-responses
- OpenAI API docs overview: https://platform.openai.com/docs
- OpenAI Python SDK: https://github.com/openai/openai-python
- Python json module docs: https://docs.python.org/3/library/json.html
- python-dotenv documentation: https://pypi.org/project/python-dotenv/
18. Suggested Homework
Homework Task 1
Take a single-agent workflow you already know, such as:
- summarization
- support triage
- note cleanup
- FAQ generation
Build it first as a single agent, then redesign it as two or three agents.
Homework Task 2
Measure:
- number of API calls
- code complexity
- latency
- output quality
Homework Task 3
Write a short reflection:
- Which version was easier to build?
- Which version was easier to debug?
- Did multiple agents meaningfully improve results?
19. End-of-Session Check for Understanding
Answer these questions:
- What is the main downside of multi-agent systems?
- Name two situations where multiple agents are useful.
- Why should you build a single-agent baseline first?
- What is the router pattern?
- What makes an agent boundary “good”?
20. Instructor Notes
Suggested Timing
- 0–5 min: Introduction to agents
- 5–12 min: Single vs multi-agent trade-offs
- 12–20 min: Patterns and decision framework
- 20–28 min: Hands-on Exercise 1
- 28–38 min: Hands-on Exercise 2
- 38–43 min: Router pattern exercise
- 43–45 min: Wrap-up and Q&A
Suggested Discussion Prompt
“Have you seen a software design where modularity looked elegant but was not worth the operational complexity? How does that relate to multi-agent systems?”