Session 1: When to Use Multiple Agents

Synopsis

Explores the motivations for multi-agent design, including specialization, scalability, modularity, and parallelism. Learners understand that multi-agent systems are useful only when they solve problems better than simpler alternatives.

Session Content

Session Overview

Duration: ~45 minutes
Audience: Python developers with basic programming knowledge and beginner-level familiarity with GenAI
Session Goal: Understand when a multi-agent design is useful, when it is unnecessary, and how to implement a simple multi-agent workflow using the OpenAI Responses API with Python.

Learning Objectives

By the end of this session, learners will be able to:

Explain what an AI agent is in practical software terms
Distinguish between single-agent and multi-agent designs
Identify situations where multiple agents improve clarity, reliability, or maintainability
Recognize cases where multiple agents add needless complexity
Build a small Python prototype with specialized agents using the OpenAI Responses API
Evaluate trade-offs such as cost, latency, coordination overhead, and debugging complexity

1. Introduction: What Is an Agent?

In GenAI applications, an agent is typically an LLM-powered component that is given:

A role
A goal
Instructions
Sometimes tools, memory, or access to external systems

An agent can:

Interpret user requests
Plan actions
Generate content
Evaluate outputs
Route work to other components
Use tools or APIs

Simple Mental Model

Think of an agent as a specialized software worker.
Examples:

A Research Agent gathers relevant facts
A Writer Agent turns facts into readable prose
A Reviewer Agent checks quality and consistency
A Router Agent decides which path to use
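
As a minimal sketch of this mental model, the pieces listed above (role, goal, instructions, optional tools) can be held in a small data structure that renders a system prompt. The class name AgentSpec, its fields, and the tool name web_search are hypothetical and chosen only for illustration; they are not part of any SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    """Hypothetical container for the pieces that define an agent."""
    role: str
    goal: str
    instructions: str
    tools: tuple[str, ...] = ()  # optional tool names; empty means no tools

    def system_prompt(self) -> str:
        """Render role, goal, and instructions as one system prompt string."""
        lines = [
            f"You are a {self.role}.",
            f"Goal: {self.goal}",
            f"Instructions: {self.instructions}",
        ]
        if self.tools:
            lines.append("Available tools: " + ", ".join(self.tools))
        return "\n".join(lines)


# Two of the example workers from the list above, expressed as specs.
research_agent = AgentSpec(
    role="research agent",
    goal="gather relevant facts",
    instructions="List facts as short bullet points.",
    tools=("web_search",),  # hypothetical tool name
)
writer_agent = AgentSpec(
    role="writer agent",
    goal="turn facts into readable prose",
    instructions="Write two short paragraphs using only the supplied facts.",
)

print(research_agent.system_prompt())
```

Treating an agent as plain data like this makes it clear that the "worker" is mostly a prompt plus configuration, which is what the later exercises pass to the model.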

Single-Agent Pattern

A single LLM handles the entire task.

Example:
“Read this support request, classify urgency, draft a reply, and summarize the issue.”

This is often the best place to start.

Multi-Agent Pattern

The task is split across multiple specialized LLM-powered components.

Example:
- Agent 1: classify the support issue
- Agent 2: draft the customer-facing reply
- Agent 3: check for policy compliance

This can improve modularity, but it also introduces coordination costs.

2. Why Not Use Multiple Agents by Default?

Multi-agent systems are appealing because they mirror human teams. But in software, more moving parts usually mean more complexity.

Costs of Multi-Agent Design

1. Higher Latency

If three agents run in sequence, the user waits for three model calls instead of one.

2. Higher Cost

Each agent call consumes tokens and API usage.
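
The latency and cost points above are simple multiplication, which a rough back-of-the-envelope sketch makes concrete. All numbers below (seconds per call, tokens per call, price per 1,000 tokens) are made-up placeholders, not real pricing or measured latency, and the function name estimate_sequential is hypothetical.

```python
def estimate_sequential(calls: int, seconds_per_call: float,
                        tokens_per_call: int, usd_per_1k_tokens: float) -> tuple[float, float]:
    """Return (total_seconds, total_usd) for `calls` model calls run one after another."""
    total_seconds = calls * seconds_per_call
    total_usd = calls * tokens_per_call / 1000 * usd_per_1k_tokens
    return total_seconds, total_usd


# Placeholder assumptions: 2 s and 800 tokens per call, $0.002 per 1,000 tokens.
single = estimate_sequential(calls=1, seconds_per_call=2.0,
                             tokens_per_call=800, usd_per_1k_tokens=0.002)
multi = estimate_sequential(calls=3, seconds_per_call=2.0,
                            tokens_per_call=800, usd_per_1k_tokens=0.002)

for label, (seconds, usd) in [("single agent", single), ("three agents", multi)]:
    print(f"{label}: {seconds:.1f}s, ${usd:.4f}")
```

This treats every call as the same size. In practice later agents often receive earlier outputs as input, so their token counts tend to grow, and the real gap may be larger than a flat multiple.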

3. More Failure Modes

Problems can occur due to:

Miscommunication between agents
Poor intermediate outputs
Conflicting instructions
Context loss across handoffs

4. Harder Debugging

A bad final result may come from:

the planner
the researcher
the writer
the evaluator
the orchestration logic

5. Over-Engineering Risk

Many tasks do not need multiple agents. A well-prompted single agent often performs better than a poorly coordinated team.

Default Recommendation

Build up in this order:

A single-agent solution
Structured output
Tools
Retrieval
Multiple agents, only if there is a clear benefit

3. When Multiple Agents Are Useful

Use multiple agents when specialization creates meaningful value.

A. Tasks Have Distinct Sub-Problems

If the work naturally decomposes into different skills, multiple agents can help.

Example: Competitive analysis
- Research Agent gathers market facts
- Summarizer Agent condenses findings
- Strategy Agent proposes actions

These are related but distinct cognitive tasks.

B. You Need Separation of Concerns

If different steps should be independently testable or replaceable, agent separation can help.

Example: Content pipeline
- Outline Agent creates structure
- Draft Agent writes content
- Review Agent checks style and policy

This makes each component easier to improve over time.

C. Different Instructions Improve Reliability

Sometimes one prompt trying to do everything creates conflicts.

Example:
A single prompt saying:
- be creative
- be concise
- be compliant
- be critical
- be persuasive

This can lead to muddled outputs.

Instead, separate roles:
- Creator Agent writes
- Critic Agent reviews
- Editor Agent refines

D. You Need Explicit Verification

A dedicated evaluator or reviewer agent can catch issues before final output.

Example:
For a code explanation app:
- Explainer Agent explains the code
- QA Agent checks whether the explanation matches the source

This is especially useful when correctness matters.

E. Parallel Work Is Valuable

If tasks can run independently, multiple agents may reduce total time.

Example:
A report generator that separately analyzes:
- financial trends
- product metrics
- customer sentiment

These can later be merged into one report.
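
A minimal sketch of this idea, assuming the three analyses really are independent: the analyze function below is a stand-in for a specialist agent, and asyncio.sleep simulates the wait for a model call. No real API is called, and the delays are arbitrary.

```python
import asyncio
import time


async def analyze(track: str, delay: float) -> str:
    """Stand-in for one specialist agent; the sleep simulates a model call."""
    await asyncio.sleep(delay)
    return f"{track}: analysis complete"


async def run_parallel() -> list[str]:
    """Run the three independent analyses concurrently; results keep argument order."""
    return await asyncio.gather(
        analyze("financial trends", 0.2),
        analyze("product metrics", 0.2),
        analyze("customer sentiment", 0.2),
    )


start = time.perf_counter()
sections = asyncio.run(run_parallel())
elapsed = time.perf_counter() - start

# Simple merge step; a synthesizer agent could take the place of this join.
report = "\n".join(sections)
print(report)
print(f"elapsed: {elapsed:.2f}s")
```

Because the three waits overlap, the elapsed time should be close to the slowest single call rather than the sum of all three. The saving disappears if one analysis needs another's output.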

F. Agent Roles Map Cleanly to Business Logic

If your product already has natural workflow stages, multiple agents may align well with the domain.

Examples:
- triage → diagnose → recommend
- research → draft → review
- detect → investigate → escalate

4. When a Single Agent Is Usually Better

A single agent is often enough when:

The task is short and direct
The reasoning chain is not too long
One consistent style is desired
There is little benefit from specialization
Low latency matters
Budget is constrained
The intermediate outputs are not useful on their own

Examples Better as Single-Agent Tasks

Summarize a meeting transcript
Draft a polite email response
Classify support tickets into a few categories
Convert notes into a blog outline
Extract structured data from text

Rule of Thumb

If you cannot clearly explain:

what each agent does,
why it needs a separate role,
and how that improves results,

then you probably do not need multiple agents.

5. Common Multi-Agent Patterns

Pattern 1: Sequential Pipeline

One agent’s output becomes the next agent’s input.

Flow:
Research → Draft → Review

Best for:
Structured workflows with clear stages

Pros: Easy to understand
Cons: Slow if many steps are chained
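
As a sketch of the sequential pipeline, each stage can be modeled as a function from text to text. The three stage functions here are stubs standing in for model-backed agents, and run_pipeline is a hypothetical helper name; the point is the shape of the handoff, not the content.

```python
from typing import Callable

Agent = Callable[[str], str]  # an agent takes text in and returns text out


def research(topic: str) -> str:
    """Stub research agent."""
    return f"facts about {topic}"


def draft(facts: str) -> str:
    """Stub draft agent."""
    return f"draft based on [{facts}]"


def review(text: str) -> str:
    """Stub review agent."""
    return f"reviewed: {text}"


def run_pipeline(stages: list[tuple[str, Agent]], initial_input: str) -> tuple[str, dict[str, str]]:
    """Feed each stage's output into the next and keep every intermediate result."""
    trace: dict[str, str] = {}
    current = initial_input
    for name, agent in stages:
        current = agent(current)
        trace[name] = current
    return current, trace


final, trace = run_pipeline(
    [("research", research), ("draft", draft), ("review", review)],
    "pricing pages",
)
print(final)
```

Keeping the trace of intermediate outputs is what makes a pipeline like this auditable; it also shows why latency grows with every stage added to the list.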

Pattern 2: Router Pattern

A router agent decides which specialist agent should handle the task.

Flow:
Router → Billing Agent / Technical Agent / Sales Agent

Best for:
Mixed-intent systems

Pros: Efficient specialization
Cons: Routing mistakes can hurt quality

Pattern 3: Generator-Critic Pattern

One agent creates, another evaluates.

Flow:
Writer → Critic → Writer (revision)

Best for:
Quality-sensitive outputs

Pros: Better reliability
Cons: More calls and orchestration
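
One way to sketch the generator-critic loop is a bounded retry: the writer drafts, the critic either approves or returns feedback, and the loop stops after a fixed number of rounds. The writer and critic below are hard-coded stubs, and the max_rounds cap is an assumption added here to keep the loop from running indefinitely.

```python
from typing import Optional


def writer(task: str, feedback: Optional[str]) -> str:
    """Stand-in writer agent; a real one would send task and feedback to a model."""
    if feedback is None:
        return "We will refund the duplicate charge today."
    return "We are reviewing the duplicate charge and will follow up shortly."


def critic(draft: str) -> Optional[str]:
    """Stand-in critic agent; returns feedback, or None when the draft passes."""
    if "refund" in draft.lower():
        return "Do not promise a refund unless it is confirmed."
    return None


def generate_with_review(task: str, max_rounds: int = 2) -> tuple[str, bool, int]:
    """Alternate writer and critic until the critic approves or rounds run out."""
    feedback: Optional[str] = None
    draft = ""
    for round_number in range(1, max_rounds + 1):
        draft = writer(task, feedback)
        feedback = critic(draft)
        if feedback is None:
            return draft, True, round_number
    return draft, False, max_rounds


draft, approved, rounds = generate_with_review("Reply to a duplicate-charge complaint")
print(approved, rounds)
print(draft)
```

Each round costs two calls (one writer, one critic), which is the "more calls and orchestration" trade-off in concrete form. Returning the approval flag lets the caller decide what to do when the cap is reached without approval.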

Pattern 4: Parallel Specialists + Merger

Multiple agents work independently, then another agent combines the results.

Flow:
Agent A + Agent B + Agent C → Synthesizer

Best for:
Independent analysis tracks

Pros: Good decomposition
Cons: Requires strong synthesis step

6. Decision Framework: Should You Use Multiple Agents?

Use this checklist before introducing more agents.

Ask These Questions

1. Is the task naturally separable?

If yes, multi-agent may help.

2. Do the subtasks require meaningfully different instructions?

If yes, specialization may improve performance.

3. Would intermediate outputs be useful on their own?

If yes, a pipeline may be valuable.

4. Do you need an explicit review or safety layer?

If yes, add a reviewer or evaluator agent.

5. Can you afford extra latency and cost?

If no, keep it simpler.

6. Will orchestration complexity outweigh benefits?

If yes, avoid multi-agent design.

Practical Guideline

Use multiple agents when you need at least one of these:

specialization
routing
verification
modularity
parallel analysis

Otherwise, prefer one agent.
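
The checklist can be written down as a small function, which is sometimes a useful way to force explicit answers in a design review. This is one possible reading of the six questions, not a definitive rule: it assumes questions 5 and 6 act as vetoes and that any single "yes" among questions 1 to 4 is enough. The names DesignAnswers and recommend are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DesignAnswers:
    """Yes/no answers to the six checklist questions, in order."""
    naturally_separable: bool
    different_instructions: bool
    useful_intermediate_outputs: bool
    needs_review_layer: bool
    can_afford_latency_and_cost: bool
    orchestration_outweighs_benefits: bool


def recommend(answers: DesignAnswers) -> str:
    """Return "multi-agent" or "single agent" under the assumptions stated above."""
    # Questions 5 and 6 are treated as vetoes (an interpretation, not a fixed rule).
    if not answers.can_afford_latency_and_cost:
        return "single agent"
    if answers.orchestration_outweighs_benefits:
        return "single agent"
    # Questions 1 to 4: at least one concrete benefit is required.
    benefits = [
        answers.naturally_separable,
        answers.different_instructions,
        answers.useful_intermediate_outputs,
        answers.needs_review_layer,
    ]
    return "multi-agent" if any(benefits) else "single agent"


support_pipeline = DesignAnswers(True, True, True, True, True, False)
email_draft = DesignAnswers(False, False, False, False, True, False)
print(recommend(support_pipeline))
print(recommend(email_draft))
```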

7. Architecture Example: Customer Support Assistant

Let’s compare two designs.

Option A: Single Agent

Input: Customer message
Output: Category, urgency, and response draft

This is simple and often sufficient.

Option B: Multi-Agent

Classifier Agent: category + urgency
Reply Agent: draft response based on category
Policy Reviewer Agent: check tone and policy alignment

When Option B Is Better

Use Option B if:

categories strongly affect response strategy
policy compliance matters
you want to audit intermediate decisions
each step changes frequently and should be updated independently

When Option A Is Better

Use Option A if:

support volume is low
stakes are moderate
speed matters more than modularity
a single prompt already works well

8. Hands-On Exercise 1: Build a Single-Agent Baseline

Goal

Create a simple single-agent support assistant that:

classifies the support request
estimates urgency
drafts a customer reply

This baseline helps us compare against a multi-agent design later.

Prerequisites

Install the OpenAI Python SDK and python-dotenv:

pip install openai python-dotenv

Create a .env file:

OPENAI_API_KEY=your_api_key_here

Code: Single-Agent Baseline

import json
import os
from dotenv import load_dotenv
from openai import OpenAI

# Load environment variables from .env
load_dotenv()

# Create an OpenAI client using the API key from environment variables
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Example customer message to process
customer_message = """
Hi team,
I've been charged twice for my monthly subscription.
I need this fixed today because our finance team is closing books.
Thanks.
""".strip()

# Define a system prompt with a clear role and expected output format
system_prompt = """
You are a helpful customer support assistant.

Your job:
1. Classify the support issue into one of these categories:
   - billing
   - technical
   - account
   - general
2. Estimate urgency as one of:
   - low
   - medium
   - high
3. Draft a short, polite response to the customer.

Return valid JSON with exactly these keys:
- category
- urgency
- reply
""".strip()

# Call the Responses API
response = client.responses.create(
    model="gpt-5.4-mini",
    input=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": customer_message},
    ],
)

# The SDK provides a convenience property for the text output
raw_text = response.output_text

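# Parse the returned JSON; stop with a readable message if the model
# returned something that is not valid JSON
try:
    result = json.loads(raw_text)
except json.JSONDecodeError as exc:
    raise SystemExit(f"Model did not return valid JSON: {exc}\n{raw_text}")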

print("Single-Agent Result:")
print(json.dumps(result, indent=2))

Example Output

{
  "category": "billing",
  "urgency": "high",
  "reply": "Hi, thanks for reaching out. I’m sorry to hear you were charged twice for your subscription. We understand the urgency and will review the duplicate charge as quickly as possible. Please share any relevant invoice or billing details if available, and our team will work to resolve this today."
}

Discussion

This works well because the task is compact and the outputs are tightly related.

Why this is a good baseline:
- simple implementation
- one API call
- easy to debug
- low latency compared to a multi-step workflow

9. Hands-On Exercise 2: Build a Multi-Agent Version

Goal

Refactor the support assistant into multiple specialized agents:

Classifier Agent
Reply Agent
Reviewer Agent

This exercise demonstrates when multiple agents may be useful.

Code: Multi-Agent Support Workflow

import json
import os
from dotenv import load_dotenv
from openai import OpenAI

# Load environment variables
load_dotenv()

# Initialize the OpenAI client
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

customer_message = """
Hi team,
I've been charged twice for my monthly subscription.
I need this fixed today because our finance team is closing books.
Thanks.
""".strip()


def run_agent(system_prompt: str, user_input: str) -> str:
    """
    Utility function to send a prompt to the Responses API
    and return the text output.
    """
    response = client.responses.create(
        model="gpt-5.4-mini",
        input=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_input},
        ],
    )
    return response.output_text.strip()


# -----------------------------
# Agent 1: Classifier
# -----------------------------
classifier_prompt = """
You are a support ticket classifier.

Classify the customer issue into one of:
- billing
- technical
- account
- general

Also assign urgency:
- low
- medium
- high

Return valid JSON with exactly these keys:
- category
- urgency
- reasoning

Keep reasoning to one sentence.
""".strip()

classification_text = run_agent(classifier_prompt, customer_message)
classification = json.loads(classification_text)

print("Classifier Output:")
print(json.dumps(classification, indent=2))


# -----------------------------
# Agent 2: Reply Writer
# -----------------------------
reply_prompt = """
You are a customer support reply assistant.

Write a short, polite, professional response to the customer.
Acknowledge the issue and urgency.
Do not promise a refund unless explicitly confirmed.
Do not invent policy details.

Return valid JSON with exactly this key:
- reply
""".strip()

reply_input = f"""
Customer message:
{customer_message}

Classification:
{json.dumps(classification, indent=2)}
""".strip()

reply_text = run_agent(reply_prompt, reply_input)
reply_result = json.loads(reply_text)

print("\nReply Agent Output:")
print(json.dumps(reply_result, indent=2))


# -----------------------------
# Agent 3: Reviewer
# -----------------------------
review_prompt = """
You are a support quality reviewer.

Review the drafted reply for:
- politeness
- clarity
- consistency with the classification
- avoidance of unsupported promises

Return valid JSON with exactly these keys:
- approved
- feedback
- revised_reply

If the draft is already good, set approved to true and keep revised_reply the same.
If not, set approved to false and provide an improved revised_reply.
""".strip()

review_input = f"""
Customer message:
{customer_message}

Classification:
{json.dumps(classification, indent=2)}

Draft reply:
{reply_result["reply"]}
""".strip()

review_text = run_agent(review_prompt, review_input)
review_result = json.loads(review_text)

print("\nReviewer Output:")
print(json.dumps(review_result, indent=2))

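# Final reply selection: use the reviewer's version, falling back to the
# original draft if the reviewer omitted or emptied revised_reply
final_reply = review_result.get("revised_reply") or reply_result["reply"]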

print("\nFinal Customer Reply:")
print(final_reply)

Example Output

Classifier Output:
{
  "category": "billing",
  "urgency": "high",
  "reasoning": "The customer reports a duplicate subscription charge and indicates the issue is time-sensitive due to finance deadlines."
}

Reply Agent Output:
{
  "reply": "Hi, thank you for reaching out, and I’m sorry to hear about the duplicate charge on your subscription. We understand this is urgent, especially with your finance team closing books today. We’ll review the issue as quickly as possible and follow up with next steps shortly."
}

Reviewer Output:
{
  "approved": true,
  "feedback": "The reply is polite, clear, and does not make unsupported promises.",
  "revised_reply": "Hi, thank you for reaching out, and I’m sorry to hear about the duplicate charge on your subscription. We understand this is urgent, especially with your finance team closing books today. We’ll review the issue as quickly as possible and follow up with next steps shortly."
}

Final Customer Reply:
Hi, thank you for reaching out, and I’m sorry to hear about the duplicate charge on your subscription. We understand this is urgent, especially with your finance team closing books today. We’ll review the issue as quickly as possible and follow up with next steps shortly.

10. Exercise Debrief: Was Multi-Agent Better?

Benefits Observed

The classification logic is isolated
The reply generation can be improved independently
The reviewer adds a verification step
Intermediate outputs are visible and auditable

Costs Observed

Three model calls instead of one
More code and orchestration
More parsing and error handling needed
More latency
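
The extra parsing and error handling mentioned above might look like the following sketch. The exercise code calls json.loads directly on each agent's text, which fails if the model wraps the JSON in a Markdown fence or omits a key. The helper name parse_agent_json is hypothetical, and stripping a fence is an assumption about one common failure shape rather than a complete fix.

```python
import json


def parse_agent_json(raw_text: str, required_keys: set[str]) -> dict:
    """Parse an agent's text output as a JSON object and check the expected keys."""
    text = raw_text.strip()
    fence = "`" * 3  # Markdown code fence marker
    if text.startswith(fence):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == fence:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines)
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"agent did not return valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("agent output must be a JSON object")
    missing = required_keys - set(data)
    if missing:
        raise ValueError(f"agent output is missing keys: {sorted(missing)}")
    return data


# Simulated agent outputs; no model is called here.
plain = '{"category": "billing", "urgency": "high", "reasoning": "Duplicate charge."}'
fenced = "`" * 3 + "json\n" + plain + "\n" + "`" * 3

for raw in (plain, fenced):
    parsed = parse_agent_json(raw, {"category", "urgency", "reasoning"})
    print(parsed["category"], parsed["urgency"])
```

With three agents, a helper like this runs three times per request, and each failure needs a decision (retry, fall back, or stop), which is part of the orchestration cost.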

Main Lesson

The multi-agent version is not automatically “better.” It is better only if the added modularity and review step are worth the cost.

11. Hands-On Exercise 3: Add a Router Agent

Goal

Create a small router that sends incoming messages to the right specialist:

Billing Agent
Technical Agent
General Agent

This demonstrates one of the most common real-world multi-agent patterns.

Code: Router-Based Workflow

import json
import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

customer_message = """
Hello,
Our team cannot log in to the dashboard after resetting passwords.
Can you help us regain access quickly?
""".strip()


def run_agent(system_prompt: str, user_input: str) -> str:
    """
    Send input to the model and return text output.
    """
    response = client.responses.create(
        model="gpt-5.4-mini",
        input=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_input},
        ],
    )
    return response.output_text.strip()


# -----------------------------
# Router Agent
# -----------------------------
router_prompt = """
You are a router for support requests.

Choose exactly one route:
- billing
- technical
- general

Return valid JSON with exactly these keys:
- route
- reason
""".strip()

route_result = json.loads(run_agent(router_prompt, customer_message))
print("Router Decision:")
print(json.dumps(route_result, indent=2))


# -----------------------------
# Specialist Agents
# -----------------------------
billing_prompt = """
You are a billing support assistant.
Draft a concise response for billing issues.

Return valid JSON with exactly this key:
- reply
""".strip()

technical_prompt = """
You are a technical support assistant.
Draft a concise response for login, access, or product malfunction issues.

Return valid JSON with exactly this key:
- reply
""".strip()

general_prompt = """
You are a general support assistant.
Draft a concise response for general inquiries.

Return valid JSON with exactly this key:
- reply
""".strip()

specialist_prompts = {
    "billing": billing_prompt,
    "technical": technical_prompt,
    "general": general_prompt,
}

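# Normalize the route and fall back to the general specialist if the
# router returns a value that is not one of the known routes
selected_route = str(route_result.get("route", "")).strip().lower()
selected_prompt = specialist_prompts.get(selected_route, general_prompt)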

specialist_reply = json.loads(run_agent(selected_prompt, customer_message))

print("\nSpecialist Reply:")
print(json.dumps(specialist_reply, indent=2))

Example Output

Router Decision:
{
  "route": "technical",
  "reason": "The user reports login and access issues after password resets."
}

Specialist Reply:
{
  "reply": "Hi, thanks for contacting us. I’m sorry your team is having trouble logging in after resetting passwords. This appears to be an access issue, and we recommend verifying the affected account emails and any error messages you are seeing so we can help restore access as quickly as possible."
}

Reflection Questions

Is the router genuinely useful here?
Would one support agent have been enough?
At what scale would specialist routing become worth the complexity?

12. Best Practices for Multi-Agent Design

1. Start Simple

Build a single-agent baseline first.

2. Give Each Agent a Clear, Narrow Role

Avoid overlapping responsibilities.

3. Define Structured Inputs and Outputs

Use JSON where possible so agents hand off predictable data.
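
A handoff contract can also be enforced in code rather than only in the prompt. The sketch below assumes the classifier output from Exercise 2 (category, urgency, reasoning) and validates it against the allowed values listed in that prompt. The class name Classification and its from_json method are hypothetical.

```python
import json
from dataclasses import dataclass

ALLOWED_CATEGORIES = {"billing", "technical", "account", "general"}
ALLOWED_URGENCY = {"low", "medium", "high"}


@dataclass(frozen=True)
class Classification:
    """Typed handoff between the classifier and the agents that follow it."""
    category: str
    urgency: str
    reasoning: str

    def __post_init__(self) -> None:
        # Reject values outside the contract before they reach the next agent.
        if self.category not in ALLOWED_CATEGORIES:
            raise ValueError(f"unexpected category: {self.category!r}")
        if self.urgency not in ALLOWED_URGENCY:
            raise ValueError(f"unexpected urgency: {self.urgency!r}")

    @classmethod
    def from_json(cls, raw_text: str) -> "Classification":
        """Build a Classification from the classifier's JSON text."""
        data = json.loads(raw_text)
        return cls(
            category=data["category"],
            urgency=data["urgency"],
            reasoning=data["reasoning"],
        )


# Simulated classifier output; no model is called here.
raw = '{"category": "billing", "urgency": "high", "reasoning": "Duplicate charge."}'
classification = Classification.from_json(raw)
print(classification.category, classification.urgency)
```

Failing at the boundary keeps a bad classification from silently shaping the reply and the review that follow it.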

4. Keep Prompts Focused

Each agent should have one job.

5. Log Intermediate Outputs

This is essential for debugging.
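
A lightweight way to do this is to wrap each agent in a function that logs its name, duration, and output. The sketch uses Python's standard logging module with a stub classifier; the wrapper name logged and the log format are assumptions, and a production system would likely log to a structured store instead.

```python
import logging
import time
from typing import Callable

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
logger = logging.getLogger("agents")


def logged(name: str, agent: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap an agent so each call logs its name, duration, and output."""
    def wrapper(user_input: str) -> str:
        start = time.perf_counter()
        output = agent(user_input)
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("agent=%s elapsed_ms=%.1f output=%r", name, elapsed_ms, output)
        return output
    return wrapper


def classify(message: str) -> str:
    """Stand-in classifier; a real agent would call a model here."""
    return "billing" if "charged" in message.lower() else "general"


classifier = logged("classifier", classify)
category = classifier("I've been charged twice for my subscription.")
print(category)
```

The same per-call timing gives a first measurement of latency for each agent, which helps when deciding whether a step is worth keeping.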

6. Measure Cost and Latency

Do not assume better architecture without evidence.

7. Add Review Agents Only Where They Add Value

Review loops can improve quality, but they can also slow down the system.

8. Avoid Anthropomorphizing Too Much

Agents are software components, not coworkers. Design them with interfaces, contracts, and tests.

13. Common Mistakes

Mistake 1: Splitting Tasks Too Early

Developers often create many agents before understanding the baseline problem.

Mistake 2: Vague Agent Boundaries

If two agents could do the same job, your architecture is unclear.

Mistake 3: Unstructured Handoffs

Free-form text between agents makes systems brittle.

Mistake 4: No Evaluation Criteria

If you cannot measure whether multi-agent improved the workflow, the extra complexity may not be justified.

Mistake 5: Too Many Sequential Calls

Long chains can become expensive and slow.

14. Quick Comparison Table

Dimension	Single Agent	Multiple Agents
Simplicity	High	Lower
Latency	Lower	Higher
Cost	Lower	Higher
Specialization	Limited	Stronger
Modularity	Limited	Better
Debuggability	Easier at small scale	Better only if well-instrumented
Review/Verification	Harder to isolate	Easier to add explicitly
Best Use Case	Compact tasks	Decomposable workflows

15. Mini Practice Prompts

Use the following scenarios and decide whether you would use one agent or multiple agents.

Scenario A

“Summarize a 20-minute meeting transcript into bullet points.”

Suggested answer: Single agent

Scenario B

“Analyze three competitor websites, compare pricing, and produce a final strategy memo.”

Suggested answer: Often multiple agents

Scenario C

“Draft a welcome email for a new customer.”

Suggested answer: Single agent

Scenario D

“Classify legal intake messages, redact sensitive information, and produce a safe internal summary.”

Suggested answer: Often multiple agents, especially if compliance and review matter

16. Session Wrap-Up

Key Takeaways

Multiple agents are useful when tasks are naturally separable and specialization adds real value.
A single agent is often the best default.
Multi-agent systems add cost, latency, and orchestration complexity.
Good reasons to use multiple agents include:
routing
verification
modularity
parallel work
specialized instruction sets
Always compare against a simple baseline.

Recommended Design Mindset

Start with the simplest thing that works.
Only add agents when there is a clear architectural reason.

17. Useful Resources

OpenAI Responses API migration guide: https://developers.openai.com/api/docs/guides/migrate-to-responses
OpenAI API docs overview: https://platform.openai.com/docs
OpenAI Python SDK: https://github.com/openai/openai-python
Python json module docs: https://docs.python.org/3/library/json.html
python-dotenv documentation: https://pypi.org/project/python-dotenv/

18. Suggested Homework

Homework Task 1

Take a single-agent workflow you already know, such as:
- summarization
- support triage
- note cleanup
- FAQ generation

Build it first as a single agent, then redesign it as two or three agents.

Homework Task 2

Measure:
- number of API calls
- code complexity
- latency
- output quality

Homework Task 3

Write a short reflection:
- Which version was easier to build?
- Which version was easier to debug?
- Did multiple agents meaningfully improve results?

19. End-of-Session Check for Understanding

Answer these questions:

What is the main downside of multi-agent systems?
Name two situations where multiple agents are useful.
Why should you build a single-agent baseline first?
What is the router pattern?
What makes an agent boundary “good”?

20. Instructor Notes

Suggested Timing

0–5 min: Introduction to agents
5–12 min: Single vs multi-agent trade-offs
12–20 min: Patterns and decision framework
20–28 min: Hands-on Exercise 1
28–38 min: Hands-on Exercise 2
38–43 min: Router pattern exercise
43–45 min: Wrap-up and Q&A

Suggested Discussion Prompt

“Have you seen a software design where modularity looked elegant but was not worth the operational complexity? How does that relate to multi-agent systems?”
