Session 3: From Chatbots to Agents
Synopsis
Explains the difference between single-turn prompts, conversational assistants, workflows, and autonomous or semi-autonomous agents. Learners see how planning, memory, and tool usage extend LLMs into agentic systems.
Session Content
Session Overview
In this session, learners move from basic LLM-powered chat applications to agentic systems that can plan, use tools, remember context, and act toward goals. The focus is on understanding what makes an application an agent rather than just a chatbot, and on implementing simple agent-like patterns in Python using the OpenAI Responses API with gpt-5.4-mini.
Duration
~45 minutes
Learning Objectives
By the end of this session, learners will be able to:
- Explain the difference between a chatbot and an agent.
- Identify key components of agentic systems: goals, memory, tools, control loop, and environment.
- Build a simple multi-step agent loop in Python.
- Use the OpenAI Responses API to let a model decide when to call tools.
- Implement a lightweight task-oriented assistant that behaves more like an agent than a chatbot.
1. Chatbots vs Agents
What is a Chatbot?
A chatbot is typically:
- Reactive
- Prompt/response driven
- Focused on conversation
- Limited to the information in the prompt and conversation history
- Not inherently capable of acting on the world unless explicitly wired to tools
Examples:
- FAQ bot
- Documentation assistant
- Customer support responder
What is an Agent?
An agent is typically:
- Goal-oriented
- Able to decide among actions
- Often capable of using tools
- Designed to operate in a loop: observe → reason → act → observe
- Sometimes equipped with memory or state
- Able to complete multi-step tasks
Examples:
- A meeting scheduling assistant
- A coding assistant that reads files, edits code, and runs tests
- A research assistant that searches, summarizes, and compiles findings
Core Difference
A chatbot mainly answers.
An agent acts to get things done.
2. Anatomy of an Agentic System
A practical agent often includes the following pieces:
2.1 Goal
The agent needs a clear task.
Examples:
- “Summarize this support inbox.”
- “Find the cheapest flight under these constraints.”
- “Draft a weekly project update from recent notes.”
2.2 State
The system needs to track progress.
Examples:
- User preferences
- Completed steps
- Retrieved information
- Pending tasks
2.3 Tools
Agents often interact with functions or APIs.
Examples:
- Calculator
- Search function
- Weather lookup
- Database query
- File read/write
2.4 Decision-Making
The model helps decide:
- What to do next
- Which tool to use
- When the task is complete
2.5 Control Loop
A common agent loop looks like:
- Receive goal
- Ask model what to do next
- If a tool is needed, call it
- Return tool result to model
- Repeat until done
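The loop above can be sketched in plain Python. Here `ask_model` and `call_tool` are hypothetical stand-ins (a hard-coded policy and a fake tool) so the control flow itself is visible; the hands-on exercises later in this session replace them with real Responses API calls.

```python
def ask_model(goal, observations):
    """Stand-in for a real model call: decide the next action.

    This toy policy looks up tasks once, then finishes.
    """
    if not observations:
        return {"type": "tool", "name": "get_tasks", "args": {}}
    return {"type": "final", "answer": f"Observed: {observations[-1]}"}

def call_tool(name, args):
    """Stand-in for a real tool dispatcher."""
    if name == "get_tasks":
        return {"tasks": ["write notes", "review code"]}
    return {"error": f"unknown tool {name}"}

def agent_loop(goal, max_steps=5):
    """Receive goal -> ask model -> call tool if needed -> repeat until done."""
    observations = []
    for _ in range(max_steps):
        decision = ask_model(goal, observations)
        if decision["type"] == "final":
            return decision["answer"]
        result = call_tool(decision["name"], decision["args"])
        observations.append(result)
    return "Stopped: max_steps reached."

print(agent_loop("List my tasks"))
```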
2.6 Memory
Memory can be:
- Short-term: current conversation/task context
- Long-term: saved preferences or prior knowledge
- Working memory: scratchpad/state for current execution
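One lightweight way to represent these three scopes is with plain dictionaries; this is a sketch of the idea, not a production memory design:

```python
# Three memory scopes, kept as plain dicts for illustration.
short_term = {"conversation": []}               # current conversation/task context
long_term = {"summary_style": "compact"}        # saved preferences that persist across sessions
working = {"pending_steps": [], "results": []}  # scratchpad for the current execution

def remember_turn(role, content):
    """Append a message to short-term conversational memory."""
    short_term["conversation"].append({"role": role, "content": content})

remember_turn("user", "Complete task 2.")
print(len(short_term["conversation"]))  # 1
```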
3. Agent Patterns You’ll Use Often
3.1 Single-Turn Chatbot Pattern
User asks a question, model answers directly.
User -> Model -> Answer
3.2 Tool-Augmented Chat Pattern
The model can call functions when it needs external information.
User -> Model -> Tool Call -> Tool Result -> Model -> Answer
3.3 Agent Loop Pattern
The model repeatedly reasons and acts until the task is complete.
Goal -> Model -> Action -> Result -> Model -> Action -> Result -> Final Output
3.4 Planner-Executor Pattern
The system separates planning from execution.
- Planner creates steps
- Executor carries out each step
- Model updates based on observations
This is useful for larger workflows.
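A toy version of the planner-executor split, with a hard-coded plan standing in for model-generated steps (all names here are illustrative):

```python
def plan(goal):
    """Planner: break the goal into steps. A real system would ask the model."""
    return ["inspect tasks", "pick the matching task", "mark it complete", "summarize"]

def execute(step):
    """Executor: carry out one step and return an observation."""
    return f"done: {step}"

def planner_executor(goal):
    """Run the plan step by step, collecting observations."""
    observations = []
    for step in plan(goal):
        observations.append(execute(step))
        # A fuller system would let the model revise the remaining plan
        # here, based on each new observation.
    return observations

for obs in planner_executor("Complete the review task"):
    print(obs)
```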
4. When Should You Use an Agent?
Use a simple chatbot when:
- The task is mostly Q&A
- You do not need tool use
- No multi-step workflow is required
Use an agent when:
- The task has multiple steps
- The system needs external data or actions
- The model must decide what to do next
- Tracking progress matters
Important Design Principle
Do not build an agent unless the problem needs one.
Agentic systems are more powerful, but also:
- More complex
- Harder to debug
- More expensive
- More likely to fail in subtle ways
5. First Example: A Simple Chatbot
This first example is intentionally non-agentic. It shows a standard chat-style interaction using the Responses API.
Python Example: Basic Chatbot
from openai import OpenAI
# Create a client using your OPENAI_API_KEY environment variable.
client = OpenAI()
response = client.responses.create(
model="gpt-5.4-mini",
input=[
{
"role": "system",
"content": "You are a helpful Python learning assistant."
},
{
"role": "user",
"content": "Explain what a Python dictionary is in simple terms."
}
]
)
# Print the final text output from the model.
print(response.output_text)
Example Output
A Python dictionary is a way to store information as key-value pairs.
Think of it like a real dictionary:
- the key is like the word
- the value is like the definition
Example:
{"name": "Alice", "age": 30}
Here, "name" and "age" are keys, and "Alice" and 30 are values.
Discussion
This is useful, but it is still just a chatbot:
- No tools
- No planning
- No action loop
- No memory beyond what you pass in
6. Adding Tools: The First Step Toward Agents
To become more agent-like, the system needs the ability to interact with something beyond pure text generation.
Let’s create a small assistant with tools.
Scenario
We want a task assistant that can:
- Look up a fake task list
- Mark tasks as done
- Answer the user using those actions
This creates the foundation for agentic behavior.
7. Hands-On Exercise 1: Tool-Using Task Assistant
Goal
Build a simple assistant that can:
- Check current tasks
- Mark a task complete
- Respond naturally to the user
What You’ll Learn
- How to define tool schemas
- How to inspect tool calls from the model
- How to execute Python functions based on model requests
- How to pass tool results back using the Responses API
Code
import json
from openai import OpenAI
# Initialize the OpenAI client.
client = OpenAI()
# A tiny in-memory task store for demonstration purposes.
TASKS = [
{"id": 1, "title": "Write session notes", "done": False},
{"id": 2, "title": "Review Python code", "done": False},
{"id": 3, "title": "Send project update", "done": True},
]
def get_tasks():
"""
Return all tasks as a dictionary. In a real app, this might query a database.
"""
return {"tasks": TASKS}
def complete_task(task_id):
"""
Mark a task complete by ID.
Returns a structured result describing success or failure.
"""
for task in TASKS:
if task["id"] == task_id:
task["done"] = True
return {
"success": True,
"message": f"Task {task_id} marked as complete.",
"task": task,
}
return {
"success": False,
"message": f"Task {task_id} was not found."
}
# Define the tools the model is allowed to use.
TOOLS = [
{
"type": "function",
"name": "get_tasks",
"description": "Get the current task list.",
"parameters": {
"type": "object",
"properties": {},
"required": [],
"additionalProperties": False,
},
},
{
"type": "function",
"name": "complete_task",
"description": "Mark a task as complete by task ID.",
"parameters": {
"type": "object",
"properties": {
"task_id": {
"type": "integer",
"description": "The ID of the task to complete."
}
},
"required": ["task_id"],
"additionalProperties": False,
},
},
]
def run_tool(tool_name, arguments):
"""
Execute the selected tool with the provided arguments.
"""
if tool_name == "get_tasks":
return get_tasks()
elif tool_name == "complete_task":
return complete_task(arguments["task_id"])
else:
return {"error": f"Unknown tool: {tool_name}"}
# Initial user request.
conversation = [
{
"role": "system",
"content": (
"You are a task assistant. "
"Use tools when necessary to inspect or update tasks. "
"Be concise and helpful."
),
},
{
"role": "user",
"content": "Please mark task 2 as complete and tell me the updated task list."
},
]
# First model call: let the model decide whether it wants to use tools.
response = client.responses.create(
model="gpt-5.4-mini",
input=conversation,
tools=TOOLS,
)
# Collect tool outputs that will be sent back to the model.
tool_messages = []
# Inspect the output items for function calls.
for item in response.output:
if item.type == "function_call":
tool_name = item.name
arguments = json.loads(item.arguments)
# Run the selected tool locally in Python.
result = run_tool(tool_name, arguments)
# Add the tool result in the format expected by the Responses API.
tool_messages.append(
{
"type": "function_call_output",
"call_id": item.call_id,
"output": json.dumps(result),
}
)
# If the model requested tools, send the tool results back for a final answer.
if tool_messages:
followup_response = client.responses.create(
model="gpt-5.4-mini",
previous_response_id=response.id,
input=tool_messages,
tools=TOOLS,
)
print(followup_response.output_text)
else:
# If no tool was called, just print the direct response.
print(response.output_text)
Example Output
Done — task 2 has been marked as complete.
Updated task list:
1. Write session notes — not done
2. Review Python code — done
3. Send project update — done
What Makes This More Agentic?
Compared to a basic chatbot:
- The model can inspect state
- The model can act through tools
- The application executes real logic
- The final answer is grounded in tool results
Still, this is not a full agent loop yet. It is a single tool-use cycle.
8. From Tool Use to Agent Loops
A real agent often needs multiple steps.
Example goal:
“Find incomplete tasks, complete the review-related one, and then summarize what changed.”
This may require:
- Inspect tasks
- Decide which task matches
- Mark it complete
- Confirm updated status
- Report result
This pattern is iterative.
9. Hands-On Exercise 2: Build a Minimal Agent Loop
Goal
Build a loop that allows the model to:
- Decide what tool to call
- Receive tool results
- Continue until it reaches a final answer
Key Idea
Instead of assuming only one tool call round, we allow repeated cycles.
Code
import json
from openai import OpenAI
client = OpenAI()
# Demo task store.
TASKS = [
{"id": 1, "title": "Write session notes", "done": False},
{"id": 2, "title": "Review Python code", "done": False},
{"id": 3, "title": "Send project update", "done": False},
]
def get_tasks():
"""Return all tasks."""
return {"tasks": TASKS}
def complete_task(task_id):
"""Mark a task complete by ID."""
for task in TASKS:
if task["id"] == task_id:
if task["done"]:
return {
"success": True,
"message": f"Task {task_id} was already complete.",
"task": task,
}
task["done"] = True
return {
"success": True,
"message": f"Task {task_id} marked complete.",
"task": task,
}
return {"success": False, "message": f"Task {task_id} not found."}
TOOLS = [
{
"type": "function",
"name": "get_tasks",
"description": "Return the current task list.",
"parameters": {
"type": "object",
"properties": {},
"required": [],
"additionalProperties": False,
},
},
{
"type": "function",
"name": "complete_task",
"description": "Mark a task complete using its numeric ID.",
"parameters": {
"type": "object",
"properties": {
"task_id": {
"type": "integer",
"description": "The task ID to complete."
}
},
"required": ["task_id"],
"additionalProperties": False,
},
},
]
def run_tool(name, arguments):
"""Dispatch tool calls to Python functions."""
if name == "get_tasks":
return get_tasks()
if name == "complete_task":
return complete_task(arguments["task_id"])
return {"error": f"Unsupported tool: {name}"}
def run_agent(user_goal, max_turns=5):
"""
Run a simple agent loop.
The model can repeatedly request tool calls.
After each tool result, we send the result back and allow it to continue.
The loop stops when the model no longer requests tools.
"""
response = client.responses.create(
model="gpt-5.4-mini",
input=[
{
"role": "system",
"content": (
"You are a task agent. "
"Use tools to inspect tasks and update them. "
"Continue until the user's goal is completed, then provide a concise summary."
),
},
{
"role": "user",
"content": user_goal,
},
],
tools=TOOLS,
)
for turn in range(max_turns):
tool_outputs = []
for item in response.output:
if item.type == "function_call":
tool_name = item.name
arguments = json.loads(item.arguments)
result = run_tool(tool_name, arguments)
print(f"[Tool Call] {tool_name}({arguments})")
print(f"[Tool Result] {result}")
tool_outputs.append(
{
"type": "function_call_output",
"call_id": item.call_id,
"output": json.dumps(result),
}
)
# If there are no tool calls, the model is done.
if not tool_outputs:
return response.output_text
# Continue the reasoning loop with tool outputs.
response = client.responses.create(
model="gpt-5.4-mini",
previous_response_id=response.id,
input=tool_outputs,
tools=TOOLS,
)
return "Agent stopped after reaching max_turns without finishing."
if __name__ == "__main__":
goal = "Find the review-related task, complete it, and tell me what changed."
final_answer = run_agent(goal)
print("\nFinal Answer:")
print(final_answer)
Example Output
[Tool Call] get_tasks({})
[Tool Result] {'tasks': [{'id': 1, 'title': 'Write session notes', 'done': False}, {'id': 2, 'title': 'Review Python code', 'done': False}, {'id': 3, 'title': 'Send project update', 'done': False}]}
[Tool Call] complete_task({'task_id': 2})
[Tool Result] {'success': True, 'message': 'Task 2 marked complete.', 'task': {'id': 2, 'title': 'Review Python code', 'done': True}}
Final Answer:
I found the review-related task: "Review Python code" (task 2), marked it as complete, and updated the task list successfully.
Why This Is an Agent Loop
This application now supports:
- Multi-step reasoning
- Dynamic tool selection
- State-aware progress
- Iterative completion of a goal
That is the core of many practical agents.
10. Design Considerations for Agentic Systems
10.1 Keep Tools Narrow and Clear
Bad tool:
- “DoAnything”
Better tools:
- search_docs
- get_tasks
- complete_task
- send_email
Small focused tools are easier for the model to use correctly.
10.2 Validate Arguments
Never trust model-generated arguments blindly.
Check:
- Required fields
- Data types
- Allowed values
- Security constraints
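As a sketch, a hand-rolled validator for the complete_task arguments defined in this session might look like this (a schema library such as jsonschema would be more thorough):

```python
def validate_complete_task_args(arguments):
    """Check model-generated arguments before executing the tool."""
    if not isinstance(arguments, dict):
        return False, "Arguments must be an object."
    if "task_id" not in arguments:
        return False, "Missing required field: task_id."
    # Reject bools explicitly: isinstance(True, int) is True in Python.
    if not isinstance(arguments["task_id"], int) or isinstance(arguments["task_id"], bool):
        return False, "task_id must be an integer."
    if arguments["task_id"] <= 0:
        return False, "task_id must be a positive ID."
    unexpected = set(arguments) - {"task_id"}
    if unexpected:
        return False, f"Unexpected fields: {sorted(unexpected)}"
    return True, "ok"

print(validate_complete_task_args({"task_id": 2}))    # (True, 'ok')
print(validate_complete_task_args({"task_id": "2"}))  # (False, 'task_id must be an integer.')
```

Returning an error message instead of raising lets you feed the failure back to the model as a tool result so it can correct itself.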
10.3 Limit the Loop
Always include protections like:
- max_turns
- timeouts
- rate limits
- safe fallbacks
10.4 Log Everything
For debugging, log:
- User goal
- Tool calls
- Tool arguments
- Tool results
- Final output
10.5 Keep Humans in the Loop
For risky actions, require confirmation before:
- Sending emails
- Deleting data
- Purchasing items
- Updating production systems
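One way to enforce this is a confirmation gate around risky tools. The confirmer is injected as a callable so the sketch stays testable; in a CLI you could pass an input-based prompt instead. The tool names and helpers here are illustrative:

```python
RISKY_TOOLS = {"send_email", "delete_data", "purchase_item"}

def guarded_run(tool_name, arguments, run_tool, confirm):
    """Require explicit confirmation before executing risky tools.

    `run_tool` executes the tool; `confirm` is any callable returning
    True/False (e.g. a CLI prompt or an approval UI).
    """
    if tool_name in RISKY_TOOLS:
        if not confirm(f"Allow {tool_name} with {arguments}?"):
            return {"success": False, "message": f"{tool_name} cancelled by user."}
    return run_tool(tool_name, arguments)

# Demo: a fake tool runner and an auto-deny confirmer.
result = guarded_run(
    "send_email",
    {"to": "team@example.com"},
    run_tool=lambda name, args: {"success": True},
    confirm=lambda prompt: False,
)
print(result)  # {'success': False, 'message': 'send_email cancelled by user.'}
```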
11. Hands-On Exercise 3: Add Simple Memory/State
Goal
Extend the task agent so it remembers a user preference during the session.
Example preference:
- The user likes compact summaries.
This is not long-term memory in a database. It is just lightweight session state.
Why This Matters
Agents often need some working memory to adapt behavior across steps.
Code
import json
from openai import OpenAI
client = OpenAI()
TASKS = [
{"id": 1, "title": "Write session notes", "done": False},
{"id": 2, "title": "Review Python code", "done": False},
]
# Simple in-memory session state.
SESSION_STATE = {
"summary_style": "compact"
}
def get_tasks():
"""Return all tasks."""
return {"tasks": TASKS}
def complete_task(task_id):
"""Mark a task complete."""
for task in TASKS:
if task["id"] == task_id:
task["done"] = True
return {"success": True, "task": task}
return {"success": False, "message": "Task not found"}
TOOLS = [
{
"type": "function",
"name": "get_tasks",
"description": "Get the current tasks.",
"parameters": {
"type": "object",
"properties": {},
"required": [],
"additionalProperties": False,
},
},
{
"type": "function",
"name": "complete_task",
"description": "Mark a task as complete by ID.",
"parameters": {
"type": "object",
"properties": {
"task_id": {"type": "integer"}
},
"required": ["task_id"],
"additionalProperties": False,
},
},
]
def run_tool(name, arguments):
"""Run local tool functions."""
if name == "get_tasks":
return get_tasks()
if name == "complete_task":
return complete_task(arguments["task_id"])
return {"error": f"Unknown tool {name}"}
def run_agent(user_goal, session_state, max_turns=5):
"""
Run the agent with a small amount of session memory.
"""
system_prompt = (
"You are a task agent. "
f"The user's preferred summary style is: {session_state['summary_style']}. "
"Use tools when needed. "
"When finished, respond in the preferred style."
)
response = client.responses.create(
model="gpt-5.4-mini",
input=[
{"role": "system", "content": system_prompt},
{"role": "user", "content": user_goal},
],
tools=TOOLS,
)
for _ in range(max_turns):
tool_outputs = []
for item in response.output:
if item.type == "function_call":
args = json.loads(item.arguments)
result = run_tool(item.name, args)
tool_outputs.append(
{
"type": "function_call_output",
"call_id": item.call_id,
"output": json.dumps(result),
}
)
if not tool_outputs:
return response.output_text
response = client.responses.create(
model="gpt-5.4-mini",
previous_response_id=response.id,
input=tool_outputs,
tools=TOOLS,
)
return "The agent could not complete the task in time."
if __name__ == "__main__":
user_goal = "Complete task 2 and summarize the result."
result = run_agent(user_goal, SESSION_STATE)
print(result)
Example Output
Task 2 completed successfully. Summary: "Review Python code" is now done.
Discussion
This is a simple example of working memory:
- The user preference is stored separately
- It is injected into the agent’s context
- The final response changes based on remembered state
12. Common Failure Modes in Agents
12.1 Infinite or Wasteful Loops
The model keeps asking for tools when it should stop.
Mitigation:
- Limit steps
- Add stronger system instructions
- Detect repeated calls
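Repeated identical calls can be detected by counting (tool name, arguments) pairs; a minimal sketch:

```python
import json

def make_repeat_detector(max_repeats=2):
    """Return a checker that flags when the same tool call recurs too often."""
    counts = {}

    def seen_too_often(tool_name, arguments):
        # Serialize arguments with sorted keys so equivalent dicts match.
        key = (tool_name, json.dumps(arguments, sort_keys=True))
        counts[key] = counts.get(key, 0) + 1
        return counts[key] > max_repeats

    return seen_too_often

check = make_repeat_detector(max_repeats=2)
print(check("get_tasks", {}))  # False (1st call)
print(check("get_tasks", {}))  # False (2nd call)
print(check("get_tasks", {}))  # True  (3rd call exceeds the limit)
```

Inside the agent loop, a `True` result can trigger an early exit or a corrective system message instead of another model turn.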
12.2 Wrong Tool Selection
The model picks an inappropriate tool.
Mitigation:
- Improve tool descriptions
- Reduce overlapping tools
- Add validation and fallback logic
12.3 Hallucinated Assumptions
The model invents facts instead of checking tools.
Mitigation:
- Instruct it to verify task state before acting
- Prefer grounded workflows
- Require tool use for sensitive operations
12.4 Bad Arguments
The model sends malformed or incomplete inputs.
Mitigation:
- Use strict JSON schemas
- Validate in Python
- Return clear error results
12.5 Premature Completion
The model says it is done before actually finishing the task.
Mitigation:
- Ask for explicit verification
- Build completion checks into the loop
- Return status flags from tools
13. Practical Mental Model
A useful mental model is:
- LLM = reasoning engine
- Tools = actions/sensors
- Python app = controller
- State store = memory
- Loop = agent behavior
The model should not be treated as the whole application.
Instead, your application should orchestrate the model safely and deliberately.
14. Mini Lab
Challenge
Build a small reading-list agent.
It should support these tools:
- get_books()
- mark_book_read(book_id)
User Goal
“Find the unread Python-related book, mark it as read, and summarize the result.”
Suggested Data
BOOKS = [
{"id": 1, "title": "Python Crash Course", "read": False},
{"id": 2, "title": "Deep Learning Basics", "read": False},
{"id": 3, "title": "Effective Testing in Python", "read": False},
]
Stretch Goal
Add a user preference:
"summary_style": "bullets"
Then ask the agent to format the result accordingly.
15. Recap
In this session, you learned that:
- A chatbot mainly responds to prompts.
- An agent is goal-oriented and can take actions.
- Tool use is a major step from chatbot to agent.
- A control loop enables multi-step task completion.
- State and memory help agents behave more intelligently.
- Agentic systems need safeguards like validation, logging, and loop limits.
Useful Resources
- OpenAI Responses API Guide
- OpenAI API Reference
- OpenAI Python SDK
- JSON Schema Reference
- Python json module documentation
Suggested Homework
- Modify the agent loop so it logs all tool calls to a list and prints them at the end.
- Add a new tool called create_task(title) and let the agent create tasks from user requests.
- Prevent duplicate completions by returning a clear "already done" status.
- Extend session memory with:
- preferred summary style
- preferred verbosity
- Build a small CLI version where the user can interactively give goals to the agent.
End-of-Session Checkpoint
You are ready for the next session if you can:
- Describe the difference between chatbot and agent in one sentence.
- Explain what a tool call is.
- Write a loop that lets the model call tools multiple times.
- Add a small piece of state or memory to influence agent behavior.