Session 3: From Chatbots to Agents

Synopsis

Explains the difference between single-turn prompts, conversational assistants, workflows, and autonomous or semi-autonomous agents. Learners see how planning, memory, and tool usage extend LLMs into agentic systems.

Session Content

Session Overview

In this session, learners move from basic LLM-powered chat applications to agentic systems that can plan, use tools, remember context, and act toward goals. The focus is on understanding what makes an application an agent rather than just a chatbot, and on implementing simple agent-like patterns in Python using the OpenAI Responses API with gpt-5.4-mini.

Duration

~45 minutes

Learning Objectives

By the end of this session, learners will be able to:

  • Explain the difference between a chatbot and an agent.
  • Identify key components of agentic systems: goals, memory, tools, control loop, and environment.
  • Build a simple multi-step agent loop in Python.
  • Use the OpenAI Responses API to let a model decide when to call tools.
  • Implement a lightweight task-oriented assistant that behaves more like an agent than a chatbot.

1. Chatbots vs Agents

What is a Chatbot?

A chatbot is typically:

  • Reactive
  • Prompt/response driven
  • Focused on conversation
  • Limited to the information in the prompt and conversation history
  • Not inherently capable of acting on the world unless explicitly wired to tools

Examples:

  • FAQ bot
  • Documentation assistant
  • Customer support responder

What is an Agent?

An agent is typically:

  • Goal-oriented
  • Able to decide among actions
  • Often capable of using tools
  • Designed to operate in a loop: observe → reason → act → observe
  • Sometimes equipped with memory or state
  • Able to complete multi-step tasks

Examples:

  • A meeting scheduling assistant
  • A coding assistant that reads files, edits code, and runs tests
  • A research assistant that searches, summarizes, and compiles findings

Core Difference

A chatbot mainly answers.

An agent tries to do.


2. Anatomy of an Agentic System

A practical agent often includes the following pieces:

2.1 Goal

The agent needs a clear task.

Examples:

  • “Summarize this support inbox.”
  • “Find the cheapest flight under these constraints.”
  • “Draft a weekly project update from recent notes.”

2.2 State

The system needs to track progress.

Examples:

  • User preferences
  • Completed steps
  • Retrieved information
  • Pending tasks

2.3 Tools

Agents often interact with functions or APIs.

Examples:

  • Calculator
  • Search function
  • Weather lookup
  • Database query
  • File read/write

2.4 Decision-Making

The model helps decide:

  • What to do next
  • Which tool to use
  • When the task is complete

2.5 Control Loop

A common agent loop looks like:

  1. Receive goal
  2. Ask model what to do next
  3. If a tool is needed, call it
  4. Return tool result to model
  5. Repeat until done

2.6 Memory

Memory can be:

  • Short-term: current conversation/task context
  • Long-term: saved preferences or prior knowledge
  • Working memory: scratchpad/state for current execution
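As a rough sketch, these three memory kinds can be modeled as plain Python state. The `SessionMemory` class and its field names below are illustrative, not part of any library:

```python
class SessionMemory:
    """Illustrative container for the three memory kinds; not a library API."""

    def __init__(self):
        self.short_term = []   # recent conversation turns for the current task
        self.long_term = {}    # saved preferences that outlive a single task
        self.scratchpad = {}   # working state for the current execution

    def remember_turn(self, role, content):
        self.short_term.append({"role": role, "content": content})

    def set_preference(self, key, value):
        self.long_term[key] = value


memory = SessionMemory()
memory.set_preference("summary_style", "compact")
memory.remember_turn("user", "Complete task 2.")
print(memory.long_term["summary_style"])  # compact
```

Exercise 3 later in this session uses exactly this idea, with a plain dictionary as the session state.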

3. Agent Patterns You’ll Use Often

3.1 Single-Turn Chatbot Pattern

User asks a question, model answers directly.

User -> Model -> Answer

3.2 Tool-Augmented Chat Pattern

The model can call functions when it needs external information.

User -> Model -> Tool Call -> Tool Result -> Model -> Answer

3.3 Agent Loop Pattern

The model repeatedly reasons and acts until the task is complete.

Goal -> Model -> Action -> Result -> Model -> Action -> Result -> Final Output

3.4 Planner-Executor Pattern

The system separates planning from execution.

  • Planner creates steps
  • Executor carries out each step
  • Model updates based on observations

This is useful for larger workflows.
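The split can be sketched with a stubbed planner. In a real system the `plan` function would be an LLM call that decomposes the goal; all names here are illustrative:

```python
def plan(goal):
    """Stub planner. A real planner would ask the model to break the goal into steps."""
    return ["inspect_tasks", "complete_matching_task", "summarize"]


def execute(step, state):
    """Stub executor. A real executor would call tools and record observations."""
    state["log"].append(step)
    return state


def run(goal):
    """Planner creates steps; executor carries out each one against shared state."""
    state = {"log": []}
    for step in plan(goal):
        state = execute(step, state)
    return state


result = run("Complete the review task")
print(result["log"])  # ['inspect_tasks', 'complete_matching_task', 'summarize']
```

The key design point is the separation: the planner never touches tools, and the executor never re-plans, which keeps each side easier to test.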


4. When Should You Use an Agent?

Use a simple chatbot when:

  • The task is mostly Q&A
  • You do not need tool use
  • No multi-step workflow is required

Use an agent when:

  • The task has multiple steps
  • The system needs external data or actions
  • The model must decide what to do next
  • Tracking progress matters

Important Design Principle

Do not build an agent unless the problem needs one.

Agentic systems are more powerful, but also:

  • More complex
  • Harder to debug
  • More expensive
  • More likely to fail in subtle ways

5. First Example: A Simple Chatbot

This first example is intentionally non-agentic. It shows a standard chat-style interaction using the Responses API.

Python Example: Basic Chatbot

from openai import OpenAI

# Create a client using your OPENAI_API_KEY environment variable.
client = OpenAI()

response = client.responses.create(
    model="gpt-5.4-mini",
    input=[
        {
            "role": "system",
            "content": "You are a helpful Python learning assistant."
        },
        {
            "role": "user",
            "content": "Explain what a Python dictionary is in simple terms."
        }
    ]
)

# Print the final text output from the model.
print(response.output_text)

Example Output

A Python dictionary is a way to store information as key-value pairs.

Think of it like a real dictionary:
- the key is like the word
- the value is like the definition

Example:
{"name": "Alice", "age": 30}

Here, "name" and "age" are keys, and "Alice" and 30 are values.

Discussion

This is useful, but it is still just a chatbot:

  • No tools
  • No planning
  • No action loop
  • No memory beyond what you pass in

6. Adding Tools: The First Step Toward Agents

To become more agent-like, the system needs the ability to interact with something beyond pure text generation.

Let’s create a small assistant with tools.

Scenario

We want a task assistant that can:

  • Look up a fake task list
  • Mark tasks as done
  • Answer the user using those actions

This creates the foundation for agentic behavior.


7. Hands-On Exercise 1: Tool-Using Task Assistant

Goal

Build a simple assistant that can:

  • Check current tasks
  • Mark a task complete
  • Respond naturally to the user

What You’ll Learn

  • How to define tool schemas
  • How to inspect tool calls from the model
  • How to execute Python functions based on model requests
  • How to pass tool results back using the Responses API

Code

import json
from openai import OpenAI

# Initialize the OpenAI client.
client = OpenAI()

# A tiny in-memory task store for demonstration purposes.
TASKS = [
    {"id": 1, "title": "Write session notes", "done": False},
    {"id": 2, "title": "Review Python code", "done": False},
    {"id": 3, "title": "Send project update", "done": True},
]


def get_tasks():
    """
    Return all tasks as a dictionary. In a real app, this might query a database.
    """
    return {"tasks": TASKS}


def complete_task(task_id):
    """
    Mark a task complete by ID.
    Returns a structured result describing success or failure.
    """
    for task in TASKS:
        if task["id"] == task_id:
            task["done"] = True
            return {
                "success": True,
                "message": f"Task {task_id} marked as complete.",
                "task": task,
            }

    return {
        "success": False,
        "message": f"Task {task_id} was not found."
    }


# Define the tools the model is allowed to use.
TOOLS = [
    {
        "type": "function",
        "name": "get_tasks",
        "description": "Get the current task list.",
        "parameters": {
            "type": "object",
            "properties": {},
            "required": [],
            "additionalProperties": False,
        },
    },
    {
        "type": "function",
        "name": "complete_task",
        "description": "Mark a task as complete by task ID.",
        "parameters": {
            "type": "object",
            "properties": {
                "task_id": {
                    "type": "integer",
                    "description": "The ID of the task to complete."
                }
            },
            "required": ["task_id"],
            "additionalProperties": False,
        },
    },
]


def run_tool(tool_name, arguments):
    """
    Execute the selected tool with the provided arguments.
    """
    if tool_name == "get_tasks":
        return get_tasks()
    elif tool_name == "complete_task":
        return complete_task(arguments["task_id"])
    else:
        return {"error": f"Unknown tool: {tool_name}"}


# Initial user request.
conversation = [
    {
        "role": "system",
        "content": (
            "You are a task assistant. "
            "Use tools when necessary to inspect or update tasks. "
            "Be concise and helpful."
        ),
    },
    {
        "role": "user",
        "content": "Please mark task 2 as complete and tell me the updated task list."
    },
]

# First model call: let the model decide whether it wants to use tools.
response = client.responses.create(
    model="gpt-5.4-mini",
    input=conversation,
    tools=TOOLS,
)

# Collect tool outputs that will be sent back to the model.
tool_messages = []

# Inspect the output items for function calls.
for item in response.output:
    if item.type == "function_call":
        tool_name = item.name
        arguments = json.loads(item.arguments)

        # Run the selected tool locally in Python.
        result = run_tool(tool_name, arguments)

        # Add the tool result in the format expected by the Responses API.
        tool_messages.append(
            {
                "type": "function_call_output",
                "call_id": item.call_id,
                "output": json.dumps(result),
            }
        )

# If the model requested tools, send the tool results back for a final answer.
if tool_messages:
    followup_response = client.responses.create(
        model="gpt-5.4-mini",
        previous_response_id=response.id,
        input=tool_messages,
        tools=TOOLS,
    )
    print(followup_response.output_text)
else:
    # If no tool was called, just print the direct response.
    print(response.output_text)

Example Output

Done — task 2 has been marked as complete.

Updated task list:
1. Write session notes — not done
2. Review Python code — done
3. Send project update — done

What Makes This More Agentic?

Compared to a basic chatbot:

  • The model can inspect state
  • The model can act through tools
  • The application executes real logic
  • The final answer is grounded in tool results

Still, this is not a full agent loop yet. It is a single tool-use cycle.


8. From Tool Use to Agent Loops

A real agent often needs multiple steps.

Example goal:

“Find incomplete tasks, complete the review-related one, and then summarize what changed.”

This may require:

  1. Inspect tasks
  2. Decide which task matches
  3. Mark it complete
  4. Confirm updated status
  5. Report result

This pattern is iterative.


9. Hands-On Exercise 2: Build a Minimal Agent Loop

Goal

Build a loop that allows the model to:

  • Decide what tool to call
  • Receive tool results
  • Continue until it reaches a final answer

Key Idea

Instead of assuming only one tool call round, we allow repeated cycles.

Code

import json
from openai import OpenAI

client = OpenAI()

# Demo task store.
TASKS = [
    {"id": 1, "title": "Write session notes", "done": False},
    {"id": 2, "title": "Review Python code", "done": False},
    {"id": 3, "title": "Send project update", "done": False},
]


def get_tasks():
    """Return all tasks."""
    return {"tasks": TASKS}


def complete_task(task_id):
    """Mark a task complete by ID."""
    for task in TASKS:
        if task["id"] == task_id:
            if task["done"]:
                return {
                    "success": True,
                    "message": f"Task {task_id} was already complete.",
                    "task": task,
                }
            task["done"] = True
            return {
                "success": True,
                "message": f"Task {task_id} marked complete.",
                "task": task,
            }
    return {"success": False, "message": f"Task {task_id} not found."}


TOOLS = [
    {
        "type": "function",
        "name": "get_tasks",
        "description": "Return the current task list.",
        "parameters": {
            "type": "object",
            "properties": {},
            "required": [],
            "additionalProperties": False,
        },
    },
    {
        "type": "function",
        "name": "complete_task",
        "description": "Mark a task complete using its numeric ID.",
        "parameters": {
            "type": "object",
            "properties": {
                "task_id": {
                    "type": "integer",
                    "description": "The task ID to complete."
                }
            },
            "required": ["task_id"],
            "additionalProperties": False,
        },
    },
]


def run_tool(name, arguments):
    """Dispatch tool calls to Python functions."""
    if name == "get_tasks":
        return get_tasks()
    if name == "complete_task":
        return complete_task(arguments["task_id"])
    return {"error": f"Unsupported tool: {name}"}


def run_agent(user_goal, max_turns=5):
    """
    Run a simple agent loop.

    The model can repeatedly request tool calls.
    After each tool result, we send the result back and allow it to continue.
    The loop stops when the model no longer requests tools.
    """
    response = client.responses.create(
        model="gpt-5.4-mini",
        input=[
            {
                "role": "system",
                "content": (
                    "You are a task agent. "
                    "Use tools to inspect tasks and update them. "
                    "Continue until the user's goal is completed, then provide a concise summary."
                ),
            },
            {
                "role": "user",
                "content": user_goal,
            },
        ],
        tools=TOOLS,
    )

    for turn in range(max_turns):
        tool_outputs = []

        for item in response.output:
            if item.type == "function_call":
                tool_name = item.name
                arguments = json.loads(item.arguments)
                result = run_tool(tool_name, arguments)

                print(f"[Tool Call] {tool_name}({arguments})")
                print(f"[Tool Result] {result}")

                tool_outputs.append(
                    {
                        "type": "function_call_output",
                        "call_id": item.call_id,
                        "output": json.dumps(result),
                    }
                )

        # If there are no tool calls, the model is done.
        if not tool_outputs:
            return response.output_text

        # Continue the reasoning loop with tool outputs.
        response = client.responses.create(
            model="gpt-5.4-mini",
            previous_response_id=response.id,
            input=tool_outputs,
            tools=TOOLS,
        )

    return "Agent stopped after reaching max_turns without finishing."


if __name__ == "__main__":
    goal = "Find the review-related task, complete it, and tell me what changed."
    final_answer = run_agent(goal)
    print("\nFinal Answer:")
    print(final_answer)

Example Output

[Tool Call] get_tasks({})
[Tool Result] {'tasks': [{'id': 1, 'title': 'Write session notes', 'done': False}, {'id': 2, 'title': 'Review Python code', 'done': False}, {'id': 3, 'title': 'Send project update', 'done': False}]}
[Tool Call] complete_task({'task_id': 2})
[Tool Result] {'success': True, 'message': 'Task 2 marked complete.', 'task': {'id': 2, 'title': 'Review Python code', 'done': True}}

Final Answer:
I found the review-related task: "Review Python code" (task 2), marked it as complete, and updated the task list successfully.

Why This Is an Agent Loop

This application now supports:

  • Multi-step reasoning
  • Dynamic tool selection
  • State-aware progress
  • Iterative completion of a goal

That is the core of many practical agents.


10. Design Considerations for Agentic Systems

10.1 Keep Tools Narrow and Clear

Bad tool:

  • “DoAnything”

Better tools:

  • search_docs
  • get_tasks
  • complete_task
  • send_email

Small focused tools are easier for the model to use correctly.

10.2 Validate Arguments

Never trust model-generated arguments blindly.

Check:

  • Required fields
  • Data types
  • Allowed values
  • Security constraints
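One possible shape for such a check, using the complete_task tool from the exercises in this session (the validator function is made up for illustration, not an SDK feature):

```python
def validate_complete_task_args(arguments):
    """Check model-generated arguments before running the tool.

    Illustrative validator: returns (ok, message) so the caller can
    send a clear error result back to the model instead of crashing.
    """
    task_id = arguments.get("task_id")
    if task_id is None:
        return False, "Missing required field: task_id"
    # Exclude bool explicitly, since bool is a subclass of int in Python.
    if not isinstance(task_id, int) or isinstance(task_id, bool):
        return False, "task_id must be an integer"
    if task_id <= 0:
        return False, "task_id must be a positive ID"
    return True, "ok"


print(validate_complete_task_args({"task_id": 2}))    # (True, 'ok')
print(validate_complete_task_args({"task_id": "2"}))  # (False, 'task_id must be an integer')
```

Returning the failure message as a tool result (rather than raising) lets the model correct itself on the next turn.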

10.3 Limit the Loop

Always include protections like:

  • max_turns
  • timeouts
  • rate limits
  • safe fallbacks
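A minimal sketch of a loop guard that combines a turn cap with a wall-clock timeout (the function names are illustrative; in the exercises, the body of `step_fn` would be one model call plus tool execution):

```python
import time


def guarded_loop(step_fn, max_turns=5, timeout_s=30.0):
    """Run step_fn until it reports completion, a turn cap, or a timeout.

    step_fn(turn) should return True when the agent is finished.
    """
    start = time.monotonic()
    for turn in range(max_turns):
        if time.monotonic() - start > timeout_s:
            return "stopped: timeout"
        if step_fn(turn):
            return "finished"
    return "stopped: max_turns"


print(guarded_loop(lambda turn: turn == 2))           # finished
print(guarded_loop(lambda turn: False, max_turns=3))  # stopped: max_turns
```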

10.4 Log Everything

For debugging, log:

  • User goal
  • Tool calls
  • Tool arguments
  • Tool results
  • Final output
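A lightweight way to capture these records is a structured in-memory log (an illustrative helper, not a library API; a production system would write to a file or a real logger):

```python
import json
import time


def log_event(log, kind, payload):
    """Append one timestamped, structured record to the run log."""
    log.append({"ts": time.time(), "kind": kind, "payload": payload})


run_log = []
log_event(run_log, "goal", {"text": "Complete task 2"})
log_event(run_log, "tool_call", {"name": "complete_task", "args": {"task_id": 2}})
log_event(run_log, "tool_result", {"success": True})
log_event(run_log, "final_output", {"text": "Task 2 is done."})

print(json.dumps([e["kind"] for e in run_log]))
# ["goal", "tool_call", "tool_result", "final_output"]
```

Because every record is JSON-serializable, the whole run can be dumped and replayed when debugging a failed trajectory.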

10.5 Keep Humans in the Loop

For risky actions, require confirmation before:

  • Sending emails
  • Deleting data
  • Purchasing items
  • Updating production systems
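One sketch of a confirmation gate wrapped around tool dispatch (tool names and helpers are illustrative; the `confirm` callback is injectable so it can be a terminal prompt in a real app and a stub in tests):

```python
RISKY_TOOLS = {"send_email", "delete_data", "purchase_item"}


def run_tool_safely(name, arguments, confirm):
    """Gate risky tools behind a human confirmation callback.

    `confirm(name, arguments)` asks the human (e.g. via input()) and
    returns True to proceed. Safe tools run without confirmation.
    """
    if name in RISKY_TOOLS and not confirm(name, arguments):
        return {"success": False, "message": f"{name} cancelled by user."}
    # A real dispatcher would call the actual tool function here.
    return {"success": True, "message": f"{name} executed."}


print(run_tool_safely("send_email", {"to": "team@example.com"},
                      confirm=lambda n, a: False))
# {'success': False, 'message': 'send_email cancelled by user.'}
```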

11. Hands-On Exercise 3: Add Simple Memory/State

Goal

Extend the task agent so it remembers a user preference during the session.

Example preference:

  • The user likes compact summaries.

This is not long-term memory in a database. It is just lightweight session state.

Why This Matters

Agents often need some working memory to adapt behavior across steps.

Code

import json
from openai import OpenAI

client = OpenAI()

TASKS = [
    {"id": 1, "title": "Write session notes", "done": False},
    {"id": 2, "title": "Review Python code", "done": False},
]

# Simple in-memory session state.
SESSION_STATE = {
    "summary_style": "compact"
}


def get_tasks():
    """Return all tasks."""
    return {"tasks": TASKS}


def complete_task(task_id):
    """Mark a task complete."""
    for task in TASKS:
        if task["id"] == task_id:
            task["done"] = True
            return {"success": True, "task": task}
    return {"success": False, "message": "Task not found"}


TOOLS = [
    {
        "type": "function",
        "name": "get_tasks",
        "description": "Get the current tasks.",
        "parameters": {
            "type": "object",
            "properties": {},
            "required": [],
            "additionalProperties": False,
        },
    },
    {
        "type": "function",
        "name": "complete_task",
        "description": "Mark a task as complete by ID.",
        "parameters": {
            "type": "object",
            "properties": {
                "task_id": {"type": "integer"}
            },
            "required": ["task_id"],
            "additionalProperties": False,
        },
    },
]


def run_tool(name, arguments):
    """Run local tool functions."""
    if name == "get_tasks":
        return get_tasks()
    if name == "complete_task":
        return complete_task(arguments["task_id"])
    return {"error": f"Unknown tool {name}"}


def run_agent(user_goal, session_state, max_turns=5):
    """
    Run the agent with a small amount of session memory.
    """
    system_prompt = (
        "You are a task agent. "
        f"The user's preferred summary style is: {session_state['summary_style']}. "
        "Use tools when needed. "
        "When finished, respond in the preferred style."
    )

    response = client.responses.create(
        model="gpt-5.4-mini",
        input=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_goal},
        ],
        tools=TOOLS,
    )

    for _ in range(max_turns):
        tool_outputs = []

        for item in response.output:
            if item.type == "function_call":
                args = json.loads(item.arguments)
                result = run_tool(item.name, args)

                tool_outputs.append(
                    {
                        "type": "function_call_output",
                        "call_id": item.call_id,
                        "output": json.dumps(result),
                    }
                )

        if not tool_outputs:
            return response.output_text

        response = client.responses.create(
            model="gpt-5.4-mini",
            previous_response_id=response.id,
            input=tool_outputs,
            tools=TOOLS,
        )

    return "The agent could not complete the task in time."


if __name__ == "__main__":
    user_goal = "Complete task 2 and summarize the result."
    result = run_agent(user_goal, SESSION_STATE)
    print(result)

Example Output

Task 2 completed successfully. Summary: "Review Python code" is now done.

Discussion

This is a simple example of working memory:

  • The user preference is stored separately
  • It is injected into the agent’s context
  • The final response changes based on remembered state

12. Common Failure Modes in Agents

12.1 Infinite or Wasteful Loops

The model keeps asking for tools when it should stop.

Mitigation:

  • Limit steps
  • Add stronger system instructions
  • Detect repeated calls
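Repeated-call detection can be as simple as comparing each new (tool, arguments) pair against recent history. A sketch (the helper and its history format are illustrative):

```python
import json


def is_repeated_call(history, name, arguments, window=3):
    """Flag a tool call if the identical call appeared in the last few turns.

    Arguments are serialized with sorted keys so equal dicts compare equal.
    """
    key = (name, json.dumps(arguments, sort_keys=True))
    return key in history[-window:]


history = []
print(is_repeated_call(history, "get_tasks", {}))  # False
history.append(("get_tasks", json.dumps({}, sort_keys=True)))
print(is_repeated_call(history, "get_tasks", {}))  # True
```

When a repeat is detected, a reasonable response is to return an error-style tool result telling the model it already has that information.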

12.2 Wrong Tool Selection

The model picks an inappropriate tool.

Mitigation:

  • Improve tool descriptions
  • Reduce overlapping tools
  • Add validation and fallback logic

12.3 Hallucinated Assumptions

The model invents facts instead of checking tools.

Mitigation:

  • Instruct it to verify task state before acting
  • Prefer grounded workflows
  • Require tool use for sensitive operations

12.4 Bad Arguments

The model sends malformed or incomplete inputs.

Mitigation:

  • Use strict JSON schemas
  • Validate in Python
  • Return clear error results

12.5 Premature Completion

The model says it is done before actually finishing the task.

Mitigation:

  • Ask for explicit verification
  • Build completion checks into the loop
  • Return status flags from tools
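A completion check grounded in actual tool state, rather than the model's claim, might look like this sketch (using the task structure from the exercises; the helper name is illustrative):

```python
def goal_complete(tasks, required_ids):
    """Verify completion from real task state instead of the model's claim."""
    by_id = {t["id"]: t for t in tasks}
    # Every required task must exist and actually be marked done.
    return all(i in by_id and by_id[i]["done"] for i in required_ids)


tasks = [
    {"id": 1, "title": "Write session notes", "done": True},
    {"id": 2, "title": "Review Python code", "done": False},
]
print(goal_complete(tasks, {1}))     # True
print(goal_complete(tasks, {1, 2}))  # False
```

The agent loop can run this check before accepting a final answer, and otherwise send the model a message that the goal is not yet met.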


13. Practical Mental Model

A useful mental model is:

  • LLM = reasoning engine
  • Tools = actions/sensors
  • Python app = controller
  • State store = memory
  • Loop = agent behavior

The model should not be treated as the whole application.

Instead, your application should orchestrate the model safely and deliberately.


14. Mini Lab

Challenge

Build a small reading-list agent.

It should support these tools:

  • get_books()
  • mark_book_read(book_id)

User Goal

“Find the unread Python-related book, mark it as read, and summarize the result.”

Suggested Data

BOOKS = [
    {"id": 1, "title": "Python Crash Course", "read": False},
    {"id": 2, "title": "Deep Learning Basics", "read": False},
    {"id": 3, "title": "Effective Testing in Python", "read": False},
]

Stretch Goal

Add a user preference:

  • "summary_style": "bullets"

Then ask the agent to format the result accordingly.


15. Recap

In this session, you learned that:

  • A chatbot mainly responds to prompts.
  • An agent is goal-oriented and can take actions.
  • Tool use is a major step from chatbot to agent.
  • A control loop enables multi-step task completion.
  • State and memory help agents behave more intelligently.
  • Agentic systems need safeguards like validation, logging, and loop limits.


Suggested Homework

  1. Modify the agent loop so it logs all tool calls to a list and prints them at the end.
  2. Add a new tool called create_task(title) and let the agent create tasks from user requests.
  3. Prevent duplicate completions by returning a clear “already done” status.
  4. Extend session memory with:
     • preferred summary style
     • preferred verbosity
  5. Build a small CLI version where the user can interactively give goals to the agent.

End-of-Session Checkpoint

You are ready for the next session if you can:

  • Describe the difference between chatbot and agent in one sentence.
  • Explain what a tool call is.
  • Write a loop that lets the model call tools multiple times.
  • Add a small piece of state or memory to influence agent behavior.
