Session 2: Designing Callable Functions for Agents

Synopsis

Covers how to define clean tool interfaces, parameters, outputs, and validation rules in Python. Learners understand how good tool design makes agent behavior more interpretable and controllable.

Session Content

Session Overview

In this session, learners will move from basic prompt-driven interactions to tool-augmented agent design by learning how to define callable functions that an LLM can use safely and effectively. The focus is on designing clear, constrained, and reliable function interfaces that work well with agentic systems.

By the end of this session, learners will be able to:

  • Explain why agents use callable functions/tools
  • Design function schemas that are easy for models to use correctly
  • Implement function calling with the OpenAI Responses API
  • Validate and execute tool calls in Python
  • Build a small multi-function workflow that demonstrates practical agent behavior

Learning Objectives

After this session, learners should be able to:

  1. Describe the role of tools/functions in agentic applications
  2. Distinguish good and bad function interface design
  3. Create JSON-schema-based tool definitions for the Responses API
  4. Parse and execute tool calls from model responses
  5. Add validation, constraints, and safe defaults to function execution
  6. Build a simple agent loop that supports multiple callable tools

Suggested Timing (~45 Minutes)

  • 5 min — Why agents need callable functions
  • 10 min — Principles of good function design
  • 10 min — Using tools with the OpenAI Responses API
  • 15 min — Hands-on exercises
  • 5 min — Recap and discussion

1. Why Agents Need Callable Functions

Large language models are strong at reasoning over text, but many tasks require interaction with external systems or deterministic logic, such as:

  • Looking up structured data
  • Performing calculations
  • Sending notifications
  • Querying APIs
  • Accessing local business rules
  • Triggering workflows

A callable function gives the model a controlled way to request an action.

Examples of agent tools

  • get_weather(city)
  • search_products(query, max_results)
  • create_support_ticket(customer_id, issue_summary)
  • calculate_shipping(weight_kg, destination_country)
  • get_order_status(order_id)

Why not let the model “just answer”?

Because some tasks need:

  • Fresh data
  • Deterministic logic
  • External side effects
  • Safety constraints
  • Auditable execution paths

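Tool calling separates two concerns:

  • Model reasoning: deciding which tool to call and with what arguments
  • System execution: your application code validating the call and performing the action

This separation is a core idea in agentic development.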


2. Principles of Good Callable Function Design

Designing tools for agents is not the same as designing general-purpose Python functions for humans. The model is the caller, so interfaces should be optimized for clarity, predictability, and safety.

2.1 Keep functions narrowly scoped

Good tools do one thing well.

Good

  • get_weather(city, unit)
  • send_email(to, subject, body)

Less good

  • handle_customer_request(data_blob)

The more ambiguous the tool, the harder it is for the model to call correctly.


2.2 Use explicit parameter names

Parameter names should be descriptive and unambiguous.

Better

  • destination_country
  • customer_id
  • include_tax

Worse

  • dest
  • id
  • flag

2.3 Add strong descriptions

Descriptions help the model know when and how to use the tool.

A good tool definition includes:

  • What the function does
  • What each parameter means
  • Allowed values where relevant
  • When the tool should be used

2.4 Constrain inputs

Use schema constraints whenever possible.

Examples:

  • Enumerated values: "celsius" or "fahrenheit"
  • Required fields
  • Type constraints
  • Reasonable defaults in execution logic

Constraints reduce invalid calls and improve reliability.
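As a minimal sketch of "reasonable defaults in execution logic", the function below applies a default and an upper bound to the max_results parameter of the search_products tool mentioned earlier. The catalog contents, the default of 3, and the limit of 4 are assumptions invented for illustration.

```python
from typing import Optional

# Hypothetical in-memory catalog; a real tool would query a database or API.
CATALOG = ["red mug", "blue mug", "green mug", "travel mug", "mug rack"]

DEFAULT_MAX_RESULTS = 3  # assumed default, applied when the model omits the field
MAX_RESULTS_LIMIT = 4    # assumed hard upper bound enforced by the application


def search_products(query: str, max_results: Optional[int] = None) -> dict:
    """Search the catalog, applying a safe default and an upper bound."""
    if not isinstance(query, str) or not query.strip():
        return {"error": "query must be a non-empty string."}

    if max_results is None:
        max_results = DEFAULT_MAX_RESULTS
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(max_results, bool) or not isinstance(max_results, int) or max_results < 1:
        return {"error": "max_results must be a positive integer."}

    # Clamp instead of failing: an oversized request is reduced to the limit.
    limit = min(max_results, MAX_RESULTS_LIMIT)
    needle = query.strip().lower()
    matches = [name for name in CATALOG if needle in name]
    return {"query": query.strip(), "results": matches[:limit]}


print(search_products("mug"))                  # uses the default of 3 results
print(search_products("mug", max_results=50))  # clamped to 4 results
```

Whether to clamp an oversized value or reject it is a design choice. Clamping keeps the agent moving, while rejecting makes the limit visible to the model.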


2.5 Avoid hidden assumptions

Bad design:

  • Parameter meaning depends on internal company knowledge
  • Ambiguous units
  • Vague formats

Good design:

  • Clearly specify units, formats, and expectations

Example:

  • weight_kg instead of weight
  • travel_date_iso instead of date


2.6 Separate retrieval from action

A useful design pattern:

  • Retrieval tools: read information
  • Action tools: perform side effects

Examples:

  • get_account_balance(account_id) → retrieval
  • transfer_funds(from_account, to_account, amount) → action

This separation supports safer agent behavior.
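One way to make this separation concrete is to record, for each tool, whether it has side effects, and to require explicit confirmation before running an action tool. The sketch below is illustrative only: the account data, the run_tool helper, and the confirmed flag are assumptions, not part of any SDK.

```python
from typing import Callable

# Hypothetical account data for illustration only.
ACCOUNTS = {"ACC-1": 500.0, "ACC-2": 120.0}


def get_account_balance(account_id: str) -> dict:
    """Retrieval tool: reads data and changes nothing."""
    if account_id not in ACCOUNTS:
        return {"error": f"Unknown account: {account_id}"}
    return {"account_id": account_id, "balance": ACCOUNTS[account_id]}


def transfer_funds(from_account: str, to_account: str, amount: float) -> dict:
    """Action tool: mutates state, so the caller gates it."""
    if from_account not in ACCOUNTS or to_account not in ACCOUNTS:
        return {"error": "Unknown account."}
    if amount <= 0 or amount > ACCOUNTS[from_account]:
        return {"error": "Invalid transfer amount."}
    ACCOUNTS[from_account] -= amount
    ACCOUNTS[to_account] += amount
    return {"status": "transferred", "amount": amount}


# Each registry entry records whether the tool has side effects.
TOOLS: dict[str, tuple[Callable[..., dict], bool]] = {
    "get_account_balance": (get_account_balance, False),
    "transfer_funds": (transfer_funds, True),
}


def run_tool(name: str, args: dict, confirmed: bool = False) -> dict:
    """Run retrieval tools freely; require explicit confirmation for actions."""
    if name not in TOOLS:
        return {"error": f"Unknown tool: {name}"}
    func, has_side_effects = TOOLS[name]
    if has_side_effects and not confirmed:
        return {"error": f"Tool '{name}' requires user confirmation."}
    return func(**args)


transfer = {"from_account": "ACC-1", "to_account": "ACC-2", "amount": 100.0}
print(run_tool("get_account_balance", {"account_id": "ACC-1"}))
print(run_tool("transfer_funds", transfer))                  # blocked: not confirmed
print(run_tool("transfer_funds", transfer, confirmed=True))  # executed
```

In a real application the confirmation would come from the user or from a policy check, never from the model's own arguments.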


2.7 Validate before execution

Even if the model provides parameters, your code must still validate them.

Always check:

  • Required fields exist
  • Values are valid
  • Numeric ranges are safe
  • Sensitive actions are gated

LLM output should never bypass application validation.


3. Tool Definitions with the OpenAI Responses API

In the Responses API, tools can be provided so the model can decide when to call them.

A tool definition includes:

  • type
  • name
  • description
  • parameters as a JSON Schema object

Example tool schema

weather_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather for a city. Use this when the user asks about weather conditions or temperature.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "The city name, for example 'Paris' or 'Bengaluru'."
            },
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Temperature unit to return."
            }
        },
        "required": ["city", "unit"],
        "additionalProperties": False
    }
}

Design notes

This schema is good because:

  • It names the function clearly
  • It describes when to use the function
  • It constrains unit to valid values
  • It rejects extra unexpected parameters
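A schema only declares these rules, so your application can also check incoming arguments against it. The sketch below is a deliberately small, hand-rolled checker covering only the keywords used in this session (required, type, enum, additionalProperties). The check_arguments helper is an assumption made for illustration; a full JSON Schema validator library such as jsonschema is usually a better fit for real projects.

```python
import json

# Same shape as the weather schema above, repeated so the sketch is self-contained.
WEATHER_PARAMETERS = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city", "unit"],
    "additionalProperties": False,
}

# Maps the JSON Schema type names used here to Python types.
PYTHON_TYPES = {"string": str, "number": (int, float), "boolean": bool}


def check_arguments(arguments_json: str, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments passed."""
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]

    problems = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required field: {name}")
    for name, value in args.items():
        if name not in properties:
            if schema.get("additionalProperties") is False:
                problems.append(f"unexpected field: {name}")
            continue
        spec = properties[name]
        expected = PYTHON_TYPES.get(spec.get("type"))
        # bool is a subclass of int in Python, so reject it for "number".
        is_bool_as_number = spec.get("type") == "number" and isinstance(value, bool)
        if expected and (not isinstance(value, expected) or is_bool_as_number):
            problems.append(f"wrong type for field: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"invalid value for field: {name}")
    return problems


print(check_arguments('{"city": "Paris", "unit": "celsius"}', WEATHER_PARAMETERS))
print(check_arguments('{"city": "Paris", "unit": "kelvin", "days": 3}', WEATHER_PARAMETERS))
```

The first call prints an empty list. The second reports the invalid unit and the unexpected days field.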

4. Basic Function Calling Flow

A typical function-calling workflow looks like this:

  1. Send user input and tool definitions to the model
  2. The model decides whether to call a tool
  3. Your application reads the requested tool call
  4. Your code validates and executes the function
  5. You send the tool result back to the model
  6. The model produces a final user-facing response

4.1 Example: Single tool call

Python example

import json
from openai import OpenAI

# Create the OpenAI client.
# Ensure OPENAI_API_KEY is set in your environment before running.
client = OpenAI()

# Define a callable tool that the model can use.
weather_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather for a city. Use this when the user asks about weather conditions or temperature.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "The city name."
            },
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "The temperature unit."
            }
        },
        "required": ["city", "unit"],
        "additionalProperties": False
    }
}

def get_weather(city: str, unit: str) -> dict:
    """
    Mock weather function for educational purposes.

    In a real application, this would call a weather API or internal service.
    """
    sample_data = {
        "Paris": {"celsius": 18, "fahrenheit": 64},
        "Tokyo": {"celsius": 24, "fahrenheit": 75},
        "Nairobi": {"celsius": 27, "fahrenheit": 81},
    }

    if city not in sample_data:
        return {
            "city": city,
            "unit": unit,
            "temperature": None,
            "condition": "unknown",
            "message": f"No weather data available for {city}."
        }

    return {
        "city": city,
        "unit": unit,
        "temperature": sample_data[city][unit],
        "condition": "sunny"
    }

# First request: allow the model to decide whether to call the tool.
response = client.responses.create(
    model="gpt-5.4-mini",
    input="What's the weather in Paris in celsius?",
    tools=[weather_tool]
)

# Inspect the response items to find a function call.
tool_call = None
for item in response.output:
    if item.type == "function_call" and item.name == "get_weather":
        tool_call = item
        break

if tool_call is None:
    print("Model did not request a tool call.")
else:
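    # Parse the model-provided JSON arguments.
    args = json.loads(tool_call.arguments)

    # Validate the arguments before execution; never trust them blindly.
    city = args.get("city")
    unit = args.get("unit")
    if not isinstance(city, str) or unit not in ("celsius", "fahrenheit"):
        raise ValueError(f"Invalid arguments for get_weather: {args}")

    # Execute the local Python function.
    tool_result = get_weather(city=city, unit=unit)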

    # Send the function result back to the model using the previous_response_id.
    follow_up = client.responses.create(
        model="gpt-5.4-mini",
        previous_response_id=response.id,
        input=[
            {
                "type": "function_call_output",
                "call_id": tool_call.call_id,
                "output": json.dumps(tool_result)
            }
        ]
    )

    print(follow_up.output_text)

Example output

The current weather in Paris is sunny and 18°C.

5. Validating Tool Calls Safely

Tool calling is powerful, but execution must be controlled.

5.1 Common risks

  • Missing required fields
  • Unexpected extra parameters
  • Invalid enum values
  • Dangerous side effects
  • Function misuse due to vague schemas

5.2 Safe validation pattern

Before executing a tool:

  1. Confirm the tool name is allowed
  2. Parse arguments safely
  3. Validate required fields
  4. Check types and value ranges
  5. Return structured errors when validation fails

5.3 Example: Validation wrapper

import json

def validate_get_weather_args(arguments_json: str) -> dict:
    """
    Validate arguments for the get_weather tool.

    Returns a normalized dictionary if valid.
    Raises ValueError if invalid.
    """
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Arguments are not valid JSON: {exc}") from exc

    required_fields = ["city", "unit"]
    for field in required_fields:
        if field not in args:
            raise ValueError(f"Missing required field: {field}")

    if not isinstance(args["city"], str) or not args["city"].strip():
        raise ValueError("Field 'city' must be a non-empty string.")

    if args["unit"] not in {"celsius", "fahrenheit"}:
        raise ValueError("Field 'unit' must be 'celsius' or 'fahrenheit'.")

    return {
        "city": args["city"].strip(),
        "unit": args["unit"]
    }

# Example usage
raw_args = '{"city": "Paris", "unit": "celsius"}'
validated = validate_get_weather_args(raw_args)
print(validated)

Example output

{'city': 'Paris', 'unit': 'celsius'}

6. Designing Better Tools: Good vs Bad Examples

Example A: Too generic

{
    "type": "function",
    "name": "process_request",
    "description": "Handle the user's request.",
    "parameters": {
        "type": "object",
        "properties": {
            "data": {"type": "string"}
        },
        "required": ["data"]
    }
}

Problems

  • Unclear purpose
  • Ambiguous parameter
  • Hard for model to use reliably
  • Difficult to validate and debug

Example B: Better design

{
    "type": "function",
    "name": "calculate_shipping_cost",
    "description": "Calculate shipping cost for a parcel based on weight and destination country.",
    "parameters": {
        "type": "object",
        "properties": {
            "weight_kg": {
                "type": "number",
                "description": "Parcel weight in kilograms."
            },
            "destination_country": {
                "type": "string",
                "description": "Country where the parcel will be shipped."
            },
            "delivery_speed": {
                "type": "string",
                "enum": ["standard", "express"],
                "description": "Requested delivery speed."
            }
        },
        "required": ["weight_kg", "destination_country", "delivery_speed"],
        "additionalProperties": False
    }
}

Why this is better

  • One clear job
  • Strong field names
  • Constrained input space
  • Easier to test and validate
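A function behind this schema might look like the sketch below. The per-kilogram rates, the country surcharges, the 30 kg limit, and the currency are all invented for illustration. The point is that the function re-checks every constraint the schema expresses, plus a numeric range the schema does not.

```python
# Hypothetical rates and limits, invented for this sketch.
RATE_PER_KG = {"standard": 4.0, "express": 9.0}
COUNTRY_SURCHARGE = {"Kenya": 5.0}  # countries not listed get no surcharge
MAX_WEIGHT_KG = 30.0


def calculate_shipping_cost(weight_kg: float, destination_country: str,
                            delivery_speed: str) -> dict:
    """Validate inputs, then price the parcel using the assumed rates."""
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(weight_kg, bool) or not isinstance(weight_kg, (int, float)):
        return {"error": "weight_kg must be a number."}
    if not 0 < weight_kg <= MAX_WEIGHT_KG:
        return {"error": f"weight_kg must be greater than 0 and at most {MAX_WEIGHT_KG}."}
    if not isinstance(destination_country, str) or not destination_country.strip():
        return {"error": "destination_country must be a non-empty string."}
    if not isinstance(delivery_speed, str) or delivery_speed not in RATE_PER_KG:
        return {"error": "delivery_speed must be 'standard' or 'express'."}

    country = destination_country.strip()
    cost = weight_kg * RATE_PER_KG[delivery_speed] + COUNTRY_SURCHARGE.get(country, 0.0)
    return {
        "weight_kg": weight_kg,
        "destination_country": country,
        "delivery_speed": delivery_speed,
        "cost": round(cost, 2),
        "currency": "USD",  # assumed currency
    }


print(calculate_shipping_cost(2.5, "Kenya", "express"))
print(calculate_shipping_cost(-1, "Kenya", "express"))
```

Returning an error dictionary instead of raising lets the model see what went wrong and either correct the call or explain the problem to the user.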

7. Hands-On Exercise 1: Build a Simple Calculator Tool

Goal

Create a callable calculator function that an agent can use for arithmetic questions.

Skills practiced

  • Tool schema design
  • Tool execution
  • Validation
  • Returning function results to the model

Step-by-step code

import json
from openai import OpenAI

client = OpenAI()

calculator_tool = {
    "type": "function",
    "name": "calculate",
    "description": "Perform a basic arithmetic operation on two numbers.",
    "parameters": {
        "type": "object",
        "properties": {
            "a": {
                "type": "number",
                "description": "The first numeric input."
            },
            "b": {
                "type": "number",
                "description": "The second numeric input."
            },
            "operator": {
                "type": "string",
                "enum": ["add", "subtract", "multiply", "divide"],
                "description": "The arithmetic operation to perform."
            }
        },
        "required": ["a", "b", "operator"],
        "additionalProperties": False
    }
}

def calculate(a: float, b: float, operator: str) -> dict:
    """
    Deterministic calculator function with simple validation.
    """
    if operator == "add":
        result = a + b
    elif operator == "subtract":
        result = a - b
    elif operator == "multiply":
        result = a * b
    elif operator == "divide":
        if b == 0:
            return {"error": "Division by zero is not allowed."}
        result = a / b
    else:
        return {"error": f"Unsupported operator: {operator}"}

    return {
        "a": a,
        "b": b,
        "operator": operator,
        "result": result
    }

response = client.responses.create(
    model="gpt-5.4-mini",
    input="What is 84 divided by 7?",
    tools=[calculator_tool]
)

tool_call = None
for item in response.output:
    if item.type == "function_call" and item.name == "calculate":
        tool_call = item
        break

if tool_call is None:
    print("No calculator tool call was made.")
else:
    args = json.loads(tool_call.arguments)
    tool_result = calculate(
        a=args["a"],
        b=args["b"],
        operator=args["operator"]
    )

    follow_up = client.responses.create(
        model="gpt-5.4-mini",
        previous_response_id=response.id,
        input=[
            {
                "type": "function_call_output",
                "call_id": tool_call.call_id,
                "output": json.dumps(tool_result)
            }
        ]
    )

    print(follow_up.output_text)

Example output

84 divided by 7 is 12.

Exercise task

Modify the calculator tool to support:

  • power
  • modulo

Reflection questions

  • Which inputs should be constrained?
  • What happens if the user asks for unsupported operations?
  • Should the function or the model handle formatting of the final answer?

8. Hands-On Exercise 2: Multi-Tool Agent with Routing

Goal

Build an agent that can choose between multiple tools based on the user’s request.

We will use two tools:

  • get_weather
  • calculate

This introduces a core agentic pattern: tool routing.


Full example

import json
from openai import OpenAI

client = OpenAI()

tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The city name."
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit."
                }
            },
            "required": ["city", "unit"],
            "additionalProperties": False
        }
    },
    {
        "type": "function",
        "name": "calculate",
        "description": "Perform a basic arithmetic operation on two numbers.",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "number", "description": "First number."},
                "b": {"type": "number", "description": "Second number."},
                "operator": {
                    "type": "string",
                    "enum": ["add", "subtract", "multiply", "divide"],
                    "description": "Arithmetic operation."
                }
            },
            "required": ["a", "b", "operator"],
            "additionalProperties": False
        }
    }
]

def get_weather(city: str, unit: str) -> dict:
    """
    Mock weather implementation for teaching.
    """
    data = {
        "London": {"celsius": 16, "fahrenheit": 61},
        "Mumbai": {"celsius": 31, "fahrenheit": 88}
    }

    if city not in data:
        return {
            "error": f"No weather data found for city '{city}'."
        }

    return {
        "city": city,
        "temperature": data[city][unit],
        "unit": unit,
        "condition": "partly cloudy"
    }

def calculate(a: float, b: float, operator: str) -> dict:
    """
    Basic calculator implementation.
    """
    operations = {
        "add": lambda x, y: x + y,
        "subtract": lambda x, y: x - y,
        "multiply": lambda x, y: x * y,
        "divide": lambda x, y: x / y
    }

    # Reject unsupported operators first, then guard against division by zero.
    if operator not in operations:
        return {"error": f"Unsupported operator '{operator}'."}

    if operator == "divide" and b == 0:
        return {"error": "Division by zero is not allowed."}

    return {
        "a": a,
        "b": b,
        "operator": operator,
        "result": operations[operator](a, b)
    }

tool_registry = {
    "get_weather": get_weather,
    "calculate": calculate
}

user_query = "What is 15 multiplied by 6?"

response = client.responses.create(
    model="gpt-5.4-mini",
    input=user_query,
    tools=tools
)

tool_outputs = []

for item in response.output:
    if item.type == "function_call":
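        tool_name = item.name

        if tool_name not in tool_registry:
            result = {"error": f"Unknown tool '{tool_name}'."}
        else:
            try:
                # Parse inside the try block so malformed JSON becomes a structured error.
                args = json.loads(item.arguments)
                result = tool_registry[tool_name](**args)
            except json.JSONDecodeError as exc:
                result = {"error": f"Invalid JSON arguments for tool '{tool_name}': {exc}"}
            except TypeError as exc:
                result = {"error": f"Invalid arguments for tool '{tool_name}': {exc}"}
            except Exception as exc:
                # For example, a KeyError raised by an unexpected unit value.
                result = {"error": f"Tool '{tool_name}' failed: {exc}"}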

        tool_outputs.append(
            {
                "type": "function_call_output",
                "call_id": item.call_id,
                "output": json.dumps(result)
            }
        )

if not tool_outputs:
    print("No tool was called.")
else:
    follow_up = client.responses.create(
        model="gpt-5.4-mini",
        previous_response_id=response.id,
        input=tool_outputs
    )
    print(follow_up.output_text)

Example output

15 multiplied by 6 is 90.

Exercise task

Try these prompts:

  • What is the weather in London in fahrenheit?
  • What is 100 divided by 4?
  • What's the weather in Berlin?
  • Multiply 11 by 13

Reflection questions

  • How does the model decide which tool to call?
  • What happens if the tool returns an error?
  • Why is it useful to keep tool outputs structured?

9. Hands-On Exercise 3: Add a Tool Dispatcher with Validation

Goal

Create a reusable dispatcher that:

  • Accepts tool calls from the model
  • Validates tool names
  • Parses arguments
  • Executes the correct Python function
  • Returns structured outputs

This is a step toward a more production-like agent loop.


Code example

import json
from typing import Any, Callable
from openai import OpenAI

client = OpenAI()

def get_weather(city: str, unit: str) -> dict:
    """
    Return mock weather data.
    """
    weather_db = {
        "Paris": {"celsius": 20, "fahrenheit": 68},
        "Sydney": {"celsius": 26, "fahrenheit": 79}
    }

    if city not in weather_db:
        return {"error": f"Unknown city: {city}"}

    return {
        "city": city,
        "temperature": weather_db[city][unit],
        "unit": unit
    }

def calculate(a: float, b: float, operator: str) -> dict:
    """
    Return arithmetic results with basic protection.
    """
    if operator == "add":
        return {"result": a + b}
    if operator == "subtract":
        return {"result": a - b}
    if operator == "multiply":
        return {"result": a * b}
    if operator == "divide":
        if b == 0:
            return {"error": "Division by zero."}
        return {"result": a / b}

    return {"error": f"Unsupported operator: {operator}"}

tool_registry: dict[str, Callable[..., dict[str, Any]]] = {
    "get_weather": get_weather,
    "calculate": calculate
}

tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name."},
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit."
                }
            },
            "required": ["city", "unit"],
            "additionalProperties": False
        }
    },
    {
        "type": "function",
        "name": "calculate",
        "description": "Perform arithmetic with two numbers.",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "number", "description": "First number."},
                "b": {"type": "number", "description": "Second number."},
                "operator": {
                    "type": "string",
                    "enum": ["add", "subtract", "multiply", "divide"],
                    "description": "Operation to perform."
                }
            },
            "required": ["a", "b", "operator"],
            "additionalProperties": False
        }
    }
]

def dispatch_tool_call(tool_name: str, arguments_json: str) -> dict:
    """
    Validate and execute a tool call safely.

    Returns a structured result dictionary.
    """
    if tool_name not in tool_registry:
        return {"error": f"Unknown tool: {tool_name}"}

    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return {"error": f"Invalid JSON arguments: {exc}"}

    try:
        result = tool_registry[tool_name](**args)
    except TypeError as exc:
        return {"error": f"Argument mismatch for tool '{tool_name}': {exc}"}
    except Exception as exc:
        return {"error": f"Tool execution failed: {exc}"}

    return result

response = client.responses.create(
    model="gpt-5.4-mini",
    input="What's the weather in Sydney in celsius?",
    tools=tools
)

tool_outputs = []

for item in response.output:
    if item.type == "function_call":
        result = dispatch_tool_call(item.name, item.arguments)
        tool_outputs.append(
            {
                "type": "function_call_output",
                "call_id": item.call_id,
                "output": json.dumps(result)
            }
        )

if tool_outputs:
    final_response = client.responses.create(
        model="gpt-5.4-mini",
        previous_response_id=response.id,
        input=tool_outputs
    )
    print(final_response.output_text)
else:
    print("The model responded without using tools.")

Example output

The current weather in Sydney is 26°C.

Exercise task

Enhance dispatch_tool_call to:

  • Reject unknown extra parameters
  • Log each tool invocation
  • Return consistent error objects such as {"status": "error", "message": "..."}
  • Return successful results as {"status": "ok", "data": ...}
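If you want a starting point, the sketch below shows one possible shape for these enhancements. It is not the only solution. It assumes every function parameter is required (no defaults), uses inspect.signature to discover allowed names, and uses a stand-in calculate tool that raises on failure. The error helper and the logger name are hypothetical.

```python
import inspect
import json
import logging
from typing import Any, Callable

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("tool_dispatch")


def calculate(a: float, b: float, operator: str) -> dict:
    """Tiny stand-in tool so the sketch runs on its own."""
    if operator == "add":
        return {"result": a + b}
    raise ValueError(f"Unsupported operator: {operator}")


tool_registry: dict[str, Callable[..., Any]] = {"calculate": calculate}


def error(message: str) -> dict:
    """Build the uniform error envelope."""
    return {"status": "error", "message": message}


def dispatch_tool_call(tool_name: str, arguments_json: str) -> dict:
    """Validate, log, and execute a tool call, returning a uniform envelope."""
    logger.info("tool call: %s %s", tool_name, arguments_json)
    if tool_name not in tool_registry:
        return error(f"Unknown tool: {tool_name}")
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return error(f"Invalid JSON arguments: {exc}")
    if not isinstance(args, dict):
        return error("Arguments must be a JSON object.")

    func = tool_registry[tool_name]
    allowed = set(inspect.signature(func).parameters)
    extra = sorted(set(args) - allowed)
    if extra:
        return error(f"Unexpected parameters: {extra}")
    missing = sorted(allowed - set(args))
    if missing:
        return error(f"Missing parameters: {missing}")

    try:
        return {"status": "ok", "data": func(**args)}
    except Exception as exc:  # keep failures structured instead of crashing the loop
        return error(f"Tool execution failed: {exc}")


print(dispatch_tool_call("calculate", '{"a": 2, "b": 3, "operator": "add"}'))
print(dispatch_tool_call("calculate", '{"a": 2, "b": 3, "operator": "add", "debug": true}'))
```

Tools that return their own error dictionaries, like the ones in this exercise, would need a small adaptation so that those results map to the error envelope instead of being wrapped as "ok".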

10. Best Practices Checklist

Use this checklist when designing callable functions for agents.

Tool design

  • Keep tools focused on one responsibility
  • Use explicit, descriptive names
  • Add strong descriptions for tool and parameters
  • Constrain values with enums where possible
  • Reject additional unexpected properties

Execution safety

  • Validate all model-generated inputs
  • Never trust tool arguments blindly
  • Handle exceptions gracefully
  • Separate read-only tools from side-effect tools
  • Add authentication/authorization for real actions

Developer experience

  • Return structured JSON-like results
  • Keep outputs predictable
  • Add logging for tool calls
  • Test tools independently from the model
  • Make failure states explicit
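To illustrate testing tools independently from the model, here is a minimal sketch using Python's built-in unittest module. The calculate function is a shortened copy in the style of this session's calculator, included only so the example runs on its own. No API key or network access is needed.

```python
import unittest


def calculate(a: float, b: float, operator: str) -> dict:
    """Shortened calculator tool in the style used in this session."""
    if operator == "add":
        return {"result": a + b}
    if operator == "divide":
        if b == 0:
            return {"error": "Division by zero."}
        return {"result": a / b}
    return {"error": f"Unsupported operator: {operator}"}


class CalculateToolTests(unittest.TestCase):
    def test_add(self):
        self.assertEqual(calculate(2, 3, "add"), {"result": 5})

    def test_divide_by_zero_is_explicit_error(self):
        self.assertIn("error", calculate(1, 0, "divide"))

    def test_unknown_operator_is_explicit_error(self):
        self.assertIn("error", calculate(1, 2, "power"))


# Run the suite programmatically so the script continues after the tests.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CalculateToolTests)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
print("all passed:", outcome.wasSuccessful())
```

Because the tool is a plain Python function, its failure states can be pinned down in tests before any model is involved.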

11. Common Mistakes

Mistake 1: Overloading one tool for many tasks

A single “do everything” tool causes confusion and invalid calls.

Mistake 2: Weak parameter descriptions

The model may guess incorrectly if fields are vague.

Mistake 3: No validation

Even with a schema, your application must validate inputs.

Mistake 4: Unstructured tool outputs

If the tool returns inconsistent data, the model has a harder time producing reliable final answers.

Mistake 5: Mixing user-facing formatting with business logic

Prefer returning structured data from the tool and letting the model present the final answer.


12. Mini Challenge

Design a new tool for a customer support agent.

Requirements

The tool should:

  • Create a support ticket
  • Require a customer ID
  • Require a short issue summary
  • Allow only these priorities: low, medium, high

Your task

  1. Write the tool schema
  2. Implement the Python function
  3. Validate inputs
  4. Test it with a prompt like:

“Create a high-priority support ticket for customer CUST-1024 because the payment page crashes.”

Sample solution

support_tool = {
    "type": "function",
    "name": "create_support_ticket",
    "description": "Create a support ticket for a customer issue.",
    "parameters": {
        "type": "object",
        "properties": {
            "customer_id": {
                "type": "string",
                "description": "The unique customer identifier, for example CUST-1024."
            },
            "issue_summary": {
                "type": "string",
                "description": "A short summary of the customer issue."
            },
            "priority": {
                "type": "string",
                "enum": ["low", "medium", "high"],
                "description": "The priority level of the support ticket."
            }
        },
        "required": ["customer_id", "issue_summary", "priority"],
        "additionalProperties": False
    }
}

def create_support_ticket(customer_id: str, issue_summary: str, priority: str) -> dict:
    """
    Mock support ticket creation.
    """
    if not customer_id.startswith("CUST-"):
        return {"error": "customer_id must start with 'CUST-'"}

    if not issue_summary.strip():
        return {"error": "issue_summary must not be empty."}

    if priority not in {"low", "medium", "high"}:
        return {"error": "Invalid priority."}

    return {
        "ticket_id": "TICK-9001",
        "customer_id": customer_id,
        "issue_summary": issue_summary,
        "priority": priority,
        "status": "created"
    }

Example output

Support ticket TICK-9001 was created for customer CUST-1024 with high priority.

13. Recap

In this session, you learned that callable functions are essential for agentic systems because they let models interact with deterministic logic and external systems in a controlled way.

You practiced:

  • Defining tool schemas with JSON Schema
  • Designing clear function interfaces
  • Executing tool calls using the OpenAI Responses API
  • Validating model-generated arguments
  • Building a simple multi-tool dispatcher

These skills are foundational for building robust GenAI agents.


Useful Resources

  • OpenAI Responses API migration guide: https://developers.openai.com/api/docs/guides/migrate-to-responses
  • OpenAI API docs: https://platform.openai.com/docs
  • OpenAI Python SDK: https://github.com/openai/openai-python
  • JSON Schema reference: https://json-schema.org/understanding-json-schema/
  • Python json module docs: https://docs.python.org/3/library/json.html

Suggested Homework

  1. Add a new tool called convert_temperature(value, from_unit, to_unit)
  2. Build a dispatcher that supports three or more tools
  3. Add structured logging for all tool calls
  4. Write unit tests for each tool function without calling the model
  5. Refactor one tool so that its schema is clearer and more constrained

Quick Knowledge Check

  1. Why should tools be narrowly scoped?
  2. Why is weight_kg better than weight?
  3. What does additionalProperties: False help prevent?
  4. Why should tool outputs be structured?
  5. Why must application code validate arguments even if the schema exists?

End of Session

In the next session, learners can build on this foundation by creating more capable agent loops, handling multi-step workflows, and introducing memory or state into agent behavior.

