
Session 3: Implementing Function Calling Workflows

Synopsis

Shows how to connect model decisions to actual Python functions and external APIs. Learners build action loops in which the model selects tools, receives results, and continues toward a goal.

Session Content


Session Overview

In this session, you will learn how to let an LLM decide when to call Python functions, how to define tools safely, and how to build a complete function-calling loop using the OpenAI Responses API with gpt-5.4-mini.

By the end of this session, you will be able to:

  • Explain what function calling is and when to use it
  • Define tool schemas for Python functions
  • Build a function-calling workflow with the Responses API
  • Execute model-requested tools and return results back to the model
  • Add validation, error handling, and safety checks
  • Implement a multi-step tool workflow in Python

Agenda (~45 minutes)

  1. Why function calling matters — 5 min
  2. Core concepts: tools, schemas, and execution loops — 10 min
  3. Hands-on Exercise 1: Single-tool workflow — 10 min
  4. Hands-on Exercise 2: Multi-tool workflow — 12 min
  5. Safety, validation, and design best practices — 5 min
  6. Wrap-up and next steps — 3 min

1. Why Function Calling Matters

LLMs are excellent at understanding language and planning steps, but they do not inherently have access to your Python code, databases, APIs, or business systems unless you explicitly connect them.

Function calling solves this by allowing the model to:

  • Recognize when external information or computation is needed
  • Select an appropriate tool
  • Provide structured arguments for that tool
  • Incorporate the tool result into its final answer

Common Use Cases

  • Looking up weather, stock, or inventory data
  • Querying internal systems
  • Performing calculations
  • Triggering workflows such as booking, sending, or scheduling
  • Combining multiple tools in sequence

Typical Pattern

A function-calling workflow usually looks like this:

  1. User sends a request
  2. Model decides whether a tool is needed
  3. Model emits a tool call with structured arguments
  4. Your Python code executes the tool
  5. Tool result is sent back to the model
  6. Model generates the final answer

This pattern is the foundation of many agentic systems.


2. Core Concepts: Tools, Schemas, and Execution Loops

2.1 What Is a Tool?

A tool is a function your application exposes to the model. The model does not execute Python directly. Instead, it requests a tool call, and your application runs the function.

Examples:

  • get_weather(city)
  • convert_currency(amount, from_currency, to_currency)
  • search_products(query)
  • create_support_ticket(subject, priority)

2.2 Tool Schema

A tool schema describes:

  • Tool name
  • Tool purpose
  • Input parameters
  • Required fields
  • Expected types

The model uses this schema to generate valid arguments.

Example schema:

{
  "type": "function",
  "name": "get_weather",
  "description": "Get the current weather for a city.",
  "parameters": {
    "type": "object",
    "properties": {
      "city": {
        "type": "string",
        "description": "The city name, for example London"
      }
    },
    "required": ["city"],
    "additionalProperties": false
  }
}
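
If you want to enforce a schema like this yourself before executing a tool, a minimal manual check might look like the following. This is a sketch covering only `required` and `additionalProperties`; production code could use a full JSON Schema validator instead:

```python
def check_args(schema: dict, arguments: dict) -> list:
    """Return a list of problems; an empty list means the arguments pass."""
    problems = []
    params = schema["parameters"]
    # Every required field must be present
    for key in params.get("required", []):
        if key not in arguments:
            problems.append(f"missing required field: {key}")
    # With additionalProperties false, reject unknown fields
    if not params.get("additionalProperties", True):
        for key in arguments:
            if key not in params["properties"]:
                problems.append(f"unexpected field: {key}")
    return problems


schema = {
    "type": "function",
    "name": "get_weather",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
        "additionalProperties": False,
    },
}

print(check_args(schema, {"city": "London"}))  # []
print(check_args(schema, {"town": "London"}))
# ['missing required field: city', 'unexpected field: town']
```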

2.3 Execution Loop

A function-calling app typically needs a loop:

  • Send user input and tool definitions to the model
  • Check whether the model returned tool calls
  • Execute each tool
  • Return tool outputs to the model
  • Repeat until the model returns a final answer

This loop is one of the most important implementation patterns in agentic development.
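
The loop can be sketched independently of any particular SDK. The version below substitutes a stubbed `call_model` function for a real API call so it runs offline; the exercises later in this session replace the stub with actual Responses API calls:

```python
import json


def get_time(_args: dict) -> dict:
    # Stand-in tool; a real app would expose real functions here.
    return {"time": "12:00"}


TOOLS = {"get_time": get_time}


def call_model(messages: list) -> dict:
    # Stub standing in for an LLM call: it requests the tool once,
    # then produces a final answer after seeing the tool output.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_calls": [{"name": "get_time", "arguments": "{}"}]}
    return {"final": "It is 12:00."}


def run_loop(user_input: str) -> str:
    messages = [{"role": "user", "content": user_input}]
    while True:
        reply = call_model(messages)
        calls = reply.get("tool_calls", [])
        if not calls:  # no tool calls means the model is done
            return reply["final"]
        for call in calls:  # execute each requested tool
            result = TOOLS[call["name"]](json.loads(call["arguments"]))
            messages.append({"role": "tool", "content": json.dumps(result)})


print(run_loop("What time is it?"))  # It is 12:00.
```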


2.4 Best Practices for Tool Design

Good tools are:

  • Focused: Each tool does one thing well
  • Explicit: Clear names and descriptions
  • Structured: Strong parameter definitions
  • Safe: Input validation before execution
  • Observable: Log tool calls and results
  • Deterministic when possible: Easier to test and debug

3. Hands-on Exercise 1: Single-Tool Workflow

Goal

Build a small assistant that can answer weather questions by calling a Python function.

What You Will Learn

  • How to define a tool
  • How to send tool definitions to the model
  • How to detect tool calls in the Responses API output
  • How to execute the tool and continue the response

Step 1: Install Dependencies

pip install openai python-dotenv

Step 2: Set Up Environment Variables

Create a .env file:

OPENAI_API_KEY=your_api_key_here

Step 3: Single-Tool Example

"""
Session 3 - Exercise 1
Single-tool function calling workflow using the OpenAI Responses API.

This example demonstrates:
1. Defining a tool schema
2. Sending it to the model
3. Detecting tool calls
4. Executing the Python function
5. Returning the tool result to the model for a final answer
"""

import json
from dotenv import load_dotenv
from openai import OpenAI

# Load environment variables from .env
load_dotenv()

# Create the OpenAI client
client = OpenAI()


def get_weather(city: str) -> dict:
    """
    Simulated weather lookup function.

    In a real application, this would call an external weather API.
    We return mock data here so the example is easy to run.
    """
    weather_db = {
        "london": {"city": "London", "temperature_c": 14, "condition": "Cloudy"},
        "paris": {"city": "Paris", "temperature_c": 18, "condition": "Sunny"},
        "tokyo": {"city": "Tokyo", "temperature_c": 22, "condition": "Rainy"},
    }

    return weather_db.get(
        city.strip().lower(),
        {
            "city": city,
            "temperature_c": "unknown",
            "condition": "No data available",
        },
    )


# Tool definition passed to the model
tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "Name of the city to get weather for."
                }
            },
            "required": ["city"],
            "additionalProperties": False
        }
    }
]

# A user question that likely requires a tool call
user_question = "What's the weather like in Paris today?"

# First model call: allow the model to decide whether to use a tool
response = client.responses.create(
    model="gpt-5.4-mini",
    input=user_question,
    tools=tools
)

# Print raw response output items for learning/debugging purposes
print("=== First response output items ===")
for item in response.output:
    print(item)

# Collect tool call outputs to send back to the model
tool_outputs = []

for item in response.output:
    # We are interested in function tool calls emitted by the model
    if item.type == "function_call":
        tool_name = item.name
        arguments = json.loads(item.arguments)

        print("\n=== Tool call requested by the model ===")
        print(f"Tool name: {tool_name}")
        print(f"Arguments: {arguments}")

        if tool_name == "get_weather":
            result = get_weather(arguments["city"])

            print("\n=== Tool execution result ===")
            print(result)

            # Send the tool result back in the required structured format
            tool_outputs.append(
                {
                    "type": "function_call_output",
                    "call_id": item.call_id,
                    "output": json.dumps(result)
                }
            )

# If the model requested a tool, continue the conversation with tool outputs
if tool_outputs:
    final_response = client.responses.create(
        model="gpt-5.4-mini",
        previous_response_id=response.id,
        input=tool_outputs
    )

    print("\n=== Final assistant answer ===")
    print(final_response.output_text)
else:
    # If no tool call happened, print the model's direct response
    print("\n=== Assistant answer (no tool needed) ===")
    print(response.output_text)

Example Output

=== First response output items ===
ResponseFunctionToolCall(arguments='{"city":"Paris"}', call_id='call_abc123', name='get_weather', type='function_call', id='fc_123')

=== Tool call requested by the model ===
Tool name: get_weather
Arguments: {'city': 'Paris'}

=== Tool execution result ===
{'city': 'Paris', 'temperature_c': 18, 'condition': 'Sunny'}

=== Final assistant answer ===
The weather in Paris is currently sunny and 18°C.

Exercise Tasks

  1. Change the user question to ask about London and rerun the script.
  2. Add two more cities to weather_db.
  3. Ask a question that does not need a tool, such as "What is a weather forecast?"
  4. Observe whether the model chooses to call the tool or answer directly.

Key Learning Points

  • The model chooses whether to call the tool
  • Tool calls are returned as structured output items
  • Your application executes the tool, not the model
  • The previous_response_id lets you continue the workflow cleanly

4. Hands-on Exercise 2: Multi-Tool Workflow

Goal

Build an assistant that can both look up product prices and calculate discounted totals.

What You Will Learn

  • How to define multiple tools
  • How the model chooses between tools
  • How to support multi-step tool usage
  • How to build a reusable tool execution loop

Scenario

The user asks:

"What is the discounted price of a laptop if it costs 1200 dollars and there is a 15% discount?"

The model may decide to:

  1. Use a price or product lookup tool
  2. Use a calculation tool
  3. Return the final answer

Multi-Tool Example

"""
Session 3 - Exercise 2
Multi-tool function calling workflow with a reusable execution loop.

This example demonstrates:
1. Multiple tool definitions
2. Repeated tool execution until the model is done
3. Validation and safe tool dispatch
"""

import json
from typing import Any, Dict
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
client = OpenAI()


def get_product_price(product_name: str) -> dict:
    """
    Return a mock product price.
    In a real system, this might query a database or product API.
    """
    price_db = {
        "laptop": {"product_name": "laptop", "price": 1200, "currency": "USD"},
        "mouse": {"product_name": "mouse", "price": 25, "currency": "USD"},
        "keyboard": {"product_name": "keyboard", "price": 75, "currency": "USD"},
    }
    return price_db.get(
        product_name.strip().lower(),
        {"product_name": product_name, "price": None, "currency": "USD"}
    )


def calculate_discount(price: float, discount_percent: float) -> dict:
    """
    Calculate the discounted price from an original price and percentage.
    """
    discount_amount = price * (discount_percent / 100)
    final_price = price - discount_amount
    return {
        "original_price": price,
        "discount_percent": discount_percent,
        "discount_amount": round(discount_amount, 2),
        "final_price": round(final_price, 2),
    }


tools = [
    {
        "type": "function",
        "name": "get_product_price",
        "description": "Look up the current price of a product.",
        "parameters": {
            "type": "object",
            "properties": {
                "product_name": {
                    "type": "string",
                    "description": "The product name, such as laptop or mouse."
                }
            },
            "required": ["product_name"],
            "additionalProperties": False
        }
    },
    {
        "type": "function",
        "name": "calculate_discount",
        "description": "Calculate the final price after applying a percentage discount.",
        "parameters": {
            "type": "object",
            "properties": {
                "price": {
                    "type": "number",
                    "description": "The original price before discount."
                },
                "discount_percent": {
                    "type": "number",
                    "description": "The discount percentage, such as 15 for 15%."
                }
            },
            "required": ["price", "discount_percent"],
            "additionalProperties": False
        }
    }
]


def execute_tool(tool_name: str, arguments: Dict[str, Any]) -> dict:
    """
    Safely dispatch a tool call to the correct Python function.
    Raises a ValueError for unknown tools.
    """
    if tool_name == "get_product_price":
        return get_product_price(arguments["product_name"])
    elif tool_name == "calculate_discount":
        return calculate_discount(arguments["price"], arguments["discount_percent"])
    else:
        raise ValueError(f"Unknown tool: {tool_name}")


def run_agentic_tool_loop(user_input: str) -> str:
    """
    Run a complete tool-calling loop until the model returns a final answer.
    """
    response = client.responses.create(
        model="gpt-5.4-mini",
        input=user_input,
        tools=tools
    )

    while True:
        tool_outputs = []

        print("\n=== Model output items ===")
        for item in response.output:
            print(item)

        # Find and execute any tool calls
        for item in response.output:
            if item.type == "function_call":
                tool_name = item.name
                arguments = json.loads(item.arguments)

                print("\nTool requested:")
                print(f"- Name: {tool_name}")
                print(f"- Arguments: {arguments}")

                try:
                    result = execute_tool(tool_name, arguments)
                except Exception as exc:
                    result = {"error": str(exc)}

                print("Tool result:")
                print(result)

                tool_outputs.append(
                    {
                        "type": "function_call_output",
                        "call_id": item.call_id,
                        "output": json.dumps(result)
                    }
                )

        # If no tool calls were made, the model is done
        if not tool_outputs:
            return response.output_text

        # Continue the response with the tool outputs
        response = client.responses.create(
            model="gpt-5.4-mini",
            previous_response_id=response.id,
            input=tool_outputs
        )


if __name__ == "__main__":
    user_prompt = "What is the discounted price of a laptop if there is a 15% discount?"
    final_answer = run_agentic_tool_loop(user_prompt)

    print("\n=== Final answer ===")
    print(final_answer)

Example Output

=== Model output items ===
ResponseFunctionToolCall(arguments='{"product_name":"laptop"}', call_id='call_1', name='get_product_price', type='function_call', id='fc_1')

Tool requested:
- Name: get_product_price
- Arguments: {'product_name': 'laptop'}
Tool result:
{'product_name': 'laptop', 'price': 1200, 'currency': 'USD'}

=== Model output items ===
ResponseFunctionToolCall(arguments='{"price":1200,"discount_percent":15}', call_id='call_2', name='calculate_discount', type='function_call', id='fc_2')

Tool requested:
- Name: calculate_discount
- Arguments: {'price': 1200, 'discount_percent': 15}
Tool result:
{'original_price': 1200, 'discount_percent': 15, 'discount_amount': 180.0, 'final_price': 1020.0}

=== Final answer ===
The laptop costs $1200, and after a 15% discount, the final price is $1020.

Exercise Tasks

  1. Modify the prompt to ask about a mouse with a 20% discount.
  2. Add a new product to the price database.
  3. Ask for a product that does not exist and inspect the behavior.
  4. Extend the code with a new tool named calculate_tax(price, tax_percent).

5. Safety, Validation, and Design Best Practices

Function calling is powerful, but it must be implemented carefully.

5.1 Validate Arguments

Even with a well-defined schema, always validate arguments in Python before executing the tool.

Examples:

  • Check that required keys exist
  • Check that numbers are in valid ranges
  • Reject unknown values
  • Sanitize strings before using them in external systems

Example validation helper:

def validate_discount_args(arguments: dict) -> None:
    """
    Validate inputs before calling calculate_discount.
    """
    if "price" not in arguments or "discount_percent" not in arguments:
        raise ValueError("Missing required arguments.")

    if arguments["price"] < 0:
        raise ValueError("Price cannot be negative.")

    if not (0 <= arguments["discount_percent"] <= 100):
        raise ValueError("Discount percent must be between 0 and 100.")
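
One way to wire a helper like this into your dispatcher is to convert validation failures into structured error results, so a bad argument becomes feedback for the model rather than a crash. This is a sketch; `safe_calculate_discount` is an illustrative name:

```python
def validate_discount_args(arguments: dict) -> None:
    """Validate inputs before calling calculate_discount."""
    if "price" not in arguments or "discount_percent" not in arguments:
        raise ValueError("Missing required arguments.")
    if arguments["price"] < 0:
        raise ValueError("Price cannot be negative.")
    if not (0 <= arguments["discount_percent"] <= 100):
        raise ValueError("Discount percent must be between 0 and 100.")


def safe_calculate_discount(arguments: dict) -> dict:
    # Validate first, then compute; errors become structured results
    # the model can read instead of exceptions that break the loop.
    try:
        validate_discount_args(arguments)
    except ValueError as exc:
        return {"status": "error", "message": str(exc)}
    price = arguments["price"]
    pct = arguments["discount_percent"]
    return {"status": "success", "final_price": round(price * (1 - pct / 100), 2)}


print(safe_calculate_discount({"price": 1200, "discount_percent": 15}))
# {'status': 'success', 'final_price': 1020.0}
print(safe_calculate_discount({"price": -5, "discount_percent": 15}))
# {'status': 'error', 'message': 'Price cannot be negative.'}
```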

5.2 Keep Tool Descriptions Clear

Bad description:

  • "Gets stuff"

Good description:

  • "Look up the current price of a product by product name."

The model performs better when tool names and descriptions are specific.


5.3 Avoid Dangerous Direct Actions

Be careful with tools that:

  • Send emails
  • Delete records
  • Transfer money
  • Execute shell commands
  • Update databases

For sensitive tools, add:

  • Confirmation steps
  • Authorization checks
  • Audit logging
  • Rate limiting
  • Human approval when needed
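
A confirmation step can be as simple as a gate in the dispatcher. The sketch below is illustrative (the tool names and `dispatch` function are hypothetical); injecting the `approve` callable makes the gate testable without a human at the keyboard:

```python
SENSITIVE_TOOLS = {"send_email", "delete_record", "transfer_money"}


def dispatch(tool_name: str, arguments: dict, approve=input) -> dict:
    # Sensitive tools require explicit confirmation before execution;
    # everything else runs directly.
    if tool_name in SENSITIVE_TOOLS:
        answer = approve(f"Allow {tool_name} with {arguments}? [y/N] ")
        if answer.strip().lower() != "y":
            return {"status": "denied", "tool": tool_name}
    return {"status": "executed", "tool": tool_name}  # real execution goes here


# Usage: inject a non-interactive approver instead of input().
print(dispatch("transfer_money", {"amount": 100}, approve=lambda _: "n"))
# {'status': 'denied', 'tool': 'transfer_money'}
```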

5.4 Return Structured Results

Tool outputs should be easy for the model to interpret.

Better:

{"status": "success", "price": 1200, "currency": "USD"}

Worse:

"Yep, it looks like it costs around twelve hundred bucks or something."

5.5 Log Every Tool Call

Track:

  • Which tool was called
  • Arguments used
  • Result returned
  • Errors raised
  • Time taken

This makes debugging and evaluation much easier.
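
A small decorator can capture all five of these with Python's standard logging module. This is a sketch that assumes tools take a single arguments dict; adapt the format to your own observability stack:

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("tool_calls")


def logged(tool_fn):
    # Records tool name, arguments, result or error, and elapsed time
    # for every call, whether the tool succeeds or raises.
    def wrapper(arguments: dict) -> dict:
        start = time.perf_counter()
        try:
            result = tool_fn(arguments)
            return result
        except Exception as exc:
            result = {"error": str(exc)}
            raise
        finally:
            logger.info(
                "tool=%s args=%s result=%s elapsed_ms=%.1f",
                tool_fn.__name__, json.dumps(arguments),
                json.dumps(result), (time.perf_counter() - start) * 1000,
            )
    return wrapper


@logged
def get_weather(arguments: dict) -> dict:
    return {"city": arguments["city"], "condition": "Sunny"}


print(get_weather({"city": "Paris"}))
# {'city': 'Paris', 'condition': 'Sunny'}
```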


6. Mini Challenge

Build a tool-calling assistant for a simple travel planner.

Requirements

Create these tools:

  • get_flight_price(origin, destination)
  • get_hotel_price(city, nights)
  • calculate_trip_total(flight_price, hotel_price)

User Prompt Example

What would be the total cost for flying from New York to Boston and staying in a hotel in Boston for 3 nights?

Suggested Approach

  1. Define each Python function
  2. Create tool schemas
  3. Build a reusable loop like in Exercise 2
  4. Return a final natural-language answer

7. Common Pitfalls

Pitfall 1: Assuming the Model Executes Code

It does not. Your application must detect and execute tool calls.

Pitfall 2: Skipping Validation

Never trust tool arguments blindly, even when the schema looks correct.

Pitfall 3: Making Tools Too Broad

Avoid “mega-tools” that do many unrelated things.

Pitfall 4: Not Handling Unknown Tools

Always guard your dispatcher with explicit tool names and fallback errors.

Pitfall 5: Forgetting Multi-Step Loops

Some requests require more than one tool call. A single-step implementation may fail.


8. Wrap-Up

In this session, you learned how to:

  • Define function tools for LLMs
  • Use the OpenAI Responses API for tool calling
  • Execute tool calls from Python
  • Continue a response with function_call_output
  • Build reusable tool-calling loops
  • Apply validation and safety best practices

Function calling is a core building block for agentic systems because it enables models to interact with real application logic and external systems in a structured way.


Useful Resources

  • OpenAI Responses API guide: https://developers.openai.com/api/docs/guides/migrate-to-responses
  • OpenAI API docs: https://platform.openai.com/docs
  • OpenAI Python SDK: https://github.com/openai/openai-python
  • JSON Schema reference: https://json-schema.org/learn/getting-started-step-by-step
  • Python json module: https://docs.python.org/3/library/json.html
  • Python dotenv: https://pypi.org/project/python-dotenv/

Suggested Homework

  1. Add validation logic to Exercise 2.
  2. Extend the multi-tool example with a tax calculator.
  3. Build a small CLI assistant that supports product lookups, discounts, and tax calculations.
  4. Log all tool calls to a file for debugging.
  5. Try prompts that require one tool, multiple tools, and no tool at all.

Quick Recap

  • Function calling lets models request structured tool use
  • Tool schemas guide the model
  • Python executes the tool, not the model
  • Multi-step workflows need an execution loop
  • Safety and validation are essential for production-quality systems
