
Session 3: Designing Structured Outputs

Synopsis

Shows how to request predictable responses such as JSON, tabular data, or schema-aligned outputs. Learners begin designing machine-readable LLM responses suitable for downstream automation and agent workflows.

Session Content


Session Overview

In this session, learners will move from “free-form text generation” to reliable, machine-usable outputs. This is a foundational skill for building production-grade GenAI systems and agentic workflows. Instead of asking a model to “just answer,” we will design prompts and response formats so the model returns data that can be parsed, validated, and used directly in Python applications.

By the end of this session, learners will be able to:

  • Explain why structured outputs matter in real applications
  • Design prompts that encourage predictable JSON responses
  • Use the OpenAI Responses API with gpt-5.4-mini to generate structured data
  • Validate and parse model outputs safely in Python
  • Build small workflows that depend on structured model responses
  • Recognize common failure modes and improve prompt reliability

Learning Objectives

After this session, you should be able to:

  1. Define structured outputs and explain their value in software systems
  2. Compare unstructured natural language responses vs. schema-like JSON responses
  3. Write prompts that specify output shape, constraints, and field expectations
  4. Parse model-generated JSON safely in Python
  5. Validate structured outputs with lightweight Python checks
  6. Design robust fallback strategies when model output is malformed or incomplete

Session Agenda (~45 minutes)

  • 0–5 min: Introduction to structured outputs
  • 5–15 min: Theory: why structure matters and common design patterns
  • 15–25 min: Prompt design for structured JSON outputs
  • 25–35 min: Hands-on Exercise 1: Generate structured product summaries
  • 35–42 min: Hands-on Exercise 2: Build a simple support ticket classifier
  • 42–45 min: Wrap-up and recap

1. Why Structured Outputs Matter

Large language models are very good at generating human-readable text. But software systems often need responses in a format that Python code can consume automatically.

Example: Unstructured output

If you ask:

“Summarize this customer review and tell me the sentiment.”

You might get:

“The customer liked the battery life but was disappointed by the slow charging speed. Overall sentiment is mixed.”

This is readable, but hard to process programmatically.

Example: Structured output

A better response for an application might be:

{
  "summary": "Customer liked battery life but disliked slow charging.",
  "sentiment": "mixed",
  "priority": "medium"
}

This can be:

  • parsed into a Python dictionary
  • stored in a database
  • used in a dashboard
  • fed into another system or agent
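As a quick illustration, the structured response above loads straight into a Python dictionary with the standard library:

```python
import json

# The model's structured reply, as a raw JSON string.
raw = (
    '{"summary": "Customer liked battery life but disliked slow charging.",'
    ' "sentiment": "mixed", "priority": "medium"}'
)

data = json.loads(raw)  # parse the JSON text into a dict
print(data["sentiment"])  # -> mixed
print(data["priority"])   # -> medium
```

From here the fields can be stored, displayed, or passed to the next step of a workflow without any text scraping.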

Benefits of structured outputs

Structured outputs enable:

  • Reliability: predictable fields and formats
  • Automation: direct use in downstream code
  • Validation: check presence and types of fields
  • Interoperability: integrate with APIs, UIs, and workflows
  • Agentic orchestration: one model step can produce inputs for the next

2. Common Structured Output Patterns

There are several useful output patterns when working with LLMs.

2.1 Flat JSON objects

Best for small tasks like classification, extraction, scoring, and summaries.

Example:

{
  "category": "billing",
  "urgency": "high",
  "requires_human": true
}

2.2 Lists of objects

Useful when extracting multiple items from text.

Example:

{
  "action_items": [
    {
      "task": "Email client",
      "owner": "Ava",
      "deadline": "2026-03-25"
    },
    {
      "task": "Prepare budget draft",
      "owner": "Ravi",
      "deadline": "2026-03-27"
    }
  ]
}
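Once parsed, a list of objects like this is just a Python list of dictionaries, so each extracted item can be processed in a loop:

```python
import json

# A model reply containing a list of extracted action items.
raw = """{
  "action_items": [
    {"task": "Email client", "owner": "Ava", "deadline": "2026-03-25"},
    {"task": "Prepare budget draft", "owner": "Ravi", "deadline": "2026-03-27"}
  ]
}"""

data = json.loads(raw)
for item in data["action_items"]:
    # Each item is a dict with the fields defined in the schema.
    print(f'{item["owner"]}: {item["task"]} (due {item["deadline"]})')
```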

2.3 Nested JSON

Useful for richer responses, but should be used carefully. The more complex the output, the more likely formatting issues become.

Example:

{
  "document": {
    "title": "Q1 Planning Notes",
    "summary": "Budget and hiring priorities were discussed."
  },
  "risks": [
    "Hiring delays",
    "Vendor cost increases"
  ]
}

2.4 Enumerated values

If a field should come from a limited set, explicitly state allowed values.

Example:

  • sentiment: positive, negative, mixed, neutral
  • urgency: low, medium, high

This improves consistency and simplifies validation.


3. Principles for Designing Good Structured Prompts

When asking for structured outputs, vague prompts produce vague data. Good prompts clearly define:

  • the expected format
  • the required fields
  • field meanings
  • allowed values
  • rules for missing information

3.1 Be explicit about the format

Bad:

“Return the result in a structured way.”

Better:

“Return valid JSON with exactly these keys: summary, sentiment, confidence.”

3.2 Define constraints

Example:

  • sentiment must be one of: positive, negative, mixed, neutral
  • confidence must be a float from 0 to 1
  • summary must be at most 30 words

3.3 Tell the model what to do when information is missing

Example:

“If the text does not contain enough information, use null for missing values.”

This is much better than allowing hallucinated values.

3.4 Avoid unnecessary complexity

Start with the smallest useful schema. Overly complex nested structures are harder to generate and validate.

3.5 Ask for JSON only

Example:

“Return only valid JSON. Do not include markdown, explanation, or surrounding text.”

This reduces parsing problems.
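Even with this instruction, a model will occasionally wrap its answer in a markdown code fence. A small defensive helper (a sketch of our own, not part of any SDK) can strip an optional fence before parsing:

```python
import json

def parse_json_reply(text: str) -> dict:
    """Strip an optional markdown code fence, then parse the JSON body."""
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence line (which may carry a "json" tag)
        # and everything from the closing fence onward.
        cleaned = cleaned.split("\n", 1)[1]
        cleaned = cleaned.rsplit("```", 1)[0]
    return json.loads(cleaned)

print(parse_json_reply('```json\n{"sentiment": "mixed"}\n```'))
print(parse_json_reply('{"sentiment": "positive"}'))
```

Both fenced and plain replies now parse the same way, which makes the downstream code simpler.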


4. Example Prompt Template for Structured Outputs

A reusable template:

You are a data extraction assistant.

Analyze the input text and return only valid JSON.

Required schema:
{
  "field_1": "string",
  "field_2": "string",
  "field_3": "number | null"
}

Rules:
- Return only valid JSON
- Do not include markdown or commentary
- If a value is unknown, use null
- field_2 must be one of: "low", "medium", "high"
- field_1 should be concise and under 20 words

This kind of prompt works well for many extraction and classification tasks.


5. Using the OpenAI Responses API for Structured Output

We will use:

  • Python
  • OpenAI Python SDK
  • gpt-5.4-mini
  • Responses API

5.1 Install dependencies

pip install openai

5.2 Set your API key

export OPENAI_API_KEY="your_api_key_here"

On Windows PowerShell:

$env:OPENAI_API_KEY="your_api_key_here"

6. Hands-on Exercise 1: Product Review Summarizer with JSON Output

Goal

Convert free-form customer reviews into structured JSON containing:

  • short summary
  • sentiment
  • confidence
  • key issues list

What you will learn

  • How to write a structured-output prompt
  • How to call the Responses API
  • How to parse JSON safely
  • How to validate basic fields

Step 1: Create the Python script

import json
import os
from openai import OpenAI

# Initialize the OpenAI client using the API key from the environment.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Example product review text to analyze.
review_text = """
I really like how long the battery lasts on this headset.
The sound quality is clear and comfortable for long meetings.
However, charging is much slower than I expected, and the case feels a bit flimsy.
"""

# Prompt instructing the model to return structured JSON only.
prompt = f"""
You are a product review analysis assistant.

Analyze the following customer review and return only valid JSON.

Required schema:
{{
  "summary": "string",
  "sentiment": "positive | negative | mixed | neutral",
  "confidence": "number between 0 and 1",
  "key_issues": ["string"]
}}

Rules:
- Return only valid JSON
- Do not include markdown or explanation
- summary must be at most 25 words
- sentiment must be one of: positive, negative, mixed, neutral
- confidence must be a number between 0 and 1
- key_issues should be an array of short strings
- If there are no issues, return an empty array

Customer review:
\"\"\"
{review_text}
\"\"\"
"""

# Send the request using the Responses API.
response = client.responses.create(
    model="gpt-5.4-mini",
    input=prompt
)

# The output_text property provides the model's text response.
raw_output = response.output_text

print("Raw model output:")
print(raw_output)

# Parse the JSON response safely.
try:
    parsed = json.loads(raw_output)
except json.JSONDecodeError as exc:
    print("\nFailed to parse JSON:", exc)
    raise

# Lightweight validation checks to ensure expected structure.
required_keys = {"summary", "sentiment", "confidence", "key_issues"}
missing_keys = required_keys - parsed.keys()

if missing_keys:
    raise ValueError(f"Missing required keys: {missing_keys}")

if parsed["sentiment"] not in {"positive", "negative", "mixed", "neutral"}:
    raise ValueError("Invalid sentiment value")

if not isinstance(parsed["confidence"], (int, float)):
    raise ValueError("Confidence must be numeric")

if not (0 <= parsed["confidence"] <= 1):
    raise ValueError("Confidence must be between 0 and 1")

if not isinstance(parsed["key_issues"], list):
    raise ValueError("key_issues must be a list")

print("\nValidated structured output:")
print(json.dumps(parsed, indent=2))

Example output

{
  "summary": "Customer praised battery and comfort but criticized slow charging and case quality.",
  "sentiment": "mixed",
  "confidence": 0.94,
  "key_issues": [
    "slow charging",
    "flimsy case"
  ]
}

Discussion

Notice how this output is much more useful than plain prose:

  • summary can be shown in a UI
  • sentiment can power analytics
  • confidence can support filtering or escalation
  • key_issues can be aggregated across many reviews

Mini Challenge

Modify the script to process three reviews in a loop and collect all parsed results into a Python list.

Suggested review samples:

reviews = [
    "The laptop is fast and lightweight, but the fan gets loud under load.",
    "Excellent camera quality and battery. Very happy with this phone.",
    "The app crashes often and syncing is unreliable. Frustrating experience."
]

7. Designing Safer Parsers and Validators

Even with a strong prompt, model output may occasionally be malformed or incomplete. In production, always validate.

  1. Parse with json.loads
  2. Check required keys
  3. Check data types
  4. Check allowed enum values
  5. Check length or range constraints
  6. Handle failure gracefully

Common fallback strategies

  • retry with a stricter prompt
  • log invalid output for review
  • return a safe default
  • ask the model to fix invalid JSON in a second pass
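The validation steps and the "safe default" fallback can be combined into one small helper. This is a sketch: the field names and the default values are illustrative, not a fixed API.

```python
import json

# A safe default to return when the model output cannot be trusted.
DEFAULT = {"category": "other", "urgency": "low", "requires_human": True}

def classify_safely(raw_output: str) -> dict:
    """Parse and validate model output; fall back to a safe default."""
    try:
        parsed = json.loads(raw_output)               # step 1: parse
    except json.JSONDecodeError:
        return dict(DEFAULT)                          # step 6: graceful failure
    if not {"category", "urgency", "requires_human"} <= parsed.keys():
        return dict(DEFAULT)                          # step 2: required keys
    if parsed["urgency"] not in {"low", "medium", "high"}:
        return dict(DEFAULT)                          # step 4: allowed enum values
    return parsed

print(classify_safely("not json at all"))  # falls back to the safe default
```

In a real system you might also log the invalid output before returning the default, so prompt problems can be reviewed later.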

8. Hands-on Exercise 2: Support Ticket Classifier

Goal

Build a small classifier that converts incoming support text into structured fields:

  • category
  • urgency
  • requires_human
  • short_reason

This is a common real-world GenAI pattern.


Step 1: Create the classifier script

import json
import os
from openai import OpenAI

# Initialize the client.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

# Example support ticket submitted by a user.
ticket_text = """
I was charged twice for my subscription this month.
I already contacted support yesterday and have not received a refund.
Please resolve this as soon as possible.
"""

# Prompt specifying an exact schema and constraints.
prompt = f"""
You are a support ticket triage assistant.

Read the ticket and return only valid JSON.

Required schema:
{{
  "category": "billing | technical | account | shipping | other",
  "urgency": "low | medium | high",
  "requires_human": "boolean",
  "short_reason": "string"
}}

Rules:
- Return only valid JSON
- Do not include markdown or explanation
- category must be one of: billing, technical, account, shipping, other
- urgency must be one of: low, medium, high
- requires_human must be a boolean
- short_reason must be at most 20 words

Ticket:
\"\"\"
{ticket_text}
\"\"\"
"""

# Call the Responses API.
response = client.responses.create(
    model="gpt-5.4-mini",
    input=prompt
)

raw_output = response.output_text

print("Raw model output:")
print(raw_output)

# Parse the JSON output safely, as in Exercise 1.
try:
    parsed = json.loads(raw_output)
except json.JSONDecodeError as exc:
    print("\nFailed to parse JSON:", exc)
    raise

allowed_categories = {"billing", "technical", "account", "shipping", "other"}
allowed_urgency = {"low", "medium", "high"}

if parsed["category"] not in allowed_categories:
    raise ValueError("Invalid category")

if parsed["urgency"] not in allowed_urgency:
    raise ValueError("Invalid urgency")

if not isinstance(parsed["requires_human"], bool):
    raise ValueError("requires_human must be boolean")

if not isinstance(parsed["short_reason"], str):
    raise ValueError("short_reason must be a string")

print("\nValidated ticket classification:")
print(json.dumps(parsed, indent=2))

Example output

{
  "category": "billing",
  "urgency": "high",
  "requires_human": true,
  "short_reason": "Duplicate charge and refund request need human follow-up."
}

Step 2: Extend to multiple tickets

You can process a batch of tickets in a loop.

import json
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

tickets = [
    "I can't log into my account even after resetting my password twice.",
    "My order was supposed to arrive last week and still hasn't been delivered.",
    "The mobile app freezes whenever I try to upload a profile picture."
]

for i, ticket in enumerate(tickets, start=1):
    prompt = f"""
You are a support ticket triage assistant.

Read the ticket and return only valid JSON.

Required schema:
{{
  "category": "billing | technical | account | shipping | other",
  "urgency": "low | medium | high",
  "requires_human": "boolean",
  "short_reason": "string"
}}

Rules:
- Return only valid JSON
- Do not include markdown or explanation
- category must be one of: billing, technical, account, shipping, other
- urgency must be one of: low, medium, high
- requires_human must be a boolean
- short_reason must be at most 20 words

Ticket:
\"\"\"
{ticket}
\"\"\"
"""

    response = client.responses.create(
        model="gpt-5.4-mini",
        input=prompt
    )

    raw_output = response.output_text
    parsed = json.loads(raw_output)

    print(f"\nTicket #{i}")
    print(json.dumps(parsed, indent=2))

Example output

Ticket #1
{
  "category": "account",
  "urgency": "high",
  "requires_human": true,
  "short_reason": "User cannot access account after password resets."
}

Ticket #2
{
  "category": "shipping",
  "urgency": "medium",
  "requires_human": true,
  "short_reason": "Order delivery is overdue."
}

Ticket #3
{
  "category": "technical",
  "urgency": "medium",
  "requires_human": true,
  "short_reason": "App freezes during profile image upload."
}

9. Prompt Design Checklist for Structured Outputs

Use this checklist whenever you need machine-readable results.

Checklist

  • Specify the output format explicitly
  • Ask for JSON only
  • Define required keys
  • Define allowed values for enum-like fields
  • Define numeric ranges when needed
  • Specify behavior for missing data
  • Keep schemas as simple as possible
  • Validate in Python before using the result downstream

10. Common Mistakes

Mistake 1: Asking for “structured data” without defining structure

Too vague:

“Give me the result in a structured format.”

Better:

“Return valid JSON with keys category, urgency, and reason.”

Mistake 2: Not validating output

Even if the model usually behaves well, your application should not assume correctness.

Mistake 3: Overly large schemas

If you ask for 20 nested fields, reliability may decrease. Start small.

Mistake 4: No handling for missing information

Always define whether the model should use null, empty strings, or empty arrays.

Mistake 5: Mixing explanation with data

If you want parseable JSON, ask for JSON only.


11. From Structured Outputs to Agentic Workflows

Structured outputs are a key building block of agentic systems.

For example:

  1. Step 1: classify user input into a category
  2. Step 2: decide whether to use a tool
  3. Step 3: extract parameters for the tool
  4. Step 4: generate the final user-facing response

Each step depends on predictable outputs from the previous step. Without structure, chaining these steps becomes fragile.

Example workflow

A customer message:

“My package is late and I need it before Friday.”

Structured output from the model:

{
  "category": "shipping",
  "urgency": "high",
  "requires_tracking_lookup": true,
  "customer_deadline": "Friday"
}

Your Python code can now:

  • route to shipping support
  • trigger a tracking API lookup
  • prioritize the request
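The routing described above might look like this in application code. The queue names and priority rule are illustrative assumptions, not part of any framework:

```python
def route_ticket(parsed: dict) -> str:
    """Decide which queue a classified customer message belongs in."""
    if parsed.get("requires_tracking_lookup"):
        # A tracking lookup should run before a human sees the ticket.
        queue = "shipping-tracking"
    elif parsed.get("category") == "shipping":
        queue = "shipping-support"
    else:
        queue = "general-support"
    if parsed.get("urgency") == "high":
        queue = "priority-" + queue
    return queue

message = {
    "category": "shipping",
    "urgency": "high",
    "requires_tracking_lookup": True,
    "customer_deadline": "Friday",
}
print(route_ticket(message))  # -> priority-shipping-tracking
```

Because the model's output is structured, this routing logic is ordinary, testable Python with no text parsing involved.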

This is why structured outputs are central to practical AI systems.


12. Wrap-Up

In this session, you learned how to:

  • design prompts for structured outputs
  • request JSON-only responses
  • parse model output in Python
  • validate fields and constraints
  • use structured outputs in simple application workflows

This is one of the most important patterns in applied GenAI development. Once outputs become predictable, they can power reliable software systems rather than just chatbot-style interactions.


13. Practice Tasks

Try these after the session:

  1. Build a meeting note extractor that returns:
     • summary
     • decisions
     • action items
     • owners

  2. Build a resume screener that returns:
     • candidate_name
     • years_experience
     • key_skills
     • fit_score

  3. Build a content moderation helper that returns:
     • risk_level
     • categories
     • recommended_action

For each project:

  • define a schema
  • prompt for JSON only
  • parse and validate in Python
  • test with multiple examples


Useful Resources

  • OpenAI Responses API migration guide: https://developers.openai.com/api/docs/guides/migrate-to-responses
  • OpenAI API docs: https://platform.openai.com/docs
  • OpenAI Python SDK: https://github.com/openai/openai-python
  • Python json module documentation: https://docs.python.org/3/library/json.html

Quick Recap

  • Structured outputs make LLM responses usable by software
  • JSON is a practical default format
  • Good prompts specify schema, rules, and constraints
  • Python validation is essential
  • Structured outputs are a foundation for agentic applications

End of Session

